alvarobartt HF staff committed on
Commit 1f85620 · 1 Parent(s): f6425c3

Update README.md

Files changed (1)
  1. README.md +8 -5
README.md CHANGED
@@ -3,6 +3,10 @@ datasets:
  - argilla/ultrafeedback-binarized-preferences-cleaned
  language:
  - en
+ - de
+ - es
+ - fr
+ - it
  base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
  library_name: transformers
  pipeline_tag: text-generation
@@ -34,10 +38,10 @@ This is part of the Notus family of models and experiments, where the Argilla te

  ### Model Description

- - **Developed by:** Argilla (based on HuggingFace H4 and MistralAI previous efforts)
+ - **Developed by:** Argilla (based on MistralAI previous efforts)
  - **Shared by:** Argilla
  - **Model type:** Pretrained generative Sparse Mixture of Experts
- - **Language(s) (NLP):** Mainly English
+ - **Language(s) (NLP):** English, Spanish, Italian, German, and French
  - **License:** MIT
  - **Finetuned from model:** [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)

@@ -50,13 +54,12 @@ This is part of the Notus family of models and experiments, where the Argilla te

  ### Training Hardware

- We used a VM with 8 x H100 80GB hosted in runpod.io for 1 epoch (~10hr)
+ We used a VM with 8 x H100 80GB hosted in runpod.io for 1 epoch (~10hr).

  ### Training Data

  We used a new iteration of the Argilla UltraFeedback preferences dataset named [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned).

-
  ## Training procedure

  ### Training hyperparameters
@@ -90,4 +93,4 @@ The following hyperparameters were used during training:
  - Transformers 4.36.0
  - Pytorch 2.1.0+cu118
  - Datasets 2.14.6
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0