Commit 1f85620
Parent(s): f6425c3

Update README.md

README.md CHANGED
@@ -3,6 +3,10 @@ datasets:
 - argilla/ultrafeedback-binarized-preferences-cleaned
 language:
 - en
+- de
+- es
+- fr
+- it
 base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 library_name: transformers
 pipeline_tag: text-generation
@@ -34,10 +38,10 @@ This is part of the Notus family of models and experiments, where the Argilla te
 
 ### Model Description
 
-- **Developed by:** Argilla (based on
+- **Developed by:** Argilla (based on MistralAI previous efforts)
 - **Shared by:** Argilla
 - **Model type:** Pretrained generative Sparse Mixture of Experts
-- **Language(s) (NLP):**
+- **Language(s) (NLP):** English, Spanish, Italian, German, and French
 - **License:** MIT
 - **Finetuned from model:** [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
 
@@ -50,13 +54,12 @@ This is part of the Notus family of models and experiments, where the Argilla te
 
 ### Training Hardware
 
-We used a VM with 8 x H100 80GB hosted in runpod.io for 1 epoch (~10hr)
+We used a VM with 8 x H100 80GB hosted in runpod.io for 1 epoch (~10hr).
 
 ### Training Data
 
 We used a new iteration of the Argilla UltraFeedback preferences dataset named [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned).
 
-
 ## Training procedure
 
 ### Training hyperparameters
@@ -90,4 +93,4 @@ The following hyperparameters were used during training:
 - Transformers 4.36.0
 - Pytorch 2.1.0+cu118
 - Datasets 2.14.6
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
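Since the card metadata in this diff declares `library_name: transformers` and `pipeline_tag: text-generation`, here is a minimal usage sketch. It loads the base model named in the diff (`mistralai/Mixtral-8x7B-Instruct-v0.1`), because the commit page does not show the fine-tuned repo's id; substitute that id in practice. The dtype, device placement, and generation parameters are illustrative assumptions, not values from the card.

```python
# Minimal text-generation sketch, assuming the card's declared stack
# (Transformers 4.36.0 as pinned in the diff) and a GPU setup with enough
# memory for a Mixtral-scale model. Swap in the fine-tuned repo id where noted.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # base_model from the card; replace with the fine-tuned repo id
    torch_dtype=torch.bfloat16,  # assumption: bf16 to reduce memory footprint
    device_map="auto",           # spread layers across available GPUs
)

# Mixtral-Instruct checkpoints ship a chat template, so format the prompt with it.
messages = [{"role": "user", "content": "Give a one-sentence summary of sparse Mixture of Experts models."}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

out = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```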