Mistral 7B Zephyr Orpo
The Zephyr Orpo recipe applied on top of Mistral 7B v0.2 (new recipe with new Mistral base model)
Model description
- Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- Language(s) (NLP): Primarily English
- Finetuned from model: wandb/Mistral-7B-v0.2
Recipe
We trained using the alignment handbook recipe and logging to W&B
Visit the W&B workspace here
Results:
- MT bench
########## First turn ##########
score
model turn
zephyr-orpo-7b-v0.2 1 7.44375
########## Second turn ##########
score
model turn
zephyr-orpo-7b-v0.2 2 6.875
########## Average ##########
score
model
zephyr-orpo-7b-v0.2 7.159375
Trained on a single H100 for 2 hours!
- Downloads last month
- 18
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.