---
license: apache-2.0
language:
- fr
- it
- de
- es
- en
tags:
- moe
---
# Model Card for cloudyu/Mixtral-8x7B-Instruct-v0.1-DPO
* An attempt to improve [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) with DPO training
* [DPO Trainer](https://huggingface.co/docs/trl/main/en/dpo_trainer)
Metrics improved after 100 steps of Truthful DPO training:
![Metrics improvement](mixtral-dpo.jpg)
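
For readers unfamiliar with DPO, here is a minimal sketch of the per-example loss that DPO training optimizes. It is not the training code used for this model: the function name `dpo_loss`, its arguments, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are summed log-probabilities of a chosen and a rejected response under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Hypothetical helper for illustration; beta=0.1 is an assumed default,
    not a value taken from this model card.
    """
    # How much more the policy prefers each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(z)) == log(1 + exp(-z)); branch to avoid overflow in exp.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: the margin is zero.
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Policy raises the chosen response and lowers the rejected one.
improved = dpo_loss(-5.0, -12.0, -8.0, -9.0)
```

When the policy matches the reference the loss equals `log(2)`, and it falls as the policy assigns relatively more probability to the chosen response than the reference does, which is the direction training pushes.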