Update README.md
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 license: llama3
-base_model:
+base_model: Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.1
 tags:
 - alignment-handbook
 - axolotl
@@ -34,7 +34,7 @@ Codes: [https://github.com/magpie-align/magpie](https://github.com/magpie-align/magpie)
 ## Model Overview
 
 This model is an aligned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B). We apply the following pipeline:
-- We first use [Magpie-Align/Magpie-Pro-MT-300K-v0.1](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-MT-300K-v0.1) dataset and perform SFT -> [Magpie-Align/Llama-3-8B-Magpie-
+- We first use the [Magpie-Align/Magpie-Pro-MT-300K-v0.1](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-MT-300K-v0.1) dataset to perform SFT -> [Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.1](https://huggingface.co/Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.1)
 - We then perform DPO on the [princeton-nlp/llama3-ultrafeedback](https://huggingface.co/datasets/princeton-nlp/llama3-ultrafeedback) dataset.
 
 The overall performance is even better than the official Llama-3-8B-Instruct model!
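For context on the second pipeline step: the released model was trained with the alignment-handbook / axolotl stack (per the tags above), but the same DPO step can be sketched with trl's `DPOTrainer`. This is a minimal illustration, not the released recipe; all hyperparameters below are placeholders, and depending on the trl version the dataset's `chosen`/`rejected` columns may need reformatting first.

```python
# Illustrative DPO sketch only -- the actual run used alignment-handbook /
# axolotl recipes; hyperparameters here are placeholders, not the released ones.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

sft_model_id = "Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.1"

# Start from the SFT checkpoint; DPOTrainer builds the frozen reference
# model automatically when ref_model is not supplied.
model = AutoModelForCausalLM.from_pretrained(sft_model_id)
tokenizer = AutoTokenizer.from_pretrained(sft_model_id)

# Preference pairs (prompt / chosen / rejected) for the DPO step.
train_dataset = load_dataset("princeton-nlp/llama3-ultrafeedback", split="train")

args = DPOConfig(
    output_dir="llama3-8b-magpie-dpo",  # placeholder path
    beta=0.1,                           # placeholder KL-penalty strength
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```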
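And a minimal inference sketch with transformers. The repo id below is a placeholder, since this diff does not show the final model's Hub id; substitute the id of the model card you are reading.

```python
# Minimal chat sketch via transformers.
# NOTE: "Magpie-Align/<this-model>" is a placeholder -- substitute the
# actual Hub repo id of this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Magpie-Align/<this-model>"  # placeholder, see note above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is Direct Preference Optimization?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama-3-style chat models end a turn with <|eot_id|>.
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```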