|
--- |
|
library_name: transformers |
|
tags: |
|
- llm |
|
- Large Language Model |
|
- llama3 |
|
- ORPO |
|
- ORPO β |
|
license: apache-2.0 |
|
datasets: |
|
- heegyu/hh-rlhf-ko |
|
language: |
|
- ko |
|
--- |
|
|
|
# Model Card for llama3-8b-instruct-orpo-ko |
|
|
|
## Model Summary |
|
|
|
This model is a fine-tuned version of the meta-llama/Meta-Llama-3-8B-Instruct using the [odds ratio preference optimization (ORPO)](https://arxiv.org/abs/2403.07691). |
|
|
|
It has been trained to perform NLP tasks in Korean. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
- **Developed by:** Sungjoo Byun (Grace Byun) |
|
- **Language(s) (NLP):** Korean |
|
- **License:** Apache 2.0 |
|
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
The model was trained using the dataset [heegyu/hh-rlhf-ko](https://huggingface.co/datasets/heegyu/hh-rlhf-ko). We appreciate heegyu for sharing this valuable resource. |
|
|
|
### Training Procedure |
|
|
|
We applied ORPO β to llama3-8b-instruct. The training was conducted on an A100 GPU with 80GB of memory. |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model: |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko") |
|
model = AutoModelForCausalLM.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko") |
|
``` |
|
|
|
|
|
## Citations |
|
|
|
Please cite the ORPO paper and our model as follows: |
|
|
|
```bibtex |
|
@misc{hong2024orpo, |
|
title={ORPO: Monolithic Preference Optimization without Reference Model}, |
|
author={Jiwoo Hong and Noah Lee and James Thorne}, |
|
year={2024}, |
|
eprint={2403.07691}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL} |
|
} |
|
``` |
|
```bibtex |
|
@misc{byun, |
|
author = {Sungjoo Byun}, |
|
title = {llama3-8b-orpo-ko}, |
|
year = {2024}, |
|
publisher = {Hugging Face}, |
|
journal = {Hugging Face repository}, |
|
howpublished = {\url{https://huggingface.co/SungJoo/llama3-8b-instruct-orpo-ko}} |
|
} |
|
``` |
|
|
|
## Contact |
|
|
|
For any questions or issues, please contact [email protected]. |