SungJoo
/

llama3-8b-instruct-orpo-ko

Text Generation

Large Language Model

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

llama3-8b-instruct-orpo-ko / README.md

SungJoo's picture

Update README.md

fd4d9bc verified 8 months ago

|

history blame contribute delete

2 kB

	---
	library_name: transformers
	tags:
	- llm
	- Large Language Model
	- llama3
	- ORPO
	- ORPO β
	license: apache-2.0
	datasets:
	- heegyu/hh-rlhf-ko
	language:
	- ko
	---

	# Model Card for llama3-8b-instruct-orpo-ko

	## Model Summary

	This model is a fine-tuned version of the meta-llama/Meta-Llama-3-8B-Instruct using the [odds ratio preference optimization (ORPO)](https://arxiv.org/abs/2403.07691).

	It has been trained to perform NLP tasks in Korean.

	## Model Details

	### Model Description

	- Developed by: Sungjoo Byun (Grace Byun)
	- Language(s) (NLP): Korean
	- License: Apache 2.0
	- Finetuned from model: meta-llama/Meta-Llama-3-8B-Instruct

	## Training Details

	### Training Data

	The model was trained using the dataset [heegyu/hh-rlhf-ko](https://huggingface.co/datasets/heegyu/hh-rlhf-ko). We appreciate heegyu for sharing this valuable resource.

	### Training Procedure

	We applied ORPO β to llama3-8b-instruct. The training was conducted on an A100 GPU with 80GB of memory.

	## How to Get Started with the Model

	Use the code below to get started with the model:

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	tokenizer = AutoTokenizer.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko")
	model = AutoModelForCausalLM.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko")
	```


	## Citations

	Please cite the ORPO paper and our model as follows:

	```bibtex
	@misc{hong2024orpo,
	title={ORPO: Monolithic Preference Optimization without Reference Model},
	author={Jiwoo Hong and Noah Lee and James Thorne},
	year={2024},
	eprint={2403.07691},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```
	```bibtex
	@misc{byun,
	author = {Sungjoo Byun},
	title = {llama3-8b-orpo-ko},
	year = {2024},
	publisher = {Hugging Face},
	journal = {Hugging Face repository},
	howpublished = {\url{https://huggingface.co/SungJoo/llama3-8b-instruct-orpo-ko}}
	}
	```

	## Contact

	For any questions or issues, please contact [email protected].