---
library_name: transformers
tags:
- llm
- Large Language Model
- llama3
- ORPO
- ORPO β
license: apache-2.0
datasets:
- heegyu/hh-rlhf-ko
language:
- ko
---

# Model Card for llama3-8b-instruct-orpo-ko

## Model Summary

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct, trained with [odds ratio preference optimization (ORPO)](https://arxiv.org/abs/2403.07691) to perform NLP tasks in Korean.

## Model Details

### Model Description

- **Developed by:** Sungjoo Byun (Grace Byun)
- **Language(s) (NLP):** Korean
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct

## Training Details

### Training Data

The model was trained on the [heegyu/hh-rlhf-ko](https://huggingface.co/datasets/heegyu/hh-rlhf-ko) dataset. We thank heegyu for sharing this valuable resource.

### Training Procedure

We applied ORPO (with its β weighting hyperparameter) to llama3-8b-instruct. Training was conducted on a single A100 GPU with 80 GB of memory.
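For intuition, ORPO augments the standard SFT negative log-likelihood with a relative odds-ratio penalty, L = L_SFT + λ·L_OR, where odds(y) = P(y)/(1 − P(y)) and L_OR = −log σ(log(odds(y_w)/odds(y_l))). The sketch below computes just the L_OR term from scalar sequence log-probabilities; the function name and scalar inputs are illustrative, not part of this model's training code.

```python
import math

def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Relative odds-ratio term L_OR from the ORPO paper.

    odds(y) = P(y) / (1 - P(y)); L_OR = -log sigmoid(log(odds_w / odds_l)).
    Inputs are (average per-token) log-probabilities of each response.
    """
    def log_odds(logp: float) -> float:
        # log(p / (1 - p)) computed from log p
        p = math.exp(logp)
        return logp - math.log(1.0 - p)

    log_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log sigmoid(x) = log(1 + exp(-x)), written in a numerically stable form
    return math.log1p(math.exp(-log_ratio))

# Equal likelihoods give -log(0.5) = log 2 ~= 0.6931; favoring the
# chosen response over the rejected one drives the term toward 0.
print(round(orpo_odds_ratio_loss(math.log(0.5), math.log(0.5)), 4))
```

The penalty is minimized by widening the gap between the odds of the chosen and rejected responses, which is what lets ORPO align the model without a separate reference model.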

## How to Get Started with the Model

Use the code below to get started with the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko")
model = AutoModelForCausalLM.from_pretrained("SungJoo/llama3-8b-instruct-orpo-ko")
```
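Llama 3 Instruct models expect prompts in a specific chat format; in practice you would call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` and pass the result to `model.generate`. Assuming this fine-tune keeps the base model's template, the equivalent prompt string looks like the sketch below (the helper function and the Korean message are illustrative):

```python
# Llama 3 Instruct chat format, built by hand for illustration.
# Prefer tokenizer.apply_chat_template(...) in real code.
def build_llama3_prompt(user_message: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("안녕하세요, 자기소개를 해주세요.")
print(prompt)
```

Tokenizing this prompt and calling `model.generate` yields the model's Korean response after the final assistant header.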


## Citations

Please cite the ORPO paper and our model as follows:

```bibtex
@misc{hong2024orpo,
      title={ORPO: Monolithic Preference Optimization without Reference Model}, 
      author={Jiwoo Hong and Noah Lee and James Thorne},
      year={2024},
      eprint={2403.07691},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
```bibtex
@misc{byun,
  author = {Sungjoo Byun},
  title = {llama3-8b-instruct-orpo-ko},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/SungJoo/llama3-8b-instruct-orpo-ko}}
}
```

## Contact

For any questions or issues, please contact [email protected].