---
library_name: peft
base_model: meta-llama/Llama-2-7b-chat-hf
---
### Model Description
- Fine-tuned meta-llama/Llama-2-7b-chat-hf on the NSMC dataset.
- When a movie review is included in the prompt and passed to the model, it directly generates the prediction text '긍정' (positive) or '부정' (negative).
- The first 2,000 or more samples of the NSMC train split were used for training.
- Evaluation used only the first 1,000 samples of the test split.
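
The prompt-and-generate flow described above can be sketched as follows. This is a minimal illustration, not the code used for this model: the prompt template and the helper names `build_prompt` and `parse_label` are hypothetical, since the card does not publish the template used in training. Only the two label words and the NSMC label convention (1 = positive, 0 = negative) come from the task itself.

```python
# Hypothetical prompt template; the actual training template is not published.
PROMPT_TEMPLATE = (
    "Classify the following movie review as 긍정 (positive) or 부정 (negative).\n"
    "Review: {review}\n"
    "Label:"
)

# The model answers with one of these words; NSMC uses 1 = positive, 0 = negative.
LABELS = {"긍정": 1, "부정": 0}


def build_prompt(review: str) -> str:
    """Embed a single review in the (assumed) prompt template."""
    return PROMPT_TEMPLATE.format(review=review.strip())


def parse_label(generated: str):
    """Map generated text to 1/0 using whichever label word appears first.

    Returns None when neither label word is present, so unparseable
    generations can be counted separately instead of silently mislabeled.
    """
    hits = [
        (generated.find(word), label)
        for word, label in LABELS.items()
        if word in generated
    ]
    return min(hits)[1] if hits else None
```

`parse_label` should be applied only to the newly generated continuation, not to the full decoded sequence, because the prompt itself contains both label words.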
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_args.logging_steps: 100
- training_args.max_steps: 1600
- trainable params: 19,988,480 || all params: 6,758,404,096 || trainable%: 0.2957573965106688
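
Several of the values above follow from one another. The short check below recomputes them, assuming a single training device. The warmup step count is an approximation, since the exact rounding depends on the trainer implementation.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 2
max_steps = 1600
warmup_ratio = 0.03
trainable_params = 19_988_480
all_params = 6_758_404_096

# Effective batch size per optimizer step (assumes one device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Approximate number of warmup steps implied by the warmup ratio.
warmup_steps = round(max_steps * warmup_ratio)

# Share of parameters updated by the PEFT adapter, in percent.
trainable_pct = 100 * trainable_params / all_params

print(total_train_batch_size, warmup_steps, trainable_pct)
```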
### Training Results
TrainOutput(global_step=1600, training_loss=0.7892872190475464,
metrics={'train_runtime': 5825.2445, 'train_samples_per_second': 0.549,
'train_steps_per_second': 0.275, 'total_flos': 6.51493254365184e+16,
'train_loss': 0.7892872190475464, 'epoch': 1.6})
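
The reported throughput and epoch count can be cross-checked against the step count and batch size. This sketch assumes each optimizer step consumes two samples (the `total_train_batch_size` listed above).

```python
# Values copied from the TrainOutput above.
global_step = 1600
train_runtime = 5825.2445  # seconds
reported_epoch = 1.6

samples_per_step = 2  # total_train_batch_size from the hyperparameters

samples_seen = global_step * samples_per_step
samples_per_second = samples_seen / train_runtime
steps_per_second = global_step / train_runtime

# Number of distinct training samples implied by the epoch count.
implied_dataset_size = samples_seen / reported_epoch

print(round(samples_per_second, 3), round(steps_per_second, 3))
print(round(implied_dataset_size))
```

Under that assumption, the run processed 3,200 samples over 1.6 epochs, which implies a training set of about 2,000 samples.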
### Accuracy
Llama2: accuracy 0.52
| | Actual positive | Actual negative |
|---|---|---|
| Predicted positive | 192 | 168 |
| Predicted negative | 317 | 324 |
Several attempts were made to improve the accuracy, but errors occurred repeatedly.
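
The reported accuracy can be recomputed from the table. The sketch below assumes the rows are predicted labels and the columns are actual labels. Precision and recall for the positive class are added for illustration and are not reported in the original results.

```python
# Cell counts from the table (assumed: rows = predicted, columns = actual).
pred_pos_actual_pos = 192
pred_pos_actual_neg = 168
pred_neg_actual_pos = 317
pred_neg_actual_neg = 324

total = (
    pred_pos_actual_pos
    + pred_pos_actual_neg
    + pred_neg_actual_pos
    + pred_neg_actual_neg
)

# Accuracy: correct predictions (the diagonal) over all predictions.
accuracy = (pred_pos_actual_pos + pred_neg_actual_neg) / total

# Positive-class precision and recall under the same reading of the table.
precision = pred_pos_actual_pos / (pred_pos_actual_pos + pred_pos_actual_neg)
recall = pred_pos_actual_pos / (pred_pos_actual_pos + pred_neg_actual_pos)

print(total, round(accuracy, 2), round(precision, 3), round(recall, 3))
```

The four cells sum to 1,001, and the diagonal gives an accuracy of about 0.52, matching the figure above.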
### Model Card Authors
cxoijve