cxoijve committed
Commit df45f8f · 1 Parent(s): 68b954c

Update README.md

Files changed (1)
  1. README.md +21 -5
README.md CHANGED
@@ -11,9 +11,25 @@ base_model: meta-llama/Llama-2-7b-chat-hf
  - The top 2,000+ samples of the NSMC train split were used for training
  - Only the top 1,000 samples of the test split were used for evaluation

-
-
- ## Training Results
+ ### Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 2
+ - optimizer: adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.03
+ - training_args.logging_steps: 100
+ - training_args.max_steps: 1600
+ - trainable params: 19,988,480 || all params: 6,758,404,096 || trainable%: 0.2957573965106688
+
+ ### Training Results

  TrainOutput(global_step=1600, training_loss=0.7892872190475464,
  metrics={'train_runtime': 5825.2445, 'train_samples_per_second': 0.549,
@@ -21,7 +37,7 @@ metrics={'train_runtime': 5825.2445, 'train_samples_per_second': 0.549,
  'train_loss': 0.7892872190475464, 'epoch': 1.6})


- #### Accuracy
+ ### Accuracy

  Llama2: accuracy 0.52

@@ -32,6 +48,6 @@ Llama2: accuracy 0.52



- ## Model Card Authors
+ ### Model Card Authors

  cxoijve
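
For reference, the data selection described in the card (the top 2,000+ samples of the NSMC train split and the top 1,000 samples of the test split) can be built roughly as in the following sketch. The dataset id `nsmc` and its `document`/`label` columns are assumptions for illustration; the card does not show the actual selection code.

```python
from datasets import load_dataset

# NSMC (Naver Sentiment Movie Corpus): "train" and "test" splits,
# each row holds a review text ("document") and a 0/1 sentiment label ("label").
nsmc = load_dataset("nsmc")

# Top 2,000 samples of the train split for fine-tuning
train_subset = nsmc["train"].select(range(2000))

# Top 1,000 samples of the test split for evaluation
eval_subset = nsmc["test"].select(range(1000))

print(train_subset)  # num_rows: 2000
print(eval_subset)   # num_rows: 1000
```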
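The reported trainable-parameter count (19,988,480 of 6,758,404,096, about 0.296%) is consistent with a LoRA adapter of rank 8 applied to all attention and MLP projections of Llama-2-7b: 32 layers × (4 × 65,536 + 3 × 120,832) = 19,988,480. The sketch below combines that inferred adapter with the hyperparameters listed in the diff; the LoRA alpha, dropout, and output directory are guesses, and only the TrainingArguments values come from the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Hypothetical LoRA config: r=8 over all projections reproduces the
# reported 19,988,480 trainable parameters; alpha/dropout are assumptions.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# -> trainable params: 19,988,480 || all params: 6,758,404,096 || trainable%: 0.2958

# TrainingArguments taken from the hyperparameter list in the diff;
# adam_beta1/beta2/epsilon stay at the defaults (0.9 / 0.999 / 1e-08) listed there.
training_args = TrainingArguments(
    output_dir="outputs",            # hypothetical
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,   # total_train_batch_size = 2
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    logging_steps=100,
    max_steps=1600,
    seed=42,
    fp16=True,
)
```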
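The accuracy figure (0.52 for the base Llama2 model) is the fraction of the 1,000 evaluated test reviews whose predicted sentiment matches the NSMC label. A possible evaluation loop is sketched below; the prompt template and the rule mapping generated text to a 0/1 label are assumptions, since the card only reports the final score.

```python
from transformers import pipeline

def evaluate_accuracy(generator, dataset) -> float:
    """Accuracy = correct predictions / number of evaluated samples (here 1,000)."""
    correct = 0
    for example in dataset:
        # Hypothetical prompt; the prompt actually used for the card is not shown.
        prompt = (f"Review: {example['document']}\n"
                  "Is the sentiment of this review positive or negative? Answer:")
        output = generator(prompt, max_new_tokens=5, do_sample=False)[0]["generated_text"]
        answer = output[len(prompt):].strip().lower()
        prediction = 1 if answer.startswith("positive") else 0
        correct += int(prediction == example["label"])
    return correct / len(dataset)

# Usage (with the model and eval_subset from the sketches above):
# generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# print(evaluate_accuracy(generator, eval_subset))
```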