kuotient committed on
Commit
a63b5a6
·
verified ·
1 Parent(s): 5878dac

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -20,12 +20,20 @@ tags:
20
  - merge
21
 
22
  ---
23
- # EEVE-Math-10.8B-SFT
24
- This model incorporates the concepts of [Orca-Math: Unlocking the potential of SLMs in Grade School Math](https://arxiv.org/pdf/2402.14830.pdf) and [DARE](https://arxiv.org/abs/2311.03099), along with work that applies them.
25
 
26
  | Model | gsm8k-ko(pass@1) |
27
  |---|---|
28
- | Base | 0.4049 |
29
  | [EEVE-Math](https://huggingface.co/kuotient/EEVE-Math-10.8B) (epoch 1) | 0.508 |
30
  | EEVE-Math (epoch 2) | **0.539** |
31
  | [EEVE-Instruct](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) | 0.4511 |
@@ -38,7 +46,7 @@ This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](
38
  ### Models Merged
39
 
40
  The following models were included in the merge:
41
- * [kuotient/EEVE-Math-10.8B-SFT](https://huggingface.co/kuotient/EEVE-Math-10.8B-SFT)
42
 
43
  ### Configuration
44
 
@@ -48,7 +56,7 @@ The following YAML configuration was used to produce this model:
48
  models:
49
  - model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
50
  # no parameters necessary for base model
51
- - model: kuotient/EEVE-Math-10.8B-SFT
52
  parameters:
53
  density: 1
54
  weight: 0.6
@@ -57,7 +65,6 @@ base_model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
57
  parameters:
58
  int8_mask: true
59
  dtype: bfloat16
60
-
61
  ```
62
 
63
  ## Evaluation
 
20
  - merge
21
 
22
  ---
23
+ # EEVE-Instruct-Math-10.8B
24
+
25
+ The `EEVE-Math` project spans the following:
26
+ - Translation of Orca-Math-200k ([Orca-Math: Unlocking the potential of SLMs in Grade School Math](https://arxiv.org/pdf/2402.14830.pdf))
27
+ - Translation of gsm8k, evaluated with lm_eval
28
+ - dare-ties merging with Mergekit ([DARE](https://arxiv.org/abs/2311.03099))
29
+
30
+ It covers the material related to each of these items.
31
+
32
+ > This model is a dare-ties merge of EEVE-Math and EEVE-Instruct. The project is a proof of concept showing that this process can keep the usability of the Instruct model without losing much of the specialized EEVE-Math model's performance.
33
 
34
  | Model | gsm8k-ko(pass@1) |
35
  |---|---|
36
+ | EEVE(Base) | 0.4049 |
37
  | [EEVE-Math](https://huggingface.co/kuotient/EEVE-Math-10.8B) (epoch 1) | 0.508 |
38
  | EEVE-Math (epoch 2) | **0.539** |
39
  | [EEVE-Instruct](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) | 0.4511 |
 
46
  ### Models Merged
47
 
48
  The following models were included in the merge:
49
+ * [kuotient/EEVE-Math-10.8B](https://huggingface.co/kuotient/EEVE-Math-10.8B)
50
 
51
  ### Configuration
52
 
 
56
  models:
57
  - model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
58
  # no parameters necessary for base model
59
+ - model: kuotient/EEVE-Math-10.8B
60
  parameters:
61
  density: 1
62
  weight: 0.6
 
65
  parameters:
66
  int8_mask: true
67
  dtype: bfloat16
 
68
  ```
69
 
70
  ## Evaluation
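
For reference, the `density: 1` and `weight: 0.6` settings in the merge configuration shown in this diff can be illustrated with a small sketch of the drop-and-rescale idea from DARE. This is a simplified illustration, not Mergekit's implementation: the function name `dare_merge`, the flat-list stand-in for weight tensors, and the `seed` argument are assumptions made for the example, and the TIES sign-election step is left out because it has no effect when only one fine-tuned model is merged into the base.

```python
import random


def dare_merge(base, finetuned, density=1.0, weight=0.6, seed=0):
    """Simplified DARE-style merge of one fine-tuned model into a base model.

    `base` and `finetuned` are flat lists of floats standing in for weight
    tensors (hypothetical representation, chosen to keep the sketch small).
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b  # task-vector entry: what fine-tuning changed
        keep = rng.random() < density  # keep each entry with probability `density`
        # Kept entries are rescaled by 1/density so the expected delta is unchanged;
        # dropped entries contribute nothing.
        delta = delta / density if keep else 0.0
        merged.append(b + weight * delta)
    return merged


if __name__ == "__main__":
    # With density=1 nothing is dropped, so each merged value is
    # base + 0.6 * (finetuned - base).
    print(dare_merge([1.0, 2.0], [2.0, 4.0], density=1.0, weight=0.6))
```

Under these assumptions, `density: 1` means no delta entries are dropped, so the merge reduces to adding 60% of the difference between the math model and the Instruct base back onto the base.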