kuotient/EEVE-Instruct-Math-10.8B

kuotient committed on Mar 28, 2024

Commit a63b5a6 (verified) · 1 Parent(s): 5878dac

Update README.md

Files changed (1)

README.md +13 -6

@@ -20,12 +20,20 @@ tags:
 - merge
 ---
-# EEVE-Math-10.8B-SFT
-This model incorporates the concepts of [Orca-Math: Unlocking the potential of SLMs in Grade School Math](https://arxiv.org/pdf/2402.14830.pdf) and [DARE](https://arxiv.org/abs/2311.03099), along with work that applies them.
 | Model | gsm8k-ko(pass@1) |
 |---|---|
-| Base | 0.4049 |
 | [EEVE-Math](https://huggingface.co/kuotient/EEVE-Math-10.8B) (epoch 1) | 0.508 |
 | EEVE-Math (epoch 2) | **0.539** |
 | [EEVE-Instruct](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) | 0.4511 |
@@ -38,7 +46,7 @@ This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](
 ### Models Merged
 The following models were included in the merge:
-* [kuotient/EEVE-Math-10.8B-SFT](https://huggingface.co/kuotient/EEVE-Math-10.8B-SFT)
 ### Configuration
@@ -48,7 +56,7 @@ The following YAML configuration was used to produce this model:
 models:
   - model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
     # no parameters necessary for base model
-  - model: kuotient/EEVE-Math-10.8B-SFT
     parameters:
       density: 1
       weight: 0.6
@@ -57,7 +65,6 @@ base_model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
 parameters:
   int8_mask: true
 dtype: bfloat16
 ```
 ## Evaluation

 - merge
 ---
+# EEVE-Instruct-Math-10.8B
+The `EEVE-Math` project covers the following:
+- Translation of Orca-Math-200k ([Orca-Math: Unlocking the potential of SLMs in Grade School Math](https://arxiv.org/pdf/2402.14830.pdf))
+- Translation of gsm8k and evaluation with lm_eval
+- Merging with dare-ties via Mergekit ([DARE](https://arxiv.org/abs/2311.03099))
+> This model is a dare-ties merge of EEVE-Math and EEVE-Instruct. The project is a proof of concept showing that such a merge can keep the usability of the Instruct model without losing much of the performance of the specialized EEVE-Math model.
 | Model | gsm8k-ko(pass@1) |
 |---|---|
+| EEVE(Base) | 0.4049 |
 | [EEVE-Math](https://huggingface.co/kuotient/EEVE-Math-10.8B) (epoch 1) | 0.508 |
 | EEVE-Math (epoch 2) | **0.539** |
 | [EEVE-Instruct](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) | 0.4511 |
 ### Models Merged
 The following models were included in the merge:
+* [kuotient/EEVE-Math-10.8B](https://huggingface.co/kuotient/EEVE-Math-10.8B)
 ### Configuration
 models:
   - model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
     # no parameters necessary for base model
+  - model: kuotient/EEVE-Math-10.8B
     parameters:
       density: 1
       weight: 0.6
 parameters:
   int8_mask: true
 dtype: bfloat16
 ```
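The `density` and `weight` parameters in the configuration above can be illustrated with a small sketch of the DARE drop-and-rescale step. This is not mergekit's implementation: the function name `dare_merge` and the toy vectors are hypothetical, and the sketch ignores TIES sign election and any weight normalization mergekit may apply. It only shows the idea that each delta (fine-tuned minus base) is kept with probability `density`, rescaled by `1 / density`, and added back scaled by `weight`.

```python
import random


def dare_merge(base, tuned, density, weight, seed=0):
    """Merge one fine-tuned weight vector into a base vector, DARE-style.

    Hypothetical helper for illustration only; not mergekit's API.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)
    merged = []
    for b, t in zip(base, tuned):
        delta = t - b                  # task-vector entry: tuned minus base
        keep = rng.random() < density  # keep each delta with probability `density`
        rescaled = delta / density if keep else 0.0
        merged.append(b + weight * rescaled)
    return merged


# Toy vectors (assumed values), using density and weight from the config above.
base = [1.0, -2.0, 0.5]
tuned = [2.0, -1.0, 0.5]
merged = dare_merge(base, tuned, density=1.0, weight=0.6)
print(merged)
```

With `density: 1` no delta is dropped in this sketch, so the merge reduces to the base weights plus 0.6 times the difference between the math model and the base.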
 ## Evaluation