Commit 04ed364 (verified) by qnguyen3 · 1 Parent(s): a88a538

Update README.md

Files changed (1):
  1. README.md +4 -1
README.md CHANGED
@@ -38,7 +38,7 @@ widget:
  - text: In the context of computer programming, an algorithm is
  example_title: Algorithm Definition
  ---
- # Mixsmol-4x400M-v0.1
+ # Mixsmol-4x400M-v0.1 by Ontocord
  This is the first checkpoint (Epoch 1) of Mixsmol-4x400M-v0.1
  Note that this is an experimental in data mixing. Therefore, we only trained the model on 50B tokens (95% English and 5% Vietnamese) to test the following:
  - Reasoining capabilities through high-quality synthetic textbooks data pretraining
@@ -71,3 +71,6 @@ After verifying our hypothesis with this run, we will schedule a second run on b
  |truthfulqa_mc2|Yaml |none | 0|acc |0.3909|± |0.0148|
  |winogrande|Yaml |none | 5|acc |0.5107|± | 0.014|
  |gsm8k|Yaml |get-answer| 5|exact_match| 0|± | 0|
+
+ ## Contribution
+ This work is a shared contribution between **Ontocord, BEE-spoke-data and VILM**