0-hero committed
Commit e75d63e · verified · 1 Parent(s): 38adf30

Create README.md

Files changed (1)
  1. README.md +17 -0
README.md ADDED
@@ -0,0 +1,17 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - allenai/dolma
+ ---
+ # Training run to compare Mixture-of-Depths and BitNet
+ [Wandb Report](https://api.wandb.ai/links/tulasiram/pw76q41i)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6382255fcae34727b9cc149e/-ovvzj0ZvzuArH0cdOz8b.png)
+
+ #### Four models trained for 100k steps on Dolma
+ - OLMo-50M - 50M parameter model
+ - OLMo-50M-bitlinear - 50M parameter BitNet model
+ - OLMo-50M-mod - 50M parameter mixture-of-depths model
+ - OLMo-50M-mod-bitlinear - 50M parameter mixture-of-depths BitNet model
+
+ This repo contains zip files with the training states and other files for each model. I am not the author of the mixture-of-depths implementation; it can be found [here](https://github.com/thepowerfuldeez/OLMo).
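
The README says each model ships as a zip archive, so a reader will likely want to download and unpack them. The following is a minimal sketch of one way to do that, not something taken from the commit. It assumes the archives sit in a Hugging Face model repo and end in `.zip`. The helper names `extract_archives` and `download_and_extract`, the `checkpoints` output directory, and the `repo_id` argument are hypothetical. The README does not state the repo ID or the archive names, so both must be supplied by the caller.

```python
import zipfile
from pathlib import Path


def extract_archives(src_dir, out_dir):
    """Extract every .zip found under src_dir into out_dir/<archive name>/.

    Returns the list of directories that were created or reused.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    extracted = []
    for archive in sorted(src_dir.rglob("*.zip")):
        # One sub-directory per archive, named after the archive without ".zip".
        target = out_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


def download_and_extract(repo_id, out_dir="checkpoints"):
    """Download only the zip files of a Hub repo, then unpack them.

    Assumption: repo_id points at a model repo (the default repo type).
    """
    # Imported lazily so extract_archives works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(repo_id=repo_id, allow_patterns=["*.zip"])
    return extract_archives(local_dir, out_dir)
```

Calling `download_and_extract("<user>/<repo>")` with the real repo ID would leave one folder per archive under `checkpoints/`. Restricting the download with `allow_patterns` avoids fetching files other than the archives.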