# Model description

This repo contains over 500 model checkpoints, ranging in size from 20M to 3.3B parameters and in compute budget from 2e17 to 1e21 training FLOPs, across 6 different pretraining datasets.

Each subdirectory name encodes four parameters that identify the model in that subdirectory:

- Dataset: one of `fineweb-100b`, `fineweb-edu-100b`, `proof-pile-2`, `slimpajama-chunk1`, `smollm-corpus`, or `starcoder`
- N: the number of model parameters
- D: the number of training tokens
- C: the number of training FLOPs

For example, a model trained on `starcoder` with 1.1e08 parameters on 3.0e08 tokens for a total of 2.0e17 FLOPs has the subdirectory name `L2L_starcoder_N1.1e08_D3.0e08_C2.0e17/`.
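
Because all four fields are embedded in the directory name, they can be recovered programmatically. The sketch below shows one way to do this; `parse_checkpoint_name` is an illustrative helper based only on the naming scheme above, not part of the released code.

```python
import re

# Illustrative helper, not part of the released code: recovers the four
# fields (dataset, N, D, C) from a checkpoint directory name.
_NAME_RE = re.compile(
    r"L2L_(?P<dataset>.+)_N(?P<N>[\d.e+]+)_D(?P<D>[\d.e+]+)_C(?P<C>[\d.e+]+)"
)

def parse_checkpoint_name(name: str) -> dict:
    match = _NAME_RE.fullmatch(name.rstrip("/"))
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    return {
        "dataset": match["dataset"],
        "N": float(match["N"]),  # model parameters
        "D": float(match["D"]),  # training tokens
        "C": float(match["C"]),  # training FLOPs
    }

print(parse_checkpoint_name("L2L_starcoder_N1.1e08_D3.0e08_C2.0e17/"))
# {'dataset': 'starcoder', 'N': 110000000.0, 'D': 300000000.0, 'C': 2e+17}
```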

Full training details for the models can be found in the training repo or paper.

# How to load a model

First, follow the instructions to install our fork of the [OLMo](https://github.com/allenai/OLMo) package from https://github.com/KempnerInstitute/loss-to-loss-olmo/tree/main.

With the fork installed, you can use `huggingface_hub` together with the `transformers`-style `from_pretrained` interface to load a model, as in the following snippet:
```python
from huggingface_hub import snapshot_download
from olmo.model import HFMixinOLMo

tmp_dir = "tmp"
model_name = "L2L_starcoder_N1.1e08_D3.0e08_C2.0e17"

# Download only this checkpoint's files from the Hub.
snapshot_download(
    repo_id="KempnerInstituteAI/loss-to-loss",
    allow_patterns=f"{model_name}/*",
    local_dir=tmp_dir,
)

# Load the checkpoint through the fork's HF-style mixin.
model = HFMixinOLMo.from_pretrained(f"{tmp_dir}/{model_name}")
```
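
To see which checkpoints are available before downloading anything, you can enumerate the repo's subdirectories on the Hub. This is a minimal sketch using the standard `huggingface_hub.list_repo_files` API; it assumes only the one-subdirectory-per-checkpoint layout described above.

```python
from huggingface_hub import list_repo_files

# Each checkpoint lives in its own top-level subdirectory, so the unique
# path prefixes enumerate the available models.
files = list_repo_files("KempnerInstituteAI/loss-to-loss")
checkpoints = sorted({path.split("/")[0] for path in files if "/" in path})

print(f"{len(checkpoints)} checkpoints found, e.g. {checkpoints[:3]}")
```

Any of these names can then be passed as `model_name` in the loading snippet above.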

# Citation

If you use these models in your research, please cite this paper:

```bibtex
TODO
```