Update README.md
README.md
@@ -27,4 +27,6 @@ The model architecture is LLaMA-1.3B and we adopt the [OpenLLaMA](https://github
 The model is pre-trained on 100B tokens of Data-Juicer's refined RedPajama and Pile.
 It achieves an average score of 33.07 over 16 HELM tasks, beating LLMs trained on the original RedPajama and Pile datasets.
 
-For more details, please refer to our [paper](https://arxiv.org/abs/2309.02033).
+For more details, please refer to our [paper](https://arxiv.org/abs/2309.02033).
+
+![exp_llama](https://img.alicdn.com/imgextra/i2/O1CN019WtUPP1uhebnDlPR8_!!6000000006069-2-tps-2530-1005.png)
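Since the README describes a LLaMA-architecture checkpoint, a minimal loading sketch with Hugging Face `transformers` may be a useful companion to the diff above. The repository id used below is a hypothetical placeholder, not the model's actual hub path, and the prompt is purely illustrative.

```python
# Minimal sketch: load a LLaMA-architecture checkpoint via transformers.
# NOTE: "datajuicer/LLaMA-1.3B-refined" is a hypothetical repo id used for
# illustration only; substitute the model's real Hugging Face hub path.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "datajuicer/LLaMA-1.3B-refined"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Run a short generation to sanity-check the checkpoint.
inputs = tokenizer("Data-Juicer is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```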