ashishtanwer's Collections
Dataset
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Paper • 2306.01116 • 32
Viewer • 48.6B • 169k • 1.81k
Viewer • 968M • 24.7k • 826
Preview • 38.3k • 443
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • 13
mosaicml/mpt-7b
Text Generation • 18.7k • 1.16k
togethercomputer/RedPajama-Data-V2
1.98k • 356
stepfun-ai/GOT-OCR2_0
Image-Text-to-Text • 798k • 1.32k
Focus Anywhere for Fine-grained Multi-page Document Understanding
Paper • 2405.14295 • 1
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Paper • 2312.06109 • 20
💬 GOT Online
llava-hf/llava-1.5-7b-hf
Image-Text-to-Text • 400k • 220
microsoft/OmniParser
Image-Text-to-Text • 1.46k • 1.52k
ColPali: Efficient Document Retrieval with Vision Language Models
Paper • 2407.01449 • 42
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Paper • 2407.03320 • 93
Viewer • 1.45M • 17k • 173