Ryukijano's Collections
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper
•
2312.11514
•
Published
•
257
3D-LFM: Lifting Foundation Model
Paper
•
2312.11894
•
Published
•
13
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper
•
2312.15166
•
Published
•
56
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper
•
2312.16862
•
Published
•
30
LARP: Language-Agent Role Play for Open-World Games
Paper
•
2312.17653
•
Published
•
31
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper
•
2401.01055
•
Published
•
54
tiiuae/falcon-180B
Text Generation
•
Updated
•
4.15k
•
1.13k
meta-llama/Llama-2-70b-hf
Text Generation
•
Updated
•
173k
•
843
TinyLlama: An Open-Source Small Language Model
Paper
•
2401.02385
•
Published
•
90
microsoft/phi-2
Text Generation
•
Updated
•
175k
•
3.26k
🏢
LLaMA Pro 8B Instruct Chat
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
Paper
•
2401.04081
•
Published
•
70
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
Paper
•
2401.02994
•
Published
•
49
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
•
2401.06080
•
Published
•
26
EmbeddedLLM/Mistral-7B-Merge-14-v0.1
Text Generation
•
Updated
•
406
•
24
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper
•
2401.12954
•
Published
•
29
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper
•
2401.13601
•
Published
•
45
🏆🤖
Chatbot Arena Leaderboard
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Paper
•
2401.15947
•
Published
•
49
RWKV/v5-Eagle-7B-pth
Updated
•
199
Zyphra/BlackMamba-2.8B
Updated
•
47
•
29
abacusai/Smaug-72B-v0.1
Text Generation
•
Updated
•
2.76k
•
467
CohereForAI/aya-101
Text2Text Generation
•
Updated
•
3.05k
•
625
🚀
Pivot Prompt Demo
SubGen: Token Generation in Sublinear Time and Memory
Paper
•
2402.06082
•
Published
•
10
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Paper
•
2402.06332
•
Published
•
18
MPIrigen: MPI Code Generation through Domain-Specific Language Models
Paper
•
2402.09126
•
Published
•
12
BioMistral/BioMistral-7B
Text Generation
•
Updated
•
15.1k
•
408
SaulLM-7B: A pioneering Large Language Model for Law
Paper
•
2403.03883
•
Published
•
77
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
57
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
62
raincandy-u/Llama-3-Aplite-Instruct-4x8B-MoE
Text Generation
•
Updated
•
340
•
38
nvidia/Llama3-70B-SteerLM-RM
Updated
•
14
•
42
meta-llama/Llama-3.1-405B-FP8
Text Generation
•
Updated
•
701
•
104
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Text Generation
•
Updated
•
258k
•
1.95k
nvidia/Hymba-1.5B-Base
Text Generation
•
Updated
•
2.73k
•
130
nvidia/Hymba-1.5B-Instruct
Text Generation
•
Updated
•
9.95k
•
215
Qwen/QwQ-32B-Preview
Text Generation
•
Updated
•
105k
•
1.49k
meta-llama/Llama-3.3-70B-Instruct
Text Generation
•
Updated
•
388k
•
1.44k