arxiv:2412.04468
Yuxian Gu
t1101675
AI & ML interests
Efficient methods for language models
Recent Activity
new activity
3 days ago
MiniLLM/SFT-OPT-1.3B:Difference between SFT and init models
upvoted a paper
14 days ago
Byte Latent Transformer: Patches Scale Better Than Tokens
authored a paper
23 days ago
NVILA: Efficient Frontier Visual Language Models