zzzac's Collections
TORead
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Paper
•
2402.15627
•
Published
•
34
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Paper
•
2402.16822
•
Published
•
15
FuseChat: Knowledge Fusion of Chat Models
Paper
•
2402.16107
•
Published
•
36
Multi-LoRA Composition for Image Generation
Paper
•
2402.16843
•
Published
•
28
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper
•
2402.17485
•
Published
•
190
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper
•
2402.17753
•
Published
•
18
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
605
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper
•
2310.11453
•
Published
•
96
V3D: Video Diffusion Models are Effective 3D Generators
Paper
•
2403.06738
•
Published
•
28
Stealing Part of a Production Language Model
Paper
•
2403.06634
•
Published
•
90
Algorithmic progress in language models
Paper
•
2403.05812
•
Published
•
18
Chronos: Learning the Language of Time Series
Paper
•
2403.07815
•
Published
•
46
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM
Paper
•
2403.07487
•
Published
•
13
FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
Paper
•
2403.10242
•
Published
•
10
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
57
E5-V: Universal Embeddings with Multimodal Large Language Models
Paper
•
2407.12580
•
Published
•
39
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
Paper
•
2407.12784
•
Published
•
48
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Paper
•
2407.12327
•
Published
•
77
PaliGemma: A versatile 3B VLM for transfer
Paper
•
2407.07726
•
Published
•
68
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
Paper
•
2407.07895
•
Published
•
40
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper
•
2408.15237
•
Published
•
38
Diffusion Models Are Real-Time Game Engines
Paper
•
2408.14837
•
Published
•
121
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper
•
2408.14906
•
Published
•
138
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
Paper
•
2408.14176
•
Published
•
61
Building and better understanding vision-language models: insights and future directions
Paper
•
2408.12637
•
Published
•
124
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Paper
•
2408.10188
•
Published
•
51
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
•
2408.07055
•
Published
•
65
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper
•
2408.05147
•
Published
•
38
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
•
2408.04619
•
Published
•
155
LLaVA-OneVision: Easy Visual Task Transfer
Paper
•
2408.03326
•
Published
•
59
Language Model Can Listen While Speaking
Paper
•
2408.02622
•
Published
•
37
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Paper
•
2407.16741
•
Published
•
68
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Paper
•
2408.16532
•
Published
•
47
Law of Vision Representation in MLLMs
Paper
•
2408.16357
•
Published
•
92
NVLM: Open Frontier-Class Multimodal LLMs
Paper
•
2409.11402
•
Published
•
72