Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers Text-to-Image • Updated 4 days ago • 5.37k • 1
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024 • 51
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Paper • 2409.04429 • Published Sep 6, 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers Paper • 2410.10629 • Published Oct 14, 2024 • 9
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published Oct 25, 2024 • 19
TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning Paper • 2007.11622 • Published Jul 22, 2020
NVILA: Efficient Frontier Visual Language Models Paper • 2412.04468 • Published Dec 5, 2024 • 57
Sana Collection ⚡️ Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 17 items • Updated 17 days ago • 67
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 25 days ago • 52