CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published 17 days ago • 21
APOLLO: SGD-like Memory, AdamW-level Performance Paper • 2412.05270 • Published about 1 month ago • 38
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published about 1 month ago • 123
Negative Token Merging: Image-based Adversarial Feature Guidance Paper • 2412.01339 • Published Dec 2, 2024 • 22
ROICtrl: Boosting Instance Control for Visual Generation Paper • 2411.17949 • Published Nov 27, 2024 • 82
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis Paper • 2411.17769 • Published Nov 26, 2024 • 7
Style-Friendly SNR Sampler for Style-Driven Generation Paper • 2411.14793 • Published Nov 22, 2024 • 36
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Paper • 2411.05007 • Published Nov 7, 2024 • 16
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published Sep 17, 2024 • 29
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation Paper • 2409.04410 • Published Sep 6, 2024 • 23
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering Paper • 2406.10208 • Published Jun 14, 2024 • 21
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels Paper • 2406.09415 • Published Jun 13, 2024 • 50