zzfive's Collections
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image
Synthesis
Paper
•
2401.09048
•
Published
•
9
Improving fine-grained understanding in image-text pre-training
Paper
•
2401.09865
•
Published
•
16
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper
•
2401.10891
•
Published
•
60
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic
Image Restoration In the Wild
Paper
•
2401.13627
•
Published
•
73
UNIMO-G: Unified Image Generation through Multimodal Conditional
Diffusion
Paper
•
2401.13388
•
Published
•
11
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image
Editing
Paper
•
2402.02583
•
Published
•
7
SDXL-Lightning: Progressive Adversarial Diffusion Distillation
Paper
•
2402.13929
•
Published
•
27
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with
Trajectory Stitching
Paper
•
2402.14167
•
Published
•
10
Subobject-level Image Tokenization
Paper
•
2402.14327
•
Published
•
17
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept
Composition
Paper
•
2402.15504
•
Published
•
21
Multi-LoRA Composition for Image Generation
Paper
•
2402.16843
•
Published
•
28
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with
Audio2Video Diffusion Model under Weak Conditions
Paper
•
2402.17485
•
Published
•
190
DistriFusion: Distributed Parallel Inference for High-Resolution
Diffusion Models
Paper
•
2402.19481
•
Published
•
20
Trajectory Consistency Distillation
Paper
•
2402.19159
•
Published
•
14
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain
Text-to-Image Customization
Paper
•
2403.00483
•
Published
•
13
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models
Paper
•
2403.02084
•
Published
•
14
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable
Virtual Try-on
Paper
•
2403.01779
•
Published
•
28
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper
•
2403.03206
•
Published
•
60
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Paper
•
2403.05135
•
Published
•
42
Motion Mamba: Efficient and Long Sequence Motion Generation with
Hierarchical and Bidirectional Selective SSM
Paper
•
2403.07487
•
Published
•
13
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
Paper
•
2403.09622
•
Published
•
16
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based
Semantic Control
Paper
•
2403.09055
•
Published
•
24
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of
Text-to-Image Models
Paper
•
2403.13535
•
Published
•
22
DepthFM: Fast Monocular Depth Estimation with Flow Matching
Paper
•
2403.13788
•
Published
•
17
Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos
Paper
•
2403.13044
•
Published
•
15
FlashFace: Human Image Personalization with High-fidelity Identity
Preservation
Paper
•
2403.17008
•
Published
•
19
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Paper
•
2403.16627
•
Published
•
20
ViTAR: Vision Transformer with Any Resolution
Paper
•
2403.18361
•
Published
•
52
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object
Removal and Insertion
Paper
•
2403.18818
•
Published
•
24
CosmicMan: A Text-to-Image Foundation Model for Humans
Paper
•
2404.01294
•
Published
•
15
Condition-Aware Neural Network for Controlled Image Generation
Paper
•
2404.01143
•
Published
•
11
Measuring Style Similarity in Diffusion Models
Paper
•
2404.01292
•
Published
•
16
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept
Matching
Paper
•
2404.03653
•
Published
•
33
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation
Paper
•
2404.03673
•
Published
•
14
ControlNet++: Improving Conditional Controls with Efficient Consistency
Feedback
Paper
•
2404.07987
•
Published
•
47
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and
Training Strategies
Paper
•
2404.08197
•
Published
•
27
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse
Controls to Any Diffusion Model
Paper
•
2404.09967
•
Published
•
20
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Paper
•
2404.09990
•
Published
•
12
Dynamic Typography: Bringing Words to Life
Paper
•
2404.11614
•
Published
•
44
MoA: Mixture-of-Attention for Subject-Context Disentanglement in
Personalized Image Generation
Paper
•
2404.11565
•
Published
•
14
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image
Synthesis
Paper
•
2404.13686
•
Published
•
27
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Paper
•
2404.14507
•
Published
•
21
PuLID: Pure and Lightning ID Customization via Contrastive Alignment
Paper
•
2404.16022
•
Published
•
22
Editable Image Elements for Controllable Synthesis
Paper
•
2404.16029
•
Published
•
10
ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with
Reward Feedback Learning
Paper
•
2404.15449
•
Published
•
11
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity
Preserving
Paper
•
2404.16771
•
Published
•
16
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video
Generation
Paper
•
2405.01434
•
Published
•
52
Customizing Text-to-Image Models with a Single Image Pair
Paper
•
2405.01536
•
Published
•
18
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and
Attribute Control
Paper
•
2405.12970
•
Published
•
22
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
Paper
•
2405.14677
•
Published
•
9
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Paper
•
2405.14224
•
Published
•
13
Semantica: An Adaptable Image-Conditioned Diffusion Model
Paper
•
2405.14857
•
Published
•
8
EM Distillation for One-step Diffusion Models
Paper
•
2405.16852
•
Published
•
11
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models
Paper
•
2405.16759
•
Published
•
7
Paper
•
2405.18407
•
Published
•
46
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Paper
•
2406.04333
•
Published
•
36
pOps: Photo-Inspired Diffusion Operators
Paper
•
2406.01300
•
Published
•
16
Zero-shot Image Editing with Reference Imitation
Paper
•
2406.07547
•
Published
•
31
An Image is Worth 32 Tokens for Reconstruction and Generation
Paper
•
2406.07550
•
Published
•
56
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Paper
•
2406.06911
•
Published
•
10
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent
Font Effect Generation
Paper
•
2406.08392
•
Published
•
18
Paper
•
2406.09414
•
Published
•
95
An Image is Worth More Than 16x16 Patches: Exploring Transformers on
Individual Pixels
Paper
•
2406.09415
•
Published
•
50
Alleviating Distortion in Image Generation via Multi-Resolution
Diffusion Models
Paper
•
2406.09416
•
Published
•
27
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal
Prompts
Paper
•
2406.09162
•
Published
•
13
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual
Visual Text Rendering
Paper
•
2406.10208
•
Published
•
21
Exploring the Role of Large Language Models in Prompt Encoding for
Diffusion Models
Paper
•
2406.11831
•
Published
•
21
The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN
Inversion and High Quality Image Editing
Paper
•
2406.10601
•
Published
•
65
Invertible Consistency Distillation for Text-Guided Image Editing in
Around 7 Steps
Paper
•
2406.14539
•
Published
•
26
DreamBench++: A Human-Aligned Benchmark for Personalized Image
Generation
Paper
•
2406.16855
•
Published
•
54
Aligning Diffusion Models with Noise-Conditioned Perception
Paper
•
2406.17636
•
Published
•
26
Magic Insert: Style-Aware Drag-and-Drop
Paper
•
2407.02489
•
Published
•
20
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Paper
•
2407.03300
•
Published
•
11
PartCraft: Crafting Creative Objects by Parts
Paper
•
2407.04604
•
Published
•
4
SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive
Canvas Layout
Paper
•
2404.00412
•
Published
•
2
DataDream: Few-shot Guided Dataset Generation
Paper
•
2407.10910
•
Published
•
8
Scaling Diffusion Transformers to 16 Billion Parameters
Paper
•
2407.11633
•
Published
•
25
IMAGDressing-v1: Customizable Virtual Dressing
Paper
•
2407.12705
•
Published
•
12
CGB-DM: Content and Graphic Balance Layout Generation with
Transformer-based Diffusion Model
Paper
•
2407.15233
•
Published
•
6
Artist: Aesthetically Controllable Text-Driven Stylization without
Training
Paper
•
2407.15842
•
Published
•
14
Paper
•
2407.15595
•
Published
•
13
ViPer: Visual Personalization of Generative Models via Individual
Preference Learning
Paper
•
2407.17365
•
Published
•
11
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Paper
•
2407.16982
•
Published
•
41
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular
Depth Estimation
Paper
•
2407.17952
•
Published
•
30
SHIC: Shape-Image Correspondences with no Keypoint Supervision
Paper
•
2407.18907
•
Published
•
41
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
Paper
•
2408.00735
•
Published
•
16
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy
Curvature of Attention
Paper
•
2408.00760
•
Published
•
6
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation
with Multimodal Generative Pretraining
Paper
•
2408.02657
•
Published
•
33
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative
Generation
Paper
•
2408.02226
•
Published
•
10
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning
using Instruct Prompts
Paper
•
2408.03209
•
Published
•
21
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware
Open-domain Visual Storytelling
Paper
•
2408.03695
•
Published
•
12
ControlNeXt: Powerful and Efficient Control for Image and Video
Generation
Paper
•
2408.06070
•
Published
•
53
BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion
Paper
•
2408.04785
•
Published
•
7
UniPortrait: A Unified Framework for Identity-Preserving Single- and
Multi-Human Image Personalization
Paper
•
2408.05939
•
Published
•
13
Paper
•
2408.07009
•
Published
•
61
ZePo: Zero-Shot Portrait Stylization with Faster Sampling
Paper
•
2408.05492
•
Published
•
7
Paper
•
2408.07116
•
Published
•
19
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
Paper
•
2408.08459
•
Published
•
45
TurboEdit: Instant text-based image editing
Paper
•
2408.08332
•
Published
•
19
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
Paper
•
2408.09702
•
Published
•
10
TraDiffusion: Trajectory-Based Training-Free Image Generation
Paper
•
2408.09739
•
Published
•
8
MegaFusion: Extend Diffusion Models towards Higher-resolution Image
Generation without Further Tuning
Paper
•
2408.11001
•
Published
•
11
The Brittleness of AI-Generated Image Watermarking Techniques: Examining
Their Robustness Against Visual Paraphrasing Attacks
Paper
•
2408.10446
•
Published
•
6
Scalable Autoregressive Image Generation with Mamba
Paper
•
2408.12245
•
Published
•
25
CODE: Confident Ordinary Differential Editing
Paper
•
2408.12418
•
Published
•
4
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its
Teacher
Paper
•
2408.14176
•
Published
•
60
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image
Generation
Paper
•
2408.14819
•
Published
•
21
Distribution Backtracking Builds A Faster Convergence Trajectory for
One-step Diffusion Distillation
Paper
•
2408.15991
•
Published
•
15
CSGO: Content-Style Composition in Text-to-Image Generation
Paper
•
2408.16766
•
Published
•
17
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image
Personalization
Paper
•
2408.15914
•
Published
•
22
VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion
Transformers
Paper
•
2408.17131
•
Published
•
12
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper
•
2409.02097
•
Published
•
32
Accurate Compression of Text-to-Image Diffusion Models via Vector
Quantization
Paper
•
2409.00492
•
Published
•
12
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free
Real Image Editing
Paper
•
2409.01322
•
Published
•
94
IFAdapter: Instance Feature Control for Grounded Text-to-Image
Generation
Paper
•
2409.08240
•
Published
•
18
InstantDrag: Improving Interactivity in Drag-based Image Editing
Paper
•
2409.08857
•
Published
•
31
StoryMaker: Towards Holistic Consistent Characters in Text-to-image
Generation
Paper
•
2409.12576
•
Published
•
15
Imagine yourself: Tuning-Free Personalized Image Generation
Paper
•
2409.13346
•
Published
•
68
Colorful Diffuse Intrinsic Image Decomposition in the Wild
Paper
•
2409.13690
•
Published
•
13
Improvements to SDXL in NovelAI Diffusion V3
Paper
•
2409.15997
•
Published
•
11
Pixel-Space Post-Training of Latent Diffusion Models
Paper
•
2409.17565
•
Published
•
20
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal
Instruction
Paper
•
2410.04932
•
Published
•
9
Accelerating Auto-regressive Text-to-Image Generation with Training-free
Speculative Jacobi Decoding
Paper
•
2410.01699
•
Published
•
18
IterComp: Iterative Composition-Aware Feedback Learning from Model
Gallery for Text-to-Image Generation
Paper
•
2410.07171
•
Published
•
41
Story-Adapter: A Training-free Iterative Framework for Long Story
Visualization
Paper
•
2410.06244
•
Published
•
19
Eliminating Oversaturation and Artifacts of High Guidance Scales in
Diffusion Models
Paper
•
2410.02416
•
Published
•
26
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial
Diffusion and Masked Generative Models
Paper
•
2410.08207
•
Published
•
18
Meissonic: Revitalizing Masked Generative Transformers for Efficient
High-Resolution Text-to-Image Synthesis
Paper
•
2410.08261
•
Published
•
50
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large
Vision-Language Models
Paper
•
2410.07133
•
Published
•
19
Semantic Image Inversion and Editing using Rectified Stochastic
Differential Equations
Paper
•
2410.10792
•
Published
•
29
Efficient Diffusion Models: A Comprehensive Survey from Principles to
Practices
Paper
•
2410.11795
•
Published
•
17
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Paper
•
2410.11817
•
Published
•
15
Fluid: Scaling Autoregressive Text-to-image Generative Models with
Continuous Tokens
Paper
•
2410.13863
•
Published
•
36
VidPanos: Generative Panoramic Videos from Casual Panning Videos
Paper
•
2410.13832
•
Published
•
12
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion
Model
Paper
•
2410.13925
•
Published
•
23
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved
Visual Representation Capabilities
Paper
•
2410.14672
•
Published
•
7
Scalable Ranked Preference Optimization for Text-to-Image Generation
Paper
•
2410.18013
•
Published
•
14
Stable Consistency Tuning: Understanding and Improving Consistency
Models
Paper
•
2410.18958
•
Published
•
9
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe
Dataset Curation
Paper
•
2410.18666
•
Published
•
19
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse
Autoencoders
Paper
•
2410.22366
•
Published
•
77
Constant Acceleration Flow
Paper
•
2411.00322
•
Published
•
23
In-Context LoRA for Diffusion Transformers
Paper
•
2410.23775
•
Published
•
11
Training-free Regional Prompting for Diffusion Transformers
Paper
•
2411.02395
•
Published
•
25
Constrained Diffusion Implicit Models
Paper
•
2411.00359
•
Published
•
6
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion
Models
Paper
•
2411.05007
•
Published
•
16
Add-it: Training-Free Object Insertion in Images With Pretrained
Diffusion Models
Paper
•
2411.07232
•
Published
•
63
OmniEdit: Building Image Editing Generalist Models Through Specialist
Supervision
Paper
•
2411.07199
•
Published
•
46
Edify Image: High-Quality Image Generation with Pixel Space Laplacian
Diffusion Models
Paper
•
2411.07126
•
Published
•
28
Watermark Anything with Localized Messages
Paper
•
2411.07231
•
Published
•
20
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified
Multimodal Understanding and Generation
Paper
•
2411.07975
•
Published
•
27
Scaling Properties of Diffusion Models for Perceptual Tasks
Paper
•
2411.08034
•
Published
•
13
MagicQuill: An Intelligent Interactive Image Editing System
Paper
•
2411.09703
•
Published
•
62
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply
Better Samples
Paper
•
2411.08954
•
Published
•
8
Region-Aware Text-to-Image Generation via Hard Binding and Soft
Refinement
Paper
•
2411.06558
•
Published
•
34
FitDiT: Advancing the Authentic Garment Details for High-fidelity
Virtual Try-on
Paper
•
2411.10499
•
Published
•
13
Continuous Speculative Decoding for Autoregressive Image Generation
Paper
•
2411.11925
•
Published
•
15
Stylecodes: Encoding Stylistic Information For Image Generation
Paper
•
2411.12811
•
Published
•
11
Generating Compositional Scenes via Text-to-image RGBA Instance
Generation
Paper
•
2411.10913
•
Published
•
3
Stable Flow: Vital Layers for Training-Free Image Editing
Paper
•
2411.14430
•
Published
•
14
Style-Friendly SNR Sampler for Style-Driven Generation
Paper
•
2411.14793
•
Published
•
36
OminiControl: Minimal and Universal Control for Diffusion Transformer
Paper
•
2411.15098
•
Published
•
53
MyTimeMachine: Personalized Facial Age Transformation
Paper
•
2411.14521
•
Published
•
20
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot
Subject-Driven Image Generator
Paper
•
2411.15466
•
Published
•
34
One Diffusion to Generate Them All
Paper
•
2411.16318
•
Published
•
26
Controllable Human Image Generation with Personalized Multi-Garments
Paper
•
2411.16801
•
Published
•
3
ROICtrl: Boosting Instance Control for Visual Generation
Paper
•
2411.17949
•
Published
•
82
DreamCache: Finetuning-Free Lightweight Personalized Image Generation
via Feature Caching
Paper
•
2411.17786
•
Published
•
12
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
Paper
•
2411.17787
•
Published
•
11
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Paper
•
2411.18616
•
Published
•
15
Omegance: A Single Parameter for Various Granularities in
Diffusion-Based Synthesis
Paper
•
2411.17769
•
Published
•
7
Edit Away and My Face Will not Stay: Personal Biometric Defense against
Malicious Generative Editing
Paper
•
2411.16832
•
Published
•
2
TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction
using Diffusion Models
Paper
•
2411.18350
•
Published
•
23
FAM Diffusion: Frequency and Attention Modulation for High-Resolution
Image Generation with Stable Diffusion
Paper
•
2411.18552
•
Published
•
17
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Paper
•
2412.01819
•
Published
•
33
Art-Free Generative Models: Art Creation Without Graphic Art Knowledge
Paper
•
2412.00176
•
Published
•
8
SNOOPI: Supercharged One-step Diffusion Distillation with Proper
Guidance
Paper
•
2412.02687
•
Published
•
108
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and
Generation
Paper
•
2412.03069
•
Published
•
30
LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene
Relighting
Paper
•
2412.00177
•
Published
•
7
A Noise is Worth Diffusion Guidance
Paper
•
2412.03895
•
Published
•
28
Negative Token Merging: Image-based Adversarial Feature Guidance
Paper
•
2412.01339
•
Published
•
22
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent
Diffusion Models
Paper
•
2412.04146
•
Published
•
22
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution
Image Synthesis
Paper
•
2412.04431
•
Published
•
17
ZipAR: Accelerating Autoregressive Image Generation through Spatial
Locality
Paper
•
2412.04062
•
Published
•
7
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step
Diffusion
Paper
•
2412.04301
•
Published
•
34
PanoDreamer: 3D Panorama Synthesis from a Single Image
Paper
•
2412.04827
•
Published
•
10
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for
Customized Manga Generation
Paper
•
2412.07589
•
Published
•
46
Hidden in the Noise: Two-Stage Robust Watermarking for Images
Paper
•
2412.04653
•
Published
•
28
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion
Models
Paper
•
2412.07674
•
Published
•
20
UniReal: Universal Image Generation and Editing via Learning Real-world
Dynamics
Paper
•
2412.07774
•
Published
•
25
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style
Conditioned Image Generation
Paper
•
2412.05148
•
Published
•
11
Learning Flow Fields in Attention for Controllable Person Image
Generation
Paper
•
2412.08486
•
Published
•
32
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow
Models
Paper
•
2412.08629
•
Published
•
11
StyleStudio: Text-Driven Style Transfer with Selective Control of Style
Elements
Paper
•
2412.08503
•
Published
•
8
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via
Multimodal LLM
Paper
•
2412.09618
•
Published
•
21
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices
with Efficient Architectures and Training
Paper
•
2412.09619
•
Published
•
20
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Paper
•
2412.09622
•
Published
•
7
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free
Scale Fusion
Paper
•
2412.09626
•
Published
•
19
ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven
Generation
Paper
•
2412.08645
•
Published
•
11
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
Paper
•
2412.07517
•
Published
•
10
FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers
Paper
•
2412.09611
•
Published
•
9
BrushEdit: All-In-One Image Inpainting and Editing
Paper
•
2412.10316
•
Published
•
33
ColorFlow: Retrieval-Augmented Image Sequence Colorization
Paper
•
2412.11815
•
Published
•
26
Causal Diffusion Transformers for Generative Modeling
Paper
•
2412.12095
•
Published
•
23
FashionComposer: Compositional Fashion Image Generation
Paper
•
2412.14168
•
Published
•
16
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting
with Diffusion Transformers
Paper
•
2412.12571
•
Published
•
8
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
Paper
•
2412.15213
•
Published
•
25
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
Paper
•
2412.14462
•
Published
•
15
Paper
•
2412.18653
•
Published
•
63
The Superposition of Diffusion Models Using the Itô Density Estimator
Paper
•
2412.17762
•
Published
•
12
From Elements to Design: A Layered Approach for Automatic Graphic Design
Composition
Paper
•
2412.19712
•
Published
•
14
VMix: Improving Text-to-Image Diffusion Model with Cross-Attention
Mixing Control
Paper
•
2412.20800
•
Published
•
4