Stalin16's Collections
Gen AI Diffusion
Animate-X: Universal Character Image Animation with Enhanced Motion
Representation
Paper
•
2410.10306
•
Published
•
54
ReCapture: Generative Video Camera Controls for User-Provided Videos
using Masked Video Fine-Tuning
Paper
•
2411.05003
•
Published
•
70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for
Image-to-Video Generation
Paper
•
2411.04709
•
Published
•
25
IterComp: Iterative Composition-Aware Feedback Learning from Model
Gallery for Text-to-Image Generation
Paper
•
2410.07171
•
Published
•
41
Story-Adapter: A Training-free Iterative Framework for Long Story
Visualization
Paper
•
2410.06244
•
Published
•
19
How Far is Video Generation from World Model: A Physical Law Perspective
Paper
•
2411.02385
•
Published
•
33
Training-free Regional Prompting for Diffusion Transformers
Paper
•
2411.02395
•
Published
•
25
AutoVFX: Physically Realistic Video Editing from Natural Language
Instructions
Paper
•
2411.02394
•
Published
•
17
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse
Autoencoders
Paper
•
2410.22366
•
Published
•
77
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Paper
•
2410.10812
•
Published
•
16
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise
Motion Control
Paper
•
2410.13830
•
Published
•
24
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion
Models
Paper
•
2411.05007
•
Published
•
16
Add-it: Training-Free Object Insertion in Images With Pretrained
Diffusion Models
Paper
•
2411.07232
•
Published
•
63
OmniEdit: Building Image Editing Generalist Models Through Specialist
Supervision
Paper
•
2411.07199
•
Published
•
46
MagicQuill: An Intelligent Interactive Image Editing System
Paper
•
2411.09703
•
Published
•
62
AnimateAnything: Consistent and Controllable Animation for Video
Generation
Paper
•
2411.10836
•
Published
•
23
Stylecodes: Encoding Stylistic Information For Image Generation
Paper
•
2411.12811
•
Published
•
11
VideoRepair: Improving Text-to-Video Generation via Misalignment
Evaluation and Localized Refinement
Paper
•
2411.15115
•
Published
•
9
Style-Friendly SNR Sampler for Style-Driven Generation
Paper
•
2411.14793
•
Published
•
36
OminiControl: Minimal and Universal Control for Diffusion Transformer
Paper
•
2411.15098
•
Published
•
53
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Paper
•
2412.01169
•
Published
•
11
SNOOPI: Supercharged One-step Diffusion Distillation with Proper
Guidance
Paper
•
2412.02687
•
Published
•
108
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic
Adversarial Training
Paper
•
2412.02030
•
Published
•
18
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with
Mixture of Score Guidance
Paper
•
2412.05355
•
Published
•
7
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution
Image Synthesis
Paper
•
2412.04431
•
Published
•
17
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for
Customized Manga Generation
Paper
•
2412.07589
•
Published
•
46
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion
Models
Paper
•
2412.07674
•
Published
•
20
UniReal: Universal Image Generation and Editing via Learning Real-world
Dynamics
Paper
•
2412.07774
•
Published
•
25
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style
Conditioned Image Generation
Paper
•
2412.05148
•
Published
•
11
ObjCtrl-2.5D: Training-free Object Control with Camera Poses
Paper
•
2412.07721
•
Published
•
8
StyleMaster: Stylize Your Video with Artistic Generation and Translation
Paper
•
2412.07744
•
Published
•
19
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow
Models
Paper
•
2412.08629
•
Published
•
11
DisPose: Disentangling Pose Guidance for Controllable Human Image
Animation
Paper
•
2412.09349
•
Published
•
8
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex
Image-Text Models with Structural Annotations
Paper
•
2412.08580
•
Published
•
45
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via
Multimodal LLM
Paper
•
2412.09618
•
Published
•
21
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Paper
•
2412.09622
•
Published
•
7
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free
Scale Fusion
Paper
•
2412.09626
•
Published
•
19
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
Paper
•
2412.15213
•
Published
•
25
BrushEdit: All-In-One Image Inpainting and Editing
Paper
•
2412.10316
•
Published
•
33
Paper
•
2412.18653
•
Published
•
63
From Elements to Design: A Layered Approach for Automatic Graphic Design
Composition
Paper
•
2412.19712
•
Published
•
14
VideoMaker: Zero-shot Customized Video Generation with the Inherent
Force of Video Diffusion Models
Paper
•
2412.19645
•
Published
•
13