Gen AI Diffusion - a Stalin16 Collection

Stalin16 's Collections

Data and other things

Gen AI Diffusion

Gen AI Diffusion

updated 4 days ago

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14, 2024 • 54
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 41
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

Paper • 2410.06244 • Published Oct 8, 2024 • 19
How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 33
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 25
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions

Paper • 2411.02394 • Published Nov 4, 2024 • 17
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 77
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Paper • 2410.10812 • Published Oct 14, 2024 • 16
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

Paper • 2410.13830 • Published Oct 17, 2024 • 24
SVDQunat: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published Nov 7, 2024 • 16
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published Nov 11, 2024 • 63
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published Nov 11, 2024 • 46
MagicQuill: An Intelligent Interactive Image Editing System

Paper • 2411.09703 • Published Nov 14, 2024 • 62
AnimateAnything: Consistent and Controllable Animation for Video Generation

Paper • 2411.10836 • Published Nov 16, 2024 • 23
Stylecodes: Encoding Stylistic Information For Image Generation

Paper • 2411.12811 • Published Nov 19, 2024 • 11
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9
Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published Nov 22, 2024 • 36
OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published Nov 22, 2024 • 53
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Paper • 2412.01169 • Published Dec 2, 2024 • 11
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance

Paper • 2412.02687 • Published Dec 3, 2024 • 108
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Paper • 2412.02030 • Published Dec 2, 2024 • 18
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Paper • 2412.05355 • Published 28 days ago • 7
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Paper • 2412.04431 • Published 29 days ago • 17
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 25 days ago • 46
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Paper • 2412.07674 • Published 24 days ago • 20
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Paper • 2412.07774 • Published 24 days ago • 25
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation

Paper • 2412.05148 • Published 29 days ago • 11
ObjCtrl-2.5D: Training-free Object Control with Camera Poses

Paper • 2412.07721 • Published 24 days ago • 8
StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published 24 days ago • 19
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Paper • 2412.08629 • Published 23 days ago • 11
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation

Paper • 2412.09349 • Published 23 days ago • 8
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

Paper • 2412.08580 • Published 23 days ago • 45
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Paper • 2412.09618 • Published 22 days ago • 21
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Paper • 2412.09622 • Published 22 days ago • 7
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Paper • 2412.09626 • Published 22 days ago • 19
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Paper • 2412.15213 • Published 15 days ago • 25
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 21 days ago • 33
1.58-bit FLUX

Paper • 2412.18653 • Published 10 days ago • 63
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition

Paper • 2412.19712 • Published 8 days ago • 14
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models

Paper • 2412.19645 • Published 8 days ago • 13