Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.12588

What I don't understand

Gap in knowledge is the place I learn the most from.

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations

Paper • 2408.10920 • Published Aug 20, 2024 • 1
Towards Exact Computation of Inductive Bias

Paper • 2406.15941 • Published Jun 22, 2024
DrawingSpinUp: 3D Animation from Single Character Drawings

Paper • 2409.08615 • Published Sep 13, 2024 • 18
Learning to Move Like Professional Counter-Strike Players

Paper • 2408.13934 • Published Aug 25, 2024 • 23

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 64
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 35
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 16
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model

Paper • 2405.20222 • Published May 30, 2024 • 11
ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation

Paper • 2406.00908 • Published Jun 3, 2024 • 11
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation

Paper • 2406.02509 • Published Jun 4, 2024 • 9
I4VGen: Image as Stepping Stone for Text-to-Video Generation

Paper • 2406.02230 • Published Jun 4, 2024 • 16

aigc acceleration

SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

Paper • 2403.16627 • Published Mar 25, 2024 • 20
Phased Consistency Model

Paper • 2405.18407 • Published May 28, 2024 • 46
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 28
Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published May 20, 2024 • 26

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29, 2024 • 32
AnimateDiff-Lightning: Cross-Model Diffusion Distillation

Paper • 2403.12706 • Published Mar 19, 2024 • 17
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 16

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 40
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 20

about 15 hours ago

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18, 2024 • 15
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18, 2024 • 8
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18, 2024 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19, 2024 • 13

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs