Collections

Collections including paper arxiv:2401.01808

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 28
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Aug 16, 2024 • 1.17M • 7.75k
Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • Updated about 1 month ago • 1.6M • 1.01k
zer0int/CLIP-GmP-ViT-L-14

Zero-Shot Image Classification • Updated Sep 23, 2024 • 4.98k • 362

computer vision papers 👓

Rich feature hierarchies for accurate object detection and semantic segmentation

Paper • 1311.2524 • Published Nov 11, 2013 • 1
DeepPose: Human Pose Estimation via Deep Neural Networks

Paper • 1312.4659 • Published Dec 17, 2013 • 1
Generative Adversarial Networks

Paper • 1406.2661 • Published Jun 10, 2014 • 2
scikit-image: Image processing in Python

Paper • 1407.6245 • Published Jul 23, 2014 • 1

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 28

Image generation

Boundary Attention: Learning to Find Faint Boundaries at Any Resolution

Paper • 2401.00935 • Published Jan 1, 2024 • 17
Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Paper • 2401.00909 • Published Dec 31, 2023 • 9
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

Paper • 2401.01117 • Published Jan 2, 2024 • 8
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data

Paper • 2401.01173 • Published Jan 2, 2024 • 11

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 28
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 27
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity

Paper • 2401.00604 • Published Dec 31, 2023 • 4
LARP: Language-Agent Role Play for Open-World Games

Paper • 2312.17653 • Published Dec 24, 2023 • 31

Vision Foundation Models

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 28
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

Paper • 2401.05252 • Published Jan 10, 2024 • 47
Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16, 2024 • 36
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59

GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis

Paper • 2312.02155 • Published Dec 4, 2023 • 12
LivePhoto: Real Image Animation with Text-guided Motion Control

Paper • 2312.02928 • Published Dec 5, 2023 • 16
FaceStudio: Put Your Face Everywhere in Seconds

Paper • 2312.02663 • Published Dec 5, 2023 • 30
aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 28
