-
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks
Paper • 2406.12925 • Published • 24 -
Scaling Laws for Linear Complexity Language Models
Paper • 2406.16690 • Published • 23 -
DiffusionPDE: Generative PDE-Solving Under Partial Observation
Paper • 2406.17763 • Published • 24 -
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
Paper • 2407.01494 • Published • 13
Collections
Discover the best community collections!
Collections including paper arxiv:2407.01494
-
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation
Paper • 2405.18503 • Published • 9 -
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
Paper • 2405.20289 • Published • 11 -
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
Paper • 2406.02897 • Published • 14 -
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
Paper • 2406.03344 • Published • 19
-
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 20 -
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper • 2403.07508 • Published • 75 -
DragAnything: Motion Control for Anything using Entity Representation
Paper • 2403.07420 • Published • 14 -
Learning and Leveraging World Models in Visual Representation Learning
Paper • 2403.00504 • Published • 32
-
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 21 -
Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2 -
Streaming Transformer ASR with Blockwise Synchronous Beam Search
Paper • 2006.14941 • Published • 2 -
NU-GAN: High resolution neural upsampling with GAN
Paper • 2010.11362 • Published • 2
-
U-Net: Convolutional Networks for Biomedical Image Segmentation
Paper • 1505.04597 • Published • 9 -
Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
Paper • 2310.16186 • Published • 2 -
H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
Paper • 1709.07330 • Published • 2 -
Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
Paper • 1801.08599 • Published • 2