TinyFusion: Diffusion Transformers Learned Shallow Paper • 2412.01199 • Published Dec 2, 2024 • 14
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient Paper • 2411.17787 • Published Nov 26, 2024 • 11
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation Paper • 2312.13108 • Published Dec 20, 2023 • 3
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use Paper • 2411.10323 • Published Nov 15, 2024 • 31
PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis Paper • 2408.09481 • Published Aug 18, 2024 • 1
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising Paper • 2406.06911 • Published Jun 11, 2024 • 10
GFlow: Recovering 4D World from Monocular Video Paper • 2405.18426 • Published May 28, 2024 • 15
DeepCache: Accelerating Diffusion Models for Free Paper • 2312.00858 • Published Dec 1, 2023 • 21
AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning Paper • 2311.00257 • Published Nov 1, 2023 • 8
UniVTG: Towards Unified Video-Language Temporal Grounding Paper • 2307.16715 • Published Jul 31, 2023 • 11
Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity Paper • 2305.11675 • Published May 19, 2023 • 1
MetaFormer Is Actually What You Need for Vision Paper • 2111.11418 • Published Nov 22, 2021 • 1