Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies • Paper 2404.08197 • Published Apr 12, 2024 • 28 upvotes
UniFL: Improve Stable Diffusion via Unified Feedback Learning • Paper 2404.05595 • Published Apr 8, 2024 • 24 upvotes
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators • Paper 2404.05014 • Published Apr 7, 2024 • 33 upvotes
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding • Paper 2404.05726 • Published Apr 8, 2024 • 21 upvotes
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs • Paper 2404.05719 • Published Apr 8, 2024 • 83 upvotes
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing • Paper 2404.05717 • Published Apr 8, 2024 • 25 upvotes
Aligning Diffusion Models by Optimizing Human Utility • Paper 2404.04465 • Published Apr 6, 2024 • 14 upvotes
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion • Paper 2404.04544 • Published Apr 6, 2024 • 21 upvotes
ByteEdit: Boost, Comply and Accelerate Generative Image Editing • Paper 2404.04860 • Published Apr 7, 2024 • 25 upvotes
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation • Paper 2404.05674 • Published Apr 8, 2024 • 14 upvotes
PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations • Paper 2404.04421 • Published Apr 5, 2024 • 17 upvotes
MultiBooth: Towards Generating All Your Concepts in an Image from Text • Paper 2404.14239 • Published Apr 22, 2024 • 9 upvotes
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation • Paper 2404.14396 • Published Apr 22, 2024 • 19 upvotes
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • Paper 2404.14219 • Published Apr 22, 2024 • 254 upvotes