Zhuofan Zong

zongzhuofan

AI & ML interests

None yet

Recent Activity

updated a model 8 days ago

zongzhuofan/co-detr-vit-large-coco-instance

authored a paper 18 days ago

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

upvoted a paper 20 days ago

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

zongzhuofan's activity

upvoted 2 papers 20 days ago

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Paper • 2412.11279 • Published 22 days ago • 12

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published 21 days ago • 23

upvoted a paper 24 days ago

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Paper • 2412.09618 • Published 25 days ago • 21

upvoted a paper 25 days ago

StreamChat: Chatting with Streaming Video

Paper • 2412.08646 • Published 26 days ago • 17

upvoted a collection 2 months ago

Co-DETR

State-of-the-art detection and segmentation models. • 5 items • Updated Nov 3, 2024 • 1

upvoted a paper 4 months ago

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

Paper • 2409.12959 • Published Sep 19, 2024 • 37

upvoted 2 papers 7 months ago

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Paper • 2404.13046 • Published Apr 19, 2024 • 1

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Paper • 2406.11831 • Published Jun 17, 2024 • 21

upvoted a paper 9 months ago

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

Paper • 2305.18295 • Published May 29, 2023 • 7

upvoted a paper 10 months ago

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 51