zhaoyuzhong

callsys

AI & ML interests

computer vision

Recent Activity

upvoted a paper about 1 month ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

upvoted a paper about 1 month ago

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Organizations

None yet

callsys's activity

upvoted 2 papers about 1 month ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 23

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Paper • 2411.19108 • Published Nov 28, 2024 • 17

New activity in microsoft/kosmos-2.5 4 months ago

Upload receipt_00008.png

#10 opened 4 months ago by

change image

#9 opened 4 months ago by

liked a model 4 months ago

microsoft/kosmos-2.5-chat

Updated Aug 28, 2024 • 205 • 10

New activity in microsoft/kosmos-2.5-chat 4 months ago

checkpoint

#1 opened 4 months ago by

upvoted a paper 7 months ago

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

Paper • 2405.16071 • Published May 25, 2024 • 2