Yuhui Zhang

yuhuizhang

https://cs.stanford.edu/~yuhuiz/

AI & ML interests

ML, NLP, CV

Organizations

None yet

yuhuizhang's activity

liked a dataset 8 days ago

suyc21/VMCBench

Viewer • Updated 26 days ago • 9.02k • 107 • 2

upvoted a paper 8 days ago

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published 9 days ago • 21

authored a paper 8 days ago

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published 9 days ago • 21

upvoted a paper 18 days ago

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Paper • 2501.07171 • Published 19 days ago • 49

authored 2 papers 18 days ago

Why are Visually-Grounded Language Models Bad at Image Classification?

Paper • 2405.18415 • Published May 28, 2024

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Paper • 2501.03225 • Published 26 days ago • 7

upvoted a paper 25 days ago

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Paper • 2501.03225 • Published 26 days ago • 7

commented on a paper 25 days ago

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Paper • 2501.03225 • Published 26 days ago • 7

updated a Space about 1 month ago

AutoConverter

Space • Running

upvoted a paper about 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

updated a Space 3 months ago

TMLRReview

Space • Sleeping

upvoted a paper 7 months ago

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Paper • 2407.06189 • Published Jul 8, 2024 • 26

authored 4 papers 7 months ago

Can large language models provide useful feedback on research papers? A large-scale empirical analysis

Paper • 2310.01783 • Published Oct 3, 2023 • 1

Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data

Paper • 2401.08567 • Published Jan 16, 2024

μ-Bench: A Vision-Language Benchmark for Microscopy Understanding

Paper • 2407.01791 • Published Jul 1, 2024 • 5

Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation

Paper • 2311.16201 • Published Nov 27, 2023

authored a paper 11 months ago

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Paper • 2403.10517 • Published Mar 15, 2024 • 33

upvoted a paper 11 months ago

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Paper • 2403.10517 • Published Mar 15, 2024 • 33

updated 2 models 11 months ago

yuhuizhang/finetuned_gpt2_pretrainedTrue_mrpc_new_epochs20

Text Generation • Updated Mar 14, 2024 • 5

yuhuizhang/finetuned_gpt2-medium_pretrainedTrue_cola_epochs3

Updated Mar 14, 2024