-
Analyzing The Language of Visual Tokens
Paper • 2411.05001 • Published • 22 -
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
Paper • 2411.14982 • Published • 16 -
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
Paper • 2411.17686 • Published • 19
Jaehyun Jun
btjhjeon
AI & ML interests
Multimodal
Recent Activity
updated
a collection
about 18 hours ago
Multimodal Alignment
upvoted
a
paper
about 18 hours ago
MLLM-as-a-Judge for Image Safety without Human Labeling
updated
a collection
about 19 hours ago
Multimodal Dataset
Organizations
Collections
8
-
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Paper • 2410.17637 • Published • 34 -
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Paper • 2411.10442 • Published • 68 -
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Paper • 2411.18203 • Published • 32 -
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 21
models
None public yet
datasets
None public yet