Are Vision-Language Models Truly Understanding Multi-vision Sensor? • arXiv:2412.20750 • Published Dec 2024 • 17 upvotes
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions • arXiv:2412.09596 • Published Dec 2024 • 92 upvotes
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale • arXiv:2412.05237 • Published Dec 2024 • 46 upvotes
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases • arXiv:2412.04862 • Published Dec 2024 • 49 upvotes
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling • arXiv:2412.05271 • Published Dec 2024 • 123 upvotes
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models • arXiv:2412.01822 • Published Dec 2, 2024 • 14 upvotes
LLaVA-o1: Let Vision Language Models Reason Step-by-Step • arXiv:2411.10440 • Published Nov 15, 2024 • 112 upvotes
Intriguing Properties of Large Language and Vision Models • arXiv:2410.04751 • Published Oct 7, 2024 • 16 upvotes
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models • arXiv:2409.17066 • Published Sep 25, 2024 • 28 upvotes
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models • arXiv:2409.17481 • Published Sep 26, 2024 • 46 upvotes
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models • arXiv:2409.17146 • Published Sep 25, 2024 • 106 upvotes