Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models Paper • 2501.11873 • Published Jan 21, 2025 • 63
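The load-balancing loss (LBL) this paper dissects is typically the Switch-Transformer-style auxiliary term N_E · Σ_i f_i · p_i, where f_i is the fraction of tokens dispatched to expert i and p_i the mean router probability for that expert; the paper's focus is computing it over global rather than micro batches. A minimal PyTorch sketch of the standard per-batch version (names are illustrative):

```python
import torch

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int,
                        top_k: int = 1) -> torch.Tensor:
    """Switch-style auxiliary LBL: num_experts * sum_i f_i * p_i.

    router_logits: (num_tokens, num_experts) pre-softmax router scores.
    f_i = fraction of tokens whose top-k routing includes expert i.
    p_i = mean router probability assigned to expert i.
    """
    probs = router_logits.softmax(dim=-1)                  # (T, E)
    topk = probs.topk(top_k, dim=-1).indices               # (T, k)
    mask = torch.zeros_like(probs).scatter_(1, topk, 1.0)  # one-hot dispatch
    f = mask.mean(dim=0)                                   # dispatch fraction per expert
    p = probs.mean(dim=0)                                  # mean router prob per expert
    return num_experts * torch.sum(f * p)

# Toy usage: 8 tokens routed over 4 experts.
loss = load_balancing_loss(torch.randn(8, 4), num_experts=4)
```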
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 286
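DeepSeek-R1's RL stage uses GRPO, whose key trick is replacing a learned value baseline with group-relative advantages: sample a group of responses per prompt, then normalize each response's reward against the group's mean and standard deviation. A minimal sketch of that advantage computation (the full objective adds a clipped policy ratio and a KL penalty, omitted here):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages: A_i = (r_i - mean(r)) / (std(r) + eps).

    rewards: (num_prompts, group_size) scalar rewards, one row per prompt,
    one column per sampled response to that prompt.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Toy usage: 2 prompts, 4 sampled responses each.
adv = grpo_advantages(torch.tensor([[1.0, 0.0, 1.0, 0.0],
                                    [0.2, 0.9, 0.4, 0.5]]))
```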
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 78
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 125
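One of ModernBERT's efficiency levers is alternating attention: most layers use local sliding-window attention, with full global attention only every few layers. A sketch of building the corresponding boolean masks (the window size and layer period here mirror the paper's reported 128-token window and every-third-layer global attention, but treat them as illustrative defaults):

```python
import torch

def attention_mask(seq_len: int, layer_idx: int, window: int = 128,
                   global_every: int = 3) -> torch.Tensor:
    """True where attention is allowed. Every `global_every`-th layer is
    fully global; other layers see only a +/- window//2 local band."""
    if layer_idx % global_every == 0:
        return torch.ones(seq_len, seq_len, dtype=torch.bool)
    i = torch.arange(seq_len)
    return (i[:, None] - i[None, :]).abs() <= window // 2

# Layer 0 is global; layers 1-2 see a narrow local band.
local = attention_mask(16, layer_idx=1, window=4)
```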
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published Dec 16, 2024 • 41
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
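The Byte Latent Transformer groups raw bytes into variable-length patches instead of fixed tokens, opening a new patch wherever a small byte-level model's next-byte entropy is high, i.e., where the stream becomes hard to predict. A sketch of that segmentation step, assuming per-position entropies are already computed:

```python
from typing import List

def entropy_patches(byte_seq: bytes, entropies: List[float],
                    threshold: float = 2.0) -> List[bytes]:
    """Start a new patch at every position whose next-byte entropy
    exceeds `threshold`; predictable runs collapse into long patches."""
    patches, start = [], 0
    for i in range(1, len(byte_seq)):
        if entropies[i] > threshold:
            patches.append(byte_seq[start:i])
            start = i
    patches.append(byte_seq[start:])
    return patches

# Toy usage with hand-made entropies: boundaries land at positions 3 and 6.
print(entropy_patches(b"aaabbbccc",
                      [0.1, 0.2, 0.1, 3.0, 0.2, 0.1, 3.1, 0.2, 0.1]))
```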
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 48
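At the core of an acoustic discrete codec tokenizer like WavTokenizer is a vector-quantization step: each encoder frame is snapped to its nearest codebook entry, and that entry's index becomes the audio "token". A minimal sketch of the lookup (WavTokenizer's quantizer differs in its specifics, notably using a single quantizer with an enlarged codebook rather than a residual stack):

```python
import torch

def vector_quantize(frames: torch.Tensor, codebook: torch.Tensor):
    """frames: (T, D) encoder outputs; codebook: (K, D) learned entries.
    Returns (indices, quantized): nearest entry per frame and its vector."""
    dists = torch.cdist(frames, codebook)   # (T, K) pairwise L2 distances
    indices = dists.argmin(dim=-1)          # (T,) discrete audio tokens
    return indices, codebook[indices]

# Toy usage: 5 frames of dim 8 against a 16-entry codebook.
idx, quantized = vector_quantize(torch.randn(5, 8), torch.randn(16, 8))
```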
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 58
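Minitron pairs structured pruning with distillation: the pruned student is retrained to match the teacher's output distribution. A minimal sketch of the temperature-scaled KL distillation loss commonly used for this (the paper also distills intermediate states, omitted here):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) over softened distributions, scaled by T^2
    so gradient magnitudes stay comparable across temperatures."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2

# Toy usage: batch of 4 positions over a 10-way vocabulary.
loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10))
```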
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 130
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Paper • 2404.18911 • Published Apr 29, 2024 • 30
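Kangaroo's draft-then-verify loop is essentially the same early-exit self-speculative scheme sketched after the LayerSkip entry below.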
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 127
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25, 2024 • 77
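Kangaroo (above) and LayerSkip share one decoding loop: draft several tokens cheaply with the model's early layers, then verify them in a single pass through the full model, keeping the longest agreeing prefix. A sketch of the greedy verification step, with `draft_model` / `full_model` as stand-in callables for the shallow and deep paths:

```python
import torch

@torch.no_grad()
def self_speculative_step(draft_model, full_model, prefix: torch.Tensor,
                          num_draft: int = 4) -> torch.Tensor:
    """Greedy self-speculative decoding: draft with the shallow path,
    verify with one full pass, keep the longest agreeing prefix."""
    tokens = prefix                                   # shape (1, L)
    for _ in range(num_draft):                        # cheap early-exit drafts
        nxt = draft_model(tokens)[..., -1, :].argmax(-1, keepdim=True)
        tokens = torch.cat([tokens, nxt], dim=-1)
    logits = full_model(tokens)                       # one full pass, (1, L+k, V)
    # Full-model predictions for each drafted position plus one bonus token.
    full_preds = logits[..., -num_draft - 1:, :].argmax(-1)  # (1, k+1)
    drafted = tokens[..., -num_draft:]                       # (1, k)
    agree = (full_preds[..., :-1] == drafted).cumprod(-1)    # 1s until mismatch
    n_accept = int(agree.sum())
    kept = tokens[..., : prefix.shape[-1] + n_accept]
    # Append the full model's token at the first mismatch (or the bonus token).
    return torch.cat([kept, full_preds[..., n_accept:n_accept + 1]], dim=-1)
```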
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 256
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 184
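GaLore cuts optimizer memory by keeping Adam's moment statistics in a low-rank space: the gradient is projected through a matrix P taken from its SVD, updated there, and projected back before touching the full-rank weights. A condensed single-matrix sketch (the real method refreshes P periodically and chooses the projection side by shape):

```python
import torch

def galore_step(weight, grad, state, lr=1e-3, rank=4,
                betas=(0.9, 0.999), eps=1e-8):
    """One GaLore-style update: Adam moments live in the rank-r space."""
    if "P" not in state:                        # projector from the gradient's SVD
        U, _, _ = torch.linalg.svd(grad, full_matrices=False)
        state["P"] = U[:, :rank]                # (m, r) left singular vectors
        state["m"] = torch.zeros(rank, grad.shape[1])
        state["v"] = torch.zeros(rank, grad.shape[1])
        state["t"] = 0
    P = state["P"]
    g = P.T @ grad                              # project gradient down: (r, n)
    state["t"] += 1
    state["m"] = betas[0] * state["m"] + (1 - betas[0]) * g
    state["v"] = betas[1] * state["v"] + (1 - betas[1]) * g * g
    m_hat = state["m"] / (1 - betas[0] ** state["t"])   # bias correction
    v_hat = state["v"] / (1 - betas[1] ** state["t"])
    update = P @ (m_hat / (v_hat.sqrt() + eps))  # project back to full rank
    weight -= lr * update

# Toy usage on a 16x8 weight matrix.
W, opt_state = torch.randn(16, 8), {}
galore_step(W, torch.randn(16, 8), opt_state)
```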
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning Paper • 2309.02591 • Published Sep 5, 2023 • 15