BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16, 2024 • 44
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree Paper • 2410.16268 • Published Oct 21, 2024 • 66
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 17, 2024 • 53
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10, 2024 • 45
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models Paper • 2402.14800 • Published Feb 22, 2024 • 3
TerDiT: Ternary Diffusion Models with Transformers Paper • 2405.14854 • Published May 23, 2024 • 2
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models Paper • 2405.16057 • Published May 25, 2024
ThinK: Thinner Key Cache by Query-Driven Pruning Paper • 2407.21018 • Published Jul 30, 2024 • 31
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Paper • 2407.07895 • Published Jul 10, 2024 • 40
timm/ViT-SO400M-14-SigLIP-384 Zero-Shot Image Classification • Updated Oct 27, 2023 • 17.3k • 79
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning Paper • 2407.00782 • Published Jun 30, 2024 • 23
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published Jun 10, 2024 • 36
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 254