Sparsh Collection Models and datasets for Sparsh: Self-supervised touch representations for vision-based tactile sensing • 15 items • Updated Oct 24, 2024 • 12
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning Paper • 2412.15797 • Published 17 days ago • 16
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 18 days ago • 70
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces Paper • 2412.14171 • Published 19 days ago • 23
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published Nov 21, 2024 • 58
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation Paper • 2412.10704 • Published 23 days ago • 15
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Paper • 1810.04805 • Published Oct 11, 2018 • 16
Bamba Collection Collection of Bamba models, based on a hybrid Mamba2 architecture and trained on open data • 8 items • Updated 19 days ago • 17
Navarasa Collection Collection of Gemma-finetuned 7B/2B Indic Navarasa models • 4 items • Updated Mar 18, 2024 • 2
Navarasa 2.0 Models Collection Collection of Navarasa 2.0 models finetuned with Gemma on 15 Indian languages • 5 items • Updated Mar 18, 2024 • 18
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions Paper • 2401.01827 • Published Jan 3, 2024 • 16
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published about 1 month ago • 49
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 57
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 13
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 48
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 62