Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2411.05001

about 15 hours ago

iVideoGPT: Interactive VideoGPTs are Scalable World Models

Paper • 2405.15223 • Published May 24, 2024 • 12
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 53
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 87
Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27, 2024 • 31

about 2 hours ago

Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant

Paper • 2410.13360 • Published Oct 17, 2024 • 8
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning

Paper • 2411.18203 • Published Nov 27, 2024 • 33
Towards Interpreting Visual Information Processing in Vision-Language Models

Paper • 2410.07149 • Published Oct 9, 2024 • 1
Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Paper • 2407.02477 • Published Jul 2, 2024 • 21

Multimodal Analysis

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models

Paper • 2411.14982 • Published Nov 22, 2024 • 16
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Paper • 2411.17686 • Published Nov 26, 2024 • 19

Papers - Image - Datasets - XM-3600

Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset

Paper • 2205.12522 • Published May 25, 2022 • 2
Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23

Papers - Text - Tokens - Vocabulary - Herdan’s Law

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23

Papers - Text - Tokens - Vocabulary - Heaps Law

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23
Heaps' law and Heaps functions in tagged texts: Evidences of their linguistic relevance

Paper • 2001.02178 • Published Jan 7, 2020 • 1

Papers - Image - Visual Tokens

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23

Papers - Image - Zipf

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 23

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 44
AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21, 2024 • 59
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Paper • 2410.12787 • Published Oct 16, 2024 • 31
LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2, 2024 • 26

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published Sep 16, 2024 • 41
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Paper • 2409.11242 • Published Sep 17, 2024 • 5
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published Sep 17, 2024 • 22
On the Diagram of Thought

Paper • 2409.10038 • Published Sep 16, 2024 • 13

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs