

Collections including paper arxiv:2404.19296

Anychat

Space • Running on CPU Upgrade • 1.41k

Qwen2.5 Coder Artifacts

Space • Running • 257

QwQ-32B-Preview

Space • Running • 868

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

Space • Running on CPU Upgrade • 12.3k

Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend.

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 40
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 43

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 127
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 607
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 126
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 107

Papers - Ensemble

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117

Papers - Nexa AI

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117

Papers - Octopus

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117

Papers - Agent - Tasks

LEGENT: Open Platform for Embodied Agents

Paper • 2404.18243 • Published Apr 28, 2024 • 22
Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations

Paper • 2404.17521 • Published Apr 26, 2024 • 13
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 14

Specific Models

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 255
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published 25 days ago • 25

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 607
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 96
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104
TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14, 2024 • 44

On the Scalability of GNNs for Molecular Graphs

Paper • 2404.11568 • Published Apr 17, 2024 • 1
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
Architectures of Topological Deep Learning: A Survey on Topological Neural Networks

Paper • 2304.10031 • Published Apr 20, 2023 • 3
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Paper • 2408.14608 • Published Aug 26, 2024 • 8
