Collections
Collections including paper arxiv:2404.19296

- VILA^2: VILA Augmented VILA
  Paper • 2407.17453 • Published • 40
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- Octo-planner: On-device Language Model for Planner-Action Agents
  Paper • 2406.18082 • Published • 48
- Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
  Paper • 2408.15518 • Published • 43

- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 127
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 607
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 126
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 107

- LEGENT: Open Platform for Embodied Agents
  Paper • 2404.18243 • Published • 22
- Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations
  Paper • 2404.17521 • Published • 13
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
  Paper • 2402.15506 • Published • 14

- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 255
- How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
  Paper • 2404.14047 • Published • 45
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- DeepSeek-V3 Technical Report
  Paper • 2412.19437 • Published • 25

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 607
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 96
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- TransformerFAM: Feedback attention is working memory
  Paper • 2404.09173 • Published • 44

- On the Scalability of GNNs for Molecular Graphs
  Paper • 2404.11568 • Published • 1
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- Architectures of Topological Deep Learning: A Survey on Topological Neural Networks
  Paper • 2304.10031 • Published • 3
- Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
  Paper • 2408.14608 • Published • 8