Collections
Collections including paper arxiv:2403.03507
Collection 1
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 106
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 41
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 53
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
  Paper • 2403.18814 • Published • 46

Collection 2
- Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
  Paper • 2403.06504 • Published • 53
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 184

Collection 3
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 184
- Flora: Low-Rank Adapters Are Secretly Gradient Compressors
  Paper • 2402.03293 • Published • 6
- PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
  Paper • 2401.11316 • Published • 1
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
  Paper • 2405.12130 • Published • 47

Collection 4
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 184
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
  Paper • 2205.05638 • Published • 3
- The Power of Scale for Parameter-Efficient Prompt Tuning
  Paper • 2104.08691 • Published • 10
- In-Context Learning Demonstration Selection via Influence Analysis
  Paper • 2402.11750 • Published • 2

Collection 5
- Scaling Instruction-Finetuned Language Models
  Paper • 2210.11416 • Published • 7
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 139
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 62
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62

Collection 6
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 184
- Mixture-of-Subspaces in Low-Rank Adaptation
  Paper • 2406.11909 • Published • 3
- Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
  Paper • 2406.17660 • Published • 5
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
  Paper • 2407.11239 • Published • 8