- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 31
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 53
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 38
Collections
Collections including paper arxiv:2310.06825

- open-llm-leaderboard-old/details_TheBloke__Guanaco-3B-Uncensored-v2-GPTQ
  Updated • 97
- open-llm-leaderboard-old/details_TheBloke__WizardLM-13B-V1-1-SuperHOT-8K-GPTQ
  Updated • 149
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- NousResearch/Yarn-Mistral-7b-128k
  Text Generation • Updated • 20.7k • 573

- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 123
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 245
- flax-community/gpt-2-spanish
  Text Generation • Updated • 995 • 27

- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Mistral 7B
  Paper • 2310.06825 • Published • 47

- Llemma: An Open Language Model For Mathematics
  Paper • 2310.10631 • Published • 52
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Qwen Technical Report
  Paper • 2309.16609 • Published • 35
- BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
  Paper • 2309.11568 • Published • 10

- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 11
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 31
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 13

- SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving
  Paper • 2402.02519 • Published
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Optimal Transport Aggregation for Visual Place Recognition
  Paper • 2311.15937 • Published
- GOAT: GO to Any Thing
  Paper • 2311.06430 • Published • 14

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 609
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Don't Make Your LLM an Evaluation Benchmark Cheater
  Paper • 2311.01964 • Published • 1