50 Leaderboards, Benchs, GGUF Tools, and Utilities
GGUF, prompt generation, and repo tools come first, followed by "Bench" and "Leaderboards". The leaderboards get more specific further down the list. See also: "Run LLMs ..." collection.
- Running (335)
- Running (196)
PROMPT++
Refine your prompts
- Running on A10G (1.08k)
GGUF My Repo
- Running on CPU Upgrade (29)
Nexa AI GGUF Convertor
- Running (34)
GGUF Editor
- Running on CPU Upgrade (23)
GGUF My Lora
Convert your PEFT LoRA into GGUF
- Running (228)
Repo duplicator
- Running (32)
MLX My Repo
- Runtime error (1.22k)
ChatGPT Prompt Generator
- Running (21)
EQ Bench
Note: A specialized benchmark for evaluating a model's creativity. It shows the test outputs alongside the judgements and ratings, including a rating of the model's "emotional intelligence".
- Running (549)
UGI Leaderboard
Note: UGI stands for Uncensored General Intelligence. Another great source for creative and/or role-play models.
- Running on CPU Upgrade (12.2k)
Open LLM Leaderboard
Track, rank and evaluate open LLMs and chatbots
- Running (178)
MT Bench
- Running (160)
GPU Poor LLM Arena
Compact LLM Battle Arena: Frugal AI Face-Off!
- Running (268)
LLM Performance Leaderboard
- Running on CPU Upgrade (78)
Open LLM Leaderboard Model Comparator
Compare Open LLM Leaderboard results
- Running (184)
Yet Another LLM Leaderboard
- Running (3.83k)
Chatbot Arena Leaderboard
- Running (17)
JudgeBench Leaderboard
- Running (309)
Reward Bench Leaderboard
- Running on CPU Upgrade (4.49k)
MTEB Leaderboard
- Running (27)
MEGA-Bench
A leaderboard for multimodal models
- Running (398)
LLM-Perf Leaderboard
- Running on CPU Upgrade (50)
Open CoT Leaderboard
Track, rank and evaluate open LLMs' CoT quality
- Running (8)
Open CoT Dashboard
- Running (157)
Low-bit Quantized Open LLM Leaderboard
Track, rank and evaluate open LLMs and chatbots
- Runtime error (29)
Open LLM Leaderboard for domains
Ranking for Open-sourced LLMs in different domains
- Running on CPU Upgrade (173)
MMLU Pro
More advanced and challenging multi-task evaluation
- Running (134)
TTS Spaces Arena
Vote on the top HF TTS models!
- Running (157)
BigCodeBench Leaderboard
- Running (1.05k)
Big Code Models Leaderboard
- Running (317)
Text To Image Leaderboard
- Running (17)
WhisperKit Benchmarks
- Running on CPU Upgrade (582)
Open ASR Leaderboard
- Running (105)
Ocrbench Leaderboard
- Running (37)
Video Generation Leaderboard
Leaderboard and arena of Video Generation models
- Running on CPU Upgrade (558)
Open VLM Leaderboard
VLMEvalKit Evaluation Results Collection
- Running (32)
MVBench Leaderboard
- Running on CPU Upgrade (316)
Open Medical-LLM Leaderboard
- Running on CPU Upgrade (504)
Open Ko-LLM Leaderboard
- Running (10)
Q-Bench+ Leaderboard
- Running on CPU Upgrade (101)
Open Chinese LLM Leaderboard
- Running on CPU Upgrade (53)
Open PL LLM Leaderboard
- Running on CPU Upgrade (126)
Hallucinations Leaderboard
- Running on CPU Upgrade (64)
AIR-Bench Leaderboard
- Runtime error (105)
Enterprise Scenarios Leaderboard
- Running on CPU Upgrade (85)
LLM Safety Leaderboard
- Running on CPU Upgrade (44)
OpenLLM Turkish leaderboard v0.2