OpenCompass

community

https://opencompass.org.cn/

Recent Activity

TracyMc updated a Space 9 days ago

opencompass/Compass_Academic_Leaderboard

KennyUTC updated a Space 13 days ago

opencompass/open_vlm_leaderboard

ZwwWayne authored a paper 17 days ago

Are Your LLMs Capable of Stable Reasoning?

Organization Card

OpenCompass Website • OpenCompass Toolkit

👋 Join us on Discord and WeChat

Follow us on GitHub

OpenCompass is a platform focused on the evaluation of AGI, including large language models and multi-modality models. We aim to:

develop high-quality libraries that reduce the difficulty of evaluation
provide convincing leaderboards that improve the understanding of large models
create powerful toolchains targeting a variety of abilities and tasks
build solid benchmarks to support large model research

spaces 10

Compass Academic Leaderboard

Open VLM Leaderboard

VLMEvalKit Evaluation Results Collection

Open LMM Reasoning Leaderboard

A Leaderboard that demonstrates LMM reasoning capabilities

Open VLM Video Leaderboard

VLMEvalKit Eval Results in video understanding benchmark

CompassJudger Subjective Evaluation Leaderboard

JudgerBench Leaderboard

models 8

opencompass/anah-v2

Text Generation • Updated 24 days ago • 60 • 2

opencompass/CompassJudger-1-14B-Instruct

Text Generation • Updated Oct 30, 2024 • 103 • 2

opencompass/CompassJudger-1-32B-Instruct

Text Generation • Updated Oct 30, 2024 • 179 • 13

opencompass/CompassJudger-1-1.5B-Instruct

Updated Oct 22, 2024 • 144 • 1

opencompass/CompassJudger-1-7B-Instruct

Updated Oct 22, 2024 • 1.25k • 6

opencompass/anah-7b

Text Generation • Updated Jul 3, 2024 • 17

opencompass/anah-20b

Text Generation • Updated Jul 3, 2024 • 17

opencompass/mixtral-8x7b-32k

Updated Dec 10, 2023 • 1

datasets 7

opencompass/mmmlu_lite

Viewer • Updated Nov 1, 2024 • 20k • 51 • 2

opencompass/MMBench-Video

Preview • Updated Oct 9, 2024 • 298 • 7

opencompass/NeedleBench

Viewer • Updated Jul 26, 2024 • 524 • 525 • 3

opencompass/anah

Viewer • Updated Jul 3, 2024 • 783 • 51 • 2

opencompass/flames

Viewer • Updated Apr 22, 2024 • 537 • 37

opencompass/CriticBench

Updated Feb 23, 2024 • 148 • 4

opencompass/MMBench

Updated Sep 13, 2023 • 35 • 1