MMMLU
general
text
About
MMMLU benchmark
Evaluation Stats
Total Models: 13
Organizations: 4
Verified Results: 0
Self-Reported: 13
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 13 models
Top Score: 98.4%
Average Score: 81.4%
High Performers (80%+): 9
Top Organizations
#1 Anthropic: 4 models, 90.0%
#2 Alibaba: 1 model, 86.7%
#3 OpenAI: 6 models, 81.2%
#4 Microsoft: 2 models, 62.7%
Leaderboard
Top 13 models ranked by performance
98.4% (raw: 0.984), self-reported
88.8% (raw: 0.888), self-reported
86.7% (raw: 0.867), self-reported
86.5% (raw: 0.865), self-reported
86.1% (raw: 0.861), self-reported
#10: 78.5% (raw: 0.785), self-reported
69.9% (raw: 0.699), self-reported
#12: 66.9% (raw: 0.669), self-reported
55.4% (raw: 0.554), self-reported
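
Each leaderboard entry pairs a percentage with a raw score on the 0 to 1 scale given under Max Score. The sketch below shows one way the two could relate, and how an 80%+ cutoff might be counted. It is only an illustration: the names `VISIBLE_RAW_SCORES`, `to_percent`, and `count_high_performers` are hypothetical, the cutoff is assumed to be inclusive, and the list holds only the nine scores visible in this section, so its aggregates will not reproduce the page-level figures for all 13 models.

```python
# Raw scores copied from the leaderboard entries visible in this section.
# Only 9 of the 13 listed models appear here, so aggregates computed from
# this list will not match the page-level summary statistics.
VISIBLE_RAW_SCORES = [0.984, 0.888, 0.867, 0.865, 0.861, 0.785, 0.699, 0.669, 0.554]

MAX_SCORE = 1.0                  # "Max Score" from Benchmark Details
HIGH_PERFORMER_THRESHOLD = 0.80  # the "80%+" cutoff, assumed inclusive


def to_percent(raw: float, max_score: float = MAX_SCORE) -> str:
    """Format a raw score as a percentage with one decimal place."""
    return f"{raw / max_score:.1%}"


def count_high_performers(scores, threshold: float = HIGH_PERFORMER_THRESHOLD) -> int:
    """Count scores at or above the threshold."""
    return sum(1 for score in scores if score >= threshold)


if __name__ == "__main__":
    for raw in VISIBLE_RAW_SCORES:
        print(f"{to_percent(raw)} (raw: {raw})")
    print("Visible entries at 80%+:", count_high_performers(VISIBLE_RAW_SCORES))
```

Run on the visible entries alone, the count comes to 5. The page reports 9 high performers across all 13 models, so the difference would have to come from entries not shown here.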