FrontierMath

math
text
About

FrontierMath benchmark

Evaluation Stats
Total Models: 6
Organizations: 1
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

6 models
Top Score: 26.3%
Average Score: 14.8%
High Performers (80%+): 0

Top Organizations

#1 OpenAI: 6 models, average score 14.8%
Leaderboard
Top 6 models ranked by performance
#1 26.3% (raw: 0.263, self-reported)
#2 22.1% (raw: 0.221, self-reported)
#3 15.8% (raw: 0.158, self-reported)
#4 9.6% (raw: 0.096, self-reported)
#5 9.2% (raw: 0.092, self-reported)
#6 5.5% (raw: 0.055, self-reported)
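
The summary figures in the Performance Overview can be cross-checked against the raw leaderboard values. The following is a minimal sketch, assuming the average is a plain arithmetic mean of the six raw scores and that "High Performers (80%+)" counts scores at or above 0.80 of the max score of 1. The function name `summarize` and the parameter `high_threshold` are hypothetical and chosen only for illustration; the site's actual calculation is not documented here.

```python
# Raw scores copied from the leaderboard (fractions of the max score of 1).
raw_scores = [0.263, 0.221, 0.158, 0.096, 0.092, 0.055]


def summarize(scores, high_threshold=0.80):
    """Return (top score, mean score, count of scores >= high_threshold).

    Assumption: the page's "Average Score" is an unweighted mean and
    "High Performers (80%+)" uses an inclusive 0.80 cutoff.
    """
    top = max(scores)
    average = sum(scores) / len(scores)
    high_performers = sum(1 for s in scores if s >= high_threshold)
    return top, average, high_performers


top, average, high = summarize(raw_scores)
print(f"Top: {top:.3f}  Average: {average:.4f}  High performers: {high}")
```

Under these assumptions the mean of the six raw scores is about 0.1475, which is consistent with the 14.8% average shown above after rounding to one decimal place, and no score reaches the 80% threshold.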