MATH

math
text
About

MATH benchmark

Evaluation Stats
Total Models: 63
Organizations: 11
Verified Results: 0
Self-Reported: 61
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

63 models
Top Score: 97.9%
Average Score: 66.7%
High Performers (80%+): 14

Top Organizations

#1 DeepSeek: 1 model, 74.7%
#2 OpenAI: 9 models, 74.3%
#3 Amazon: 3 models, 73.1%
#4 Moonshot AI: 1 model, 70.2%
#5 Alibaba: 11 models, 69.1%
Leaderboard
Top 20 models ranked by performance
1. 97.9% (Raw: 0.979), Self-reported
2. 96.4% (Raw: 0.964), Self-reported
3. 89.7% (Raw: 0.897), Self-reported
4. 89.0% (Raw: 0.89), Self-reported
5. 86.8% (Raw: 0.868), Self-reported
6. 86.5% (Raw: 0.865), Self-reported
7. 85.5% (Raw: 0.855), Self-reported
8. 84.7% (Raw: 0.847), Self-reported
9. 83.8% (Raw: 0.838), Self-reported
10. 83.1% (Raw: 0.831), Self-reported
11. 83.1% (Raw: 0.831), Self-reported
12. 82.2% (Raw: 0.822), Self-reported
13. 80.4% (Raw: 0.804), Self-reported
14. 80.0% (Raw: 0.8), Self-reported
15. 78.3% (Raw: 0.783), Self-reported
16. 77.9% (Raw: 0.779), Self-reported
17. 77.0% (Raw: 0.77), Self-reported
18. 76.6% (Raw: 0.766), Self-reported
19. 76.6% (Raw: 0.766), Self-reported
20. 76.1% (Raw: 0.761), Self-reported
Showing top 20 of 63 models
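
The percentages on this page appear to be the raw scores scaled by the maximum score of 1. The following is a minimal Python sketch of that arithmetic, assuming one-decimal rounding and a 0.80 cutoff for the "80%+" group. The names `raw_scores`, `to_percent`, and `HIGH_PERFORMER_THRESHOLD` are illustrative and not part of the leaderboard itself. Only the 20 listed entries are used, so the 66.7% average over all 63 models cannot be reproduced from this data.

```python
# Raw scores of the 20 leaderboard entries listed above (max score is 1).
raw_scores = [
    0.979, 0.964, 0.897, 0.89, 0.868, 0.865, 0.855, 0.847, 0.838, 0.831,
    0.831, 0.822, 0.804, 0.8, 0.783, 0.779, 0.77, 0.766, 0.766, 0.761,
]

MAX_SCORE = 1.0
HIGH_PERFORMER_THRESHOLD = 0.80  # assumed meaning of the "80%+" cutoff


def to_percent(raw, max_score=MAX_SCORE):
    """Convert a raw score to a percentage rounded to one decimal place."""
    return round(100.0 * raw / max_score, 1)


percentages = [to_percent(r) for r in raw_scores]
top_score = max(percentages)
high_performers = sum(1 for r in raw_scores if r >= HIGH_PERFORMER_THRESHOLD)

print(f"Top score: {top_score}%")
print(f"High performers (80%+): {high_performers}")
```

Because every entry below the top 20 scores under 76.1%, counting the listed entries at or above the cutoff is enough to recover the page's high-performer count.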