AGIEval

code
text
About

AGIEval benchmark

Evaluation Stats
Total Models: 5
Organizations: 3
Verified Results: 0
Self-Reported: 5
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

5 models
Top Score: 65.8%
Average Score: 54.3%
High Performers (80%+): 0

Top Organizations

#1 Mistral AI: 2 models, 57.0%
#2 Google: 2 models, 54.0%
#3 IBM: 1 model, 49.3%
Leaderboard
Top 5 models ranked by performance
#1 65.8% (Raw: 0.658), Self-reported
#2 55.1% (Raw: 0.551), Self-reported
#3 52.8% (Raw: 0.528), Self-reported
#4 49.3% (Raw: 0.493), Self-reported
#5 48.3% (Raw: 0.483), Self-reported
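
The summary figures on this page can be reproduced from the five raw leaderboard scores. The sketch below assumes the average is a simple unweighted mean and that "High Performers" counts scores at or above 80% of the max score of 1; the page does not state how it computes either, and the name `summarize` and its `high_threshold` parameter are illustrative only.

```python
from statistics import mean

# Raw scores copied from the leaderboard (fractions of the max score of 1).
raw_scores = [0.658, 0.551, 0.528, 0.493, 0.483]


def summarize(scores, high_threshold=0.80):
    """Return top score, mean score (both as percentages), and the
    number of scores at or above high_threshold.

    Assumption: unweighted mean; threshold taken from the "80%+" label.
    """
    return {
        "top_pct": round(max(scores) * 100, 1),
        "avg_pct": round(mean(scores) * 100, 1),
        "high_performers": sum(s >= high_threshold for s in scores),
    }


summary = summarize(raw_scores)
print(summary)
```

Under these assumptions the result matches the Top Score, Average Score, and High Performers values listed under Score Distribution.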