IFEval
code
text
About
IFEval benchmark
Evaluation Stats
Total Models: 37
Organizations: 12
Verified Results: 0
Self-Reported: 37
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution (37 models)
Top Score: 93.9%
Average Score: 83.2%
High Performers (80%+): 28
Top Organizations
#1 Anthropic (1 model): 93.2%
#2 Amazon (3 models): 89.7%
#3 Moonshot AI (2 models): 88.5%
#4 Google (4 models): 87.4%
#5 Meta (5 models): 85.2%
Leaderboard
Top 20 models ranked by performance
93.2% (Raw: 0.932, Self-reported)
92.1% (Raw: 0.921, Self-reported)
90.4% (Raw: 0.904, Self-reported)
#6 90.2% (Raw: 0.902, Self-reported)
89.8% (Raw: 0.898, Self-reported)
89.5% (Raw: 0.8945, Self-reported)
#10 88.9% (Raw: 0.889, Self-reported)
88.7% (Raw: 0.887, Self-reported)
88.6% (Raw: 0.886, Self-reported)
87.5% (Raw: 0.875, Self-reported)
#16 87.2% (Raw: 0.872, Self-reported)
#17 87.2% (Raw: 0.872, Self-reported)
#18 86.1% (Raw: 0.861, Self-reported)
84.9% (Raw: 0.849, Self-reported)
84.1% (Raw: 0.841, Self-reported)
Showing top 20 of 37 models
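
Each leaderboard row pairs a raw score on the 0-to-1 scale (Max Score 1) with a displayed percentage. The sketch below shows one way that mapping could work. The names `to_percent`, `count_high_performers`, and `listed_raw` are hypothetical, and the half-up rounding rule is an assumption inferred from the row where a raw 0.8945 is displayed as 89.5%. It is not the site's actual implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

MAX_SCORE = Decimal("1")          # "Max Score" under Benchmark Details
HIGH_THRESHOLD = Decimal("0.80")  # cutoff behind "High Performers (80%+)"


def to_percent(raw: str) -> str:
    """Format a raw score string (e.g. "0.8945") as a one-decimal percentage."""
    pct = Decimal(raw) / MAX_SCORE * 100
    # Assumption: half-up rounding, inferred from 0.8945 being displayed as 89.5%.
    rounded = pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return f"{rounded}%"


def count_high_performers(raw_scores: list[str]) -> int:
    """Count scores at or above the 80% cutoff."""
    return sum(1 for r in raw_scores if Decimal(r) / MAX_SCORE >= HIGH_THRESHOLD)


# Raw values copied from the leaderboard rows listed in this section.
listed_raw = [
    "0.932", "0.921", "0.904", "0.902", "0.898",
    "0.8945", "0.889", "0.887", "0.886", "0.875",
    "0.872", "0.872", "0.861", "0.849", "0.841",
]

for raw in listed_raw:
    print(raw, "->", to_percent(raw))

print("Listed rows at 80%+:", count_high_performers(listed_raw))
```

Parsing the raw values as `Decimal` strings rather than floats avoids binary rounding surprises at boundaries such as 89.45. The count printed here covers only the rows listed in this section, so it cannot reproduce the page-wide figure of 28, which is computed over all 37 models.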