IFEval

code
text
About

IFEval benchmark

Evaluation Stats
Total Models: 37
Organizations: 12
Verified Results: 0
Self-Reported: 37
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

37 models
Top Score: 93.9%
Average Score: 83.2%
High Performers (80%+): 28

Top Organizations

#1 Anthropic: 1 model, 93.2%
#2 Amazon: 3 models, 89.7%
#3 Moonshot AI: 2 models, 88.5%
#4 Google: 4 models, 87.4%
#5 Meta: 5 models, 85.2%
Leaderboard
Top 20 models ranked by performance
93.9% (raw: 0.939), self-reported
93.2% (raw: 0.932), self-reported
92.1% (raw: 0.921), self-reported
92.1% (raw: 0.921), self-reported
90.4% (raw: 0.904), self-reported
90.2% (raw: 0.902), self-reported
89.8% (raw: 0.898), self-reported
89.7% (raw: 0.897), self-reported
89.5% (raw: 0.8945), self-reported
88.9% (raw: 0.889), self-reported
88.7% (raw: 0.887), self-reported
88.6% (raw: 0.886), self-reported
88.2% (raw: 0.882), self-reported
87.5% (raw: 0.875), self-reported
87.4% (raw: 0.874), self-reported
87.2% (raw: 0.872), self-reported
87.2% (raw: 0.872), self-reported
86.1% (raw: 0.861), self-reported
84.9% (raw: 0.849), self-reported
84.1% (raw: 0.841), self-reported
Showing top 20 of 37 models
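
The leaderboard pairs each raw score (on a scale whose maximum is 1) with a displayed percentage. The sketch below shows one way that conversion could work. The function name to_percent and the round-half-up rule are assumptions made for illustration; the page does not state how it rounds, although half-up rounding is consistent with the raw value 0.8945 being shown as 89.5%.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_percent(raw: str, max_score: str = "1") -> str:
    """Convert a raw score to a one-decimal percentage string.

    Hypothetical helper: assumes round-half-up, which the page does not confirm.
    Decimal is used so that values such as 0.8945 are not distorted by
    binary floating-point representation before rounding.
    """
    fraction = Decimal(raw) / Decimal(max_score)
    percent = (fraction * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return f"{percent}%"


if __name__ == "__main__":
    # Raw values copied from the leaderboard above.
    for raw in ["0.939", "0.8945", "0.841"]:
        print(raw, "->", to_percent(raw))
```

Raw scores are passed as strings so that Decimal parses them exactly. Passing a float such as 0.8945 would carry its binary approximation into the rounding step.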