DROP

general
text
About

DROP benchmark

Evaluation Stats
Total Models: 28
Organizations: 8
Verified Results: 0
Self-Reported: 27
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

28 models
Top Score: 92.2%
Average Score: 74.0%
High Performers (80%+): 12

Top Organizations

#1 DeepSeek: 2 models, 91.9%
#2 Anthropic: 6 models, 82.9%
#3 Amazon: 3 models, 81.6%
#4 OpenAI: 5 models, 80.0%
#5 Microsoft: 1 model, 75.5%
Leaderboard
Top 20 models ranked by performance
92.2% (Raw: 0.922), Self-reported
91.6% (Raw: 0.916), Self-reported
87.1% (Raw: 0.871), Self-reported
87.1% (Raw: 0.871), Self-reported
86.0% (Raw: 0.86), Self-reported
85.4% (Raw: 0.854), Self-reported
84.8% (Raw: 0.848), Self-reported
83.4% (Raw: 0.834), Self-reported
83.1% (Raw: 0.831), Self-reported
83.1% (Raw: 0.831), Self-reported
80.9% (Raw: 0.809), Self-reported
80.2% (Raw: 0.802), Self-reported
79.7% (Raw: 0.797), Self-reported
79.6% (Raw: 0.796), Self-reported
79.3% (Raw: 0.793), Self-reported
78.9% (Raw: 0.789), Self-reported
78.4% (Raw: 0.784), Self-reported
75.5% (Raw: 0.755), Self-reported
74.9% (Raw: 0.749), Self-reported
70.2% (Raw: 0.702)
Showing top 20 of 28 models
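
The percentages in the leaderboard appear to be the raw scores divided by the maximum score of 1 and shown to one decimal place. The sketch below works under that assumption. It rebuilds the displayed percentages from the 20 listed raw values and counts the entries at or above 80%. The names `RAW_SCORES`, `to_percent`, and `count_high_performers` are hypothetical and introduced only for illustration, and treating the 80% cutoff as inclusive is also an assumption.

```python
# Raw scores copied from the 20 leaderboard entries listed above.
# The remaining 8 of the 28 models are not shown on the page.
RAW_SCORES = [
    0.922, 0.916, 0.871, 0.871, 0.86, 0.854, 0.848, 0.834, 0.831, 0.831,
    0.809, 0.802, 0.797, 0.796, 0.793, 0.789, 0.784, 0.755, 0.749, 0.702,
]

MAX_SCORE = 1  # "Max Score" from Benchmark Details


def to_percent(raw, max_score=MAX_SCORE):
    """Format a raw score as a percentage with one decimal place (assumed display rule)."""
    return f"{100 * raw / max_score:.1f}%"


def count_high_performers(raw_scores, threshold=0.80):
    """Count scores at or above the threshold (inclusive cutoff is an assumption)."""
    return sum(1 for score in raw_scores if score >= threshold)


if __name__ == "__main__":
    for raw in RAW_SCORES:
        print(to_percent(raw), f"(Raw: {raw})")
    print("High performers (80%+):", count_high_performers(RAW_SCORES))
```

The count over the listed entries agrees with the "High Performers (80%+)" figure of 12. The 74.0% average cannot be reproduced this way, because it covers all 28 models and only 20 are listed.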