Winogrande
reasoning
text
About
Winogrande benchmark
Evaluation Stats
Total Models: 19
Organizations: 8
Verified Results: 0
Self-Reported: 19
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
19 models
Top Score: 87.5%
Average Score: 77.0%
High Performers (80%+): 9
Top Organizations
#1 OpenAI: 1 model, 87.5%
#2 Cohere: 1 model, 85.4%
#3 NVIDIA: 1 model, 84.5%
#4 Alibaba: 4 models, 80.2%
#5 Mistral AI: 2 models, 76.0%
Leaderboard
Top 19 models ranked by performance
#2: 85.4% (Raw: 0.854), Self-reported
#3: 85.1% (Raw: 0.851), Self-reported
#4: 84.5% (Raw: 0.8453), Self-reported
#5: 83.7% (Raw: 0.837), Self-reported
#6: 82.0% (Raw: 0.82), Self-reported
#7: 81.3% (Raw: 0.813), Self-reported
#8: 80.8% (Raw: 0.808), Self-reported
#9: 80.6% (Raw: 0.806), Self-reported
#10: 76.8% (Raw: 0.768), Self-reported
#11: 75.3% (Raw: 0.753), Self-reported
#12: 74.4% (Raw: 0.744), Self-reported
#13: 72.9% (Raw: 0.729), Self-reported
#14: 71.7% (Raw: 0.717), Self-reported
#15: 71.7% (Raw: 0.717), Self-reported
#16: 68.5% (Raw: 0.685), Self-reported
#17: 67.0% (Raw: 0.67), Self-reported
#18: 66.8% (Raw: 0.668), Self-reported
#19: 66.8% (Raw: 0.668), Self-reported
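The summary figures in the Performance Overview can be cross-checked against the raw scores listed above. The following is a minimal sketch, not the site's own method: the names `RAW_SCORES`, `to_percent`, and `summarize` are hypothetical, and the top entry's raw score is assumed to be 0.875, taken from the reported Top Score of 87.5% because that entry does not appear in the listing.

```python
# Raw scores as fractions of the max score of 1, in leaderboard order.
# Assumption: the first value (0.875) is inferred from "Top Score: 87.5%";
# the remaining 18 values are the "Raw:" figures from the leaderboard.
RAW_SCORES = [
    0.875, 0.854, 0.851, 0.8453, 0.837, 0.82, 0.813, 0.808, 0.806, 0.768,
    0.753, 0.744, 0.729, 0.717, 0.717, 0.685, 0.67, 0.668, 0.668,
]


def to_percent(raw: float) -> float:
    """Convert a raw score in [0, 1] to a percentage rounded to one decimal."""
    return round(raw * 100, 1)


def summarize(raw_scores, threshold=0.80):
    """Recompute the overview statistics from a list of raw scores."""
    return {
        "total_models": len(raw_scores),
        "top_score": to_percent(max(raw_scores)),
        "average_score": to_percent(sum(raw_scores) / len(raw_scores)),
        # Count of models at or above the high-performer threshold (80%+).
        "high_performers": sum(1 for s in raw_scores if s >= threshold),
    }


if __name__ == "__main__":
    print(summarize(RAW_SCORES))
```

Under that assumption, the recomputed values agree with the reported ones: 19 models, a top score of 87.5%, an average of 77.0%, and 9 models at 80% or above.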