BoolQ
general
text
About
BoolQ benchmark
Evaluation Stats
Total Models: 9
Organizations: 2
Verified Results: 0
Self-Reported: 9
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 9 models
Top Score: 84.8%
Average Score: 81.0%
High Performers (80%+): 6
Top Organizations
#1 Microsoft: 3 models, 81.3%
#2 Google: 6 models, 80.8%
Leaderboard
Top 9 models ranked by performance
1. 84.8% (raw: 0.848), self-reported
2. 84.6% (raw: 0.846), self-reported
3. 84.2% (raw: 0.842), self-reported
4. 81.6% (raw: 0.816), self-reported
5. 81.6% (raw: 0.816), self-reported
6. 81.2% (raw: 0.812), self-reported
7. 78.0% (raw: 0.78), self-reported
8. 76.4% (raw: 0.764), self-reported
9. 76.4% (raw: 0.764), self-reported
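
The summary figures in the Performance Overview can be checked against the raw scores listed in the leaderboard above. The following is a minimal sketch, not the site's own method: the function name `summarize` and the constant `HIGH_THRESHOLD` are hypothetical, and it assumes that "80%+" means a raw score greater than or equal to 0.80 and that the average is a plain arithmetic mean.

```python
# Raw self-reported BoolQ scores, copied from the leaderboard entries above.
raw_scores = [0.848, 0.846, 0.842, 0.816, 0.816, 0.812, 0.78, 0.764, 0.764]

# Assumption: "High Performers (80%+)" is an inclusive cutoff at 0.80.
HIGH_THRESHOLD = 0.80


def summarize(scores, threshold=HIGH_THRESHOLD):
    """Return (count, top, mean, high_count) for raw scores in [0, 1]."""
    top = max(scores)
    mean = sum(scores) / len(scores)  # assumption: unweighted arithmetic mean
    high_count = sum(1 for s in scores if s >= threshold)
    return len(scores), top, mean, high_count


count, top, mean, high_count = summarize(raw_scores)
print(f"{count} models, top {top:.1%}, average {mean:.1%}, "
      f"{high_count} at or above {HIGH_THRESHOLD:.0%}")
```

Under these assumptions the result matches the page's figures: 9 models, a top score of 84.8%, an average of 81.0%, and 6 models at or above 80%.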