RealWorldQA

general
text
About

RealWorldQA benchmark

Evaluation Stats
Total Models: 6
Organizations: 3
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

6 models
Top Score: 77.8%
Average Score: 69.1%
High Performers (80%+): 0

Top Organizations

#1 Alibaba: 2 models, 74.0%
#2 xAI: 1 model, 68.7%
#3 DeepSeek: 3 models, 66.0%
Leaderboard
Top 6 models ranked by performance
#1 77.8% (Raw: 0.778, Self-reported)
#2 70.3% (Raw: 0.703, Self-reported)
#3 68.7% (Raw: 0.687, Self-reported)
#4 68.4% (Raw: 0.684, Self-reported)
#5 65.4% (Raw: 0.654, Self-reported)
#6 64.2% (Raw: 0.642, Self-reported)
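
The summary figures in the Score Distribution section (top score, average score, and the count of models at 80% or above) can be recomputed from the six raw leaderboard values. The following is a minimal sketch, not the site's actual code: the function name `summarize` and the `threshold` parameter are hypothetical, and the 0.80 cutoff is taken from the "High Performers (80%+)" label. Per-organization averages are not recomputed because this page does not say which model belongs to which organization.

```python
from statistics import mean

# Raw scores exactly as listed in the leaderboard (max score is 1).
RAW_SCORES = [0.778, 0.703, 0.687, 0.684, 0.654, 0.642]


def summarize(scores, threshold=0.80):
    """Return top score, mean score, and count of scores >= threshold.

    `summarize` and `threshold` are illustrative names, not part of any
    published API for this leaderboard.
    """
    return {
        "top": max(scores),
        "average": mean(scores),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


if __name__ == "__main__":
    stats = summarize(RAW_SCORES)
    print(f"Top Score: {stats['top']:.1%}")          # Top Score: 77.8%
    print(f"Average Score: {stats['average']:.1%}")  # Average Score: 69.1%
    print(f"High Performers (80%+): {stats['high_performers']}")  # 0
```

Running the sketch gives the same three values shown in the Score Distribution section, which suggests the page computes them directly from the raw scores.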