TruthfulQA
Tags: factuality, text
About
TruthfulQA benchmark
Evaluation Stats
Total Models: 16
Organizations: 7
Verified Results: 0
Self-Reported: 16
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 16 models
Top Score: 77.5%
Average Score: 58.7%
High Performers (80%+): 0
Top Organizations
#1 Microsoft: 3 models, 69.3%
#2 IBM: 3 models, 59.0%
#3 NVIDIA: 1 model, 58.6%
#4 Cohere: 1 model, 56.3%
#5 AI21 Labs: 2 models, 56.2%
Leaderboard
Top 16 models ranked by performance
1. 77.5% (raw: 0.775), self-reported
2. 66.9% (raw: 0.6686), self-reported
3. 66.4% (raw: 0.664), self-reported
4. 64.0% (raw: 0.64), self-reported
5. 58.6% (raw: 0.5863), self-reported
6. 58.4% (raw: 0.584), self-reported
7. 58.3% (raw: 0.583), self-reported
8. 58.1% (raw: 0.581), self-reported
9. 57.8% (raw: 0.578), self-reported
10. 56.3% (raw: 0.563), self-reported
11. 54.8% (raw: 0.548), self-reported
12. 54.2% (raw: 0.542), self-reported
13. 54.1% (raw: 0.541), self-reported
14. 52.1% (raw: 0.5215), self-reported
15. 50.6% (raw: 0.506), self-reported
16. 50.3% (raw: 0.503), self-reported
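
The summary figures in the Performance Overview (model count, top score, average score, high performers) can be rechecked from the raw leaderboard scores. The following is a minimal sketch, not the site's actual computation: the function name `summarize` and the inclusive 0.80 cutoff for "80%+" are assumptions made for illustration.

```python
from statistics import mean

# Raw scores copied from the leaderboard entries, in rank order.
raw_scores = [
    0.775, 0.6686, 0.664, 0.64, 0.5863, 0.584, 0.583, 0.581,
    0.578, 0.563, 0.548, 0.542, 0.541, 0.5215, 0.506, 0.503,
]


def summarize(scores, high_threshold=0.80):
    """Summarize raw scores on a 0-1 scale (hypothetical helper).

    high_threshold is assumed to be inclusive, matching the "80%+" label.
    """
    return {
        "total_models": len(scores),
        "top_score_pct": round(max(scores) * 100, 1),
        "average_score_pct": round(mean(scores) * 100, 1),
        "high_performers": sum(1 for s in scores if s >= high_threshold),
    }


summary = summarize(raw_scores)
print(summary)
```

Under these assumptions the result agrees with the figures reported on the page: 16 models, a top score of 77.5%, an average of 58.7%, and no model at or above 80%.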