LiveBench
roleplay
text
About
LiveBench benchmark
Evaluation Stats
Total Models: 12
Organizations: 4
Verified Results: 0
Self-Reported: 12
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
12 models
Top Score: 84.6%
Average Score: 62.1%
High Performers (80%+): 1
Top Organizations
#1 Moonshot AI (1 model): 76.4%
#2 OpenAI (3 models): 68.0%
#3 Alibaba (7 models): 59.6%
#4 Microsoft (1 model): 47.6%
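The organization figures can be checked against the overall Average Score. The sketch below assumes each organization's percentage is the unweighted mean over that organization's models, which the page does not state explicitly. Under that assumption, weighting each figure by its model count should approximately reproduce the 62.1% average. The names `ORGS` and `weighted_average` are illustrative only.

```python
# (organization, model count, organization score in percent), as listed on this page.
ORGS = [
    ("Moonshot AI", 1, 76.4),
    ("OpenAI", 3, 68.0),
    ("Alibaba", 7, 59.6),
    ("Microsoft", 1, 47.6),
]


def weighted_average(orgs):
    """Average the per-organization scores, weighted by model count.

    Assumption: each organization score is the mean over its models.
    """
    total_models = sum(count for _, count, _ in orgs)
    total_score = sum(count * score for _, count, score in orgs)
    return total_score / total_models


total_models = sum(count for _, count, _ in ORGS)
overall = round(weighted_average(ORGS), 1)
print(total_models, overall)
```

Because the published organization figures are themselves rounded to one decimal place, agreement with the overall average can only be expected to within rounding.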
Leaderboard
Top 12 models ranked by performance
77.1% (Raw: 0.771, Self-reported)
76.4% (Raw: 0.764, Self-reported)
74.3% (Raw: 0.743, Self-reported)
8
52.3% (Raw: 0.523, Self-reported)
52.3% (Raw: 0.523, Self-reported)
35.9% (Raw: 0.359, Self-reported)
29.6% (Raw: 0.296, Self-reported)
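The percentages in the leaderboard appear to be the raw scores rescaled by the benchmark's Max Score of 1. A minimal sketch of that conversion follows, assuming one-decimal rounding. The helper `to_percent` is hypothetical and not part of any LiveBench tooling.

```python
MAX_SCORE = 1.0  # "Max Score" from Benchmark Details


def to_percent(raw, max_score=MAX_SCORE):
    """Rescale a raw score to a percentage rounded to one decimal place."""
    if not 0 <= raw <= max_score:
        raise ValueError(f"raw score {raw} is outside [0, {max_score}]")
    return round(raw / max_score * 100, 1)


# Raw values of the leaderboard rows visible on this page.
RAW_SCORES = [0.771, 0.764, 0.743, 0.523, 0.523, 0.359, 0.296]
percents = [to_percent(raw) for raw in RAW_SCORES]
print(percents)
```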