MMT-Bench
roleplay
text
About
MMT-Bench benchmark
Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
4 models
Top Score
63.6%
Average Score
60.8%
High Performers (80%+)
0
Top Organizations
#1 Alibaba
1 model
63.6%
#2 DeepSeek
3 models
59.9%
Leaderboard
Top 4 models ranked by performance
#1 63.6% (Raw: 0.636), self-reported
#2 63.6% (Raw: 0.636), self-reported
#3 62.9% (Raw: 0.629), self-reported
#4 53.2% (Raw: 0.532), self-reported
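
The summary figures in the Performance Overview can be reproduced from the four raw leaderboard scores. The following is a minimal Python sketch, not the site's actual computation: the variable names are illustrative, and the split of scores between organizations is an assumption, since the leaderboard rows do not name the models. It assumes one 63.6% entry belongs to Alibaba and the remaining three to DeepSeek, which is the only split consistent with the listed organization averages.

```python
from statistics import mean

# Raw scores copied from the four leaderboard rows (maximum score is 1).
raw_scores = [0.636, 0.636, 0.629, 0.532]


def as_percent(score: float) -> float:
    """Convert a raw 0-1 score to a percentage rounded to one decimal."""
    return round(score * 100, 1)


top_score = as_percent(max(raw_scores))
average_score = as_percent(mean(raw_scores))

# "High performers" are entries scoring at least 80% of the maximum score.
high_performers = sum(1 for s in raw_scores if s >= 0.80)

# ASSUMPTION: the page does not say which rows belong to which organization.
# This split (one 0.636 row for Alibaba, the other three for DeepSeek) is a
# guess chosen because it matches the listed organization averages.
assumed_deepseek_scores = [0.636, 0.629, 0.532]
assumed_deepseek_average = as_percent(mean(assumed_deepseek_scores))

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
print(f"DeepSeek average (assumed split): {assumed_deepseek_average}%")
```

Under these assumptions the computed values agree with the page: a top score of 63.6%, an average of 60.8%, no entries at or above 80%, and a DeepSeek average of 59.9%.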