MM-MT-Bench
roleplay
text
About
MM-MT-Bench benchmark
Evaluation Stats
Total Models: 3
Organizations: 2
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 100
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 3 models
Top Score: 74.0%
Average Score: 46.8%
High Performers (80%+): 0
Top Organizations
#1 Mistral AI: 2 models, 67.3%
#2 Alibaba: 1 model, 6.0%
Leaderboard
Top 3 models ranked by performance
#1 74.0% (Raw: 74, Self-reported)
#2 60.5% (Raw: 60.5, Self-reported)
#3 6.0% (Raw: 6, Self-reported)
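
The summary figures above can be reproduced from the three raw leaderboard scores. The following is a minimal sketch, not the site's actual code: the helper names `round_half_up` and `summarize` are hypothetical, the assignment of the 74 and 60.5 scores to Mistral AI is inferred from that organization's 67.3% figure, and half-up rounding is an assumption chosen because it matches the displayed values.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import mean

# Raw self-reported scores from the leaderboard. Model names are not listed
# on the page; the organization for each score is inferred (assumption).
scores = [
    ("Mistral AI", 74.0),
    ("Mistral AI", 60.5),
    ("Alibaba", 6.0),
]


def round_half_up(value, places="0.1"):
    """Round to one decimal place, with ties going up (67.25 -> 67.3)."""
    quantized = Decimal(str(value)).quantize(Decimal(places), rounding=ROUND_HALF_UP)
    return float(quantized)


def summarize(rows, high_threshold=80.0):
    """Compute the aggregate stats shown in the performance overview."""
    by_org = {}
    for org, score in rows:
        by_org.setdefault(org, []).append(score)
    all_scores = [score for _, score in rows]
    return {
        "top": max(all_scores),
        "average": round_half_up(mean(all_scores)),
        "high_performers": sum(1 for s in all_scores if s >= high_threshold),
        "org_average": {org: round_half_up(mean(v)) for org, v in by_org.items()},
    }


summary = summarize(scores)
print(summary)
```

Python's built-in `round` uses round-half-to-even, which would turn the Mistral AI mean of 67.25 into 67.2 rather than the 67.3 shown on the page, so the sketch rounds through `decimal` instead.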