MM-MT-Bench

roleplay

text

About

MM-MT-Bench benchmark

Evaluation Stats

Total Models: 3

Organizations: 2

Verified Results: 0

Self-Reported: 3

Benchmark Details

Max Score: 100

Language: en

Performance Overview

Score distribution and top performers

Score Distribution

3 models

Top Score: 74.0%

Average Score: 46.8%

High Performers (80%+): 0

Top Organizations

#1 Mistral AI: 2 models, 67.3%

#2 Alibaba: 1 model, 6.0%

Leaderboard

Top 3 models ranked by performance

1. 74.0% (Raw: 74), Self-reported

2. 60.5% (Raw: 60.5), Self-reported

3. Qwen2.5-Omni-7B: 6.0% (Raw: 6), Self-reported
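
The summary figures on this page can be re-derived from the raw leaderboard scores. The Python sketch below is a minimal consistency check, not the site's actual method. It assumes the two unnamed top entries are the two Mistral AI models (an inference from the organization counts), uses placeholder labels for them, and assumes displayed values are rounded half up to one decimal place. The names `ENTRIES`, `display`, and `summarize` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import mean

# Raw scores copied from the leaderboard. The labels for ranks 1 and 2 are
# placeholders (the source does not name them); assigning them to Mistral AI
# is an assumption based on the "2 models" organization count.
ENTRIES = [
    ("rank-1 (unnamed)", "Mistral AI", 74.0),
    ("rank-2 (unnamed)", "Mistral AI", 60.5),
    ("Qwen2.5-Omni-7B", "Alibaba", 6.0),
]


def display(value: float) -> str:
    """Format a score to one decimal place, rounding half up (assumed)."""
    quantized = Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return str(quantized)


def summarize(entries):
    """Return top score, overall mean, per-organization means, and 80%+ count."""
    scores = [score for _, _, score in entries]
    by_org = {}
    for _, org, score in entries:
        by_org.setdefault(org, []).append(score)
    return {
        "top": display(max(scores)),
        "average": display(mean(scores)),
        "org_average": {org: display(mean(vals)) for org, vals in by_org.items()},
        "high_performers": sum(score >= 80 for score in scores),
    }


if __name__ == "__main__":
    print(summarize(ENTRIES))
```

With these inputs the function reproduces the figures shown above: a top score of 74.0, an overall average of 46.8, organization averages of 67.3 and 6.0, and no model at or above 80. `Decimal` with `ROUND_HALF_UP` is used because the Mistral AI mean is exactly 67.25, which Python's built-in `round` would turn into 67.2 rather than the displayed 67.3.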