MultiPL-E HumanEval

code
text
About

MultiPL-E HumanEval benchmark

Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 3
Top Score: 75.2%
Average Score: 63.8%
High Performers (80%+): 0

Top Organizations

#1 Meta (3 models, 63.8%)
Leaderboard
Top 3 models ranked by performance
1. 75.2% (raw: 0.752), self-reported
2. 65.5% (raw: 0.655), self-reported
3. 50.8% (raw: 0.508), self-reported
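
The summary figures in the overview can be reproduced from the three raw leaderboard scores. The following is a minimal sketch, not the site's actual computation: the function name `summarize` and the `high_threshold` parameter are hypothetical, and the 0.80 cutoff is assumed from the "High Performers (80%+)" label.

```python
# Raw scores as listed in the leaderboard (fractions of the max score, 1).
raw_scores = [0.752, 0.655, 0.508]


def summarize(scores, high_threshold=0.80):
    """Return (top score, mean score, count of scores at or above the threshold).

    The 0.80 default is an assumption based on the "High Performers (80%+)" label.
    """
    top = max(scores)
    mean = sum(scores) / len(scores)
    high = sum(1 for s in scores if s >= high_threshold)
    return top, mean, high


top, mean, high = summarize(raw_scores)
print(f"Top: {top:.1%}  Average: {mean:.1%}  High performers: {high}")
# prints: Top: 75.2%  Average: 63.8%  High performers: 0
```

The mean of the three raw scores is (0.752 + 0.655 + 0.508) / 3 ≈ 0.638, which matches the reported average of 63.8%, and no score reaches the 80% threshold.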