MBPP

code
text
About

MBPP benchmark

Evaluation Stats
Total Models: 31
Organizations: 6
Verified Results: 0
Self-Reported: 31
Benchmark Details
Max Score: 100
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

31 models
Top Score: 91.3%
Average Score: 73.0%
High Performers (80%+): 11

Top Organizations

#1 NVIDIA: 2 models, 87.9%
#2 Alibaba: 11 models, 81.2%
#3 Microsoft: 2 models, 75.2%
#4 Mistral AI: 3 models, 74.2%
#5 Meta: 2 models, 72.7%
Leaderboard
Top 20 models ranked by performance
91.3% (raw: 91.3, self-reported)
90.2% (raw: 90.2, self-reported)
88.2% (raw: 88.2, self-reported)
84.6% (raw: 84.6, self-reported)
84.0% (raw: 84, self-reported)
84.0% (raw: 84, self-reported)
83.5% (raw: 83.5, self-reported)
82.0% (raw: 82, self-reported)
81.4% (raw: 81.4, self-reported)
80.8% (raw: 80.8, self-reported)
80.2% (raw: 80.2, self-reported)
79.2% (raw: 79.2, self-reported)
78.2% (raw: 78.2, self-reported)
77.6% (raw: 77.6, self-reported)
76.0% (raw: 76, self-reported)
74.7% (raw: 74.71, self-reported)
74.4% (raw: 74.4, self-reported)
73.2% (raw: 73.2, self-reported)
73.0% (raw: 73, self-reported)
69.6% (raw: 69.64, self-reported)
Showing top 20 of 31 models
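
The summary figures can be cross-checked against the 20 listed scores. The sketch below is illustrative only: `summarize` and `display` are hypothetical helper names, and it assumes that the "High Performers (80%+)" count uses a greater-than-or-equal cutoff at 80 and that displayed percentages are raw scores rounded to one decimal place. The 73.0% average covers all 31 models, so it cannot be reproduced from the 20 visible entries and is not computed here.

```python
# Raw scores copied from the 20 leaderboard entries listed above.
visible_scores = [
    91.3, 90.2, 88.2, 84.6, 84.0, 84.0, 83.5, 82.0, 81.4, 80.8,
    80.2, 79.2, 78.2, 77.6, 76.0, 74.71, 74.4, 73.2, 73.0, 69.64,
]

# Assumed cutoff for "High Performers (80%+)": score >= 80.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(scores, threshold=HIGH_PERFORMER_THRESHOLD):
    """Return the count, top score, and number of scores at or above threshold."""
    return {
        "count": len(scores),
        "top_score": max(scores),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


def display(raw):
    """Format a raw score as a one-decimal percentage (assumed display rule)."""
    return f"{raw:.1f}%"


summary = summarize(visible_scores)
print(summary)
# Floating-point noise in a raw value disappears once it is rounded for display.
print(display(80.80000000000001))
```

Under these assumptions, the top score and the high-performer count from the visible entries agree with the Performance Overview figures (91.3% and 11), which also implies that none of the 11 models not shown reaches 80%.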