All Benchmarks
Explore all 336 benchmarks for evaluating language models across different capabilities and domains.
Total Benchmarks: 336
Verified: 0
Categories: 10
With Sub-benchmarks: 1
Properties | Links | ||||||
---|---|---|---|---|---|---|---|
general | text en | 115 | 13 | 88.4% | 54.6% | ||
general | text en | 78 | 15 | 92.5% | 79.1% | ||
math | text en | 63 | 11 | 97.9% | 66.7% | ||
code | text en | 62 | 12 | 93.7% | 80.4% | ||
general | text en | 60 | 11 | 85.0% | 63.3% | ||
vision | multimodal en | 52 | 11 | 84.2% | 64.1% | ||
math | text en | 46 | 15 | 97.3% | 87.8% | ||
code | text en | 44 | 8 | 80.4% | 44.8% | ||
general | text en | 41 | 10 | 95.8% | 72.8% | ||
code | text en | 37 | 12 | 93.9% | 83.2% |
Showing 1 to 10 of 336 benchmarks