
Kimi K2-Instruct-0905
Zero-eval
Rankings: #1 Multi-Challenge, #2 AutoLogi, #2 AceBench, and 7 more
by Moonshot AI
About
Kimi K2-Instruct-0905 is a language model developed by Moonshot AI. It averages 64.1% across 29 benchmarks, with its highest scores on MATH-500 (97.4%), MMLU-Redux (92.7%), and IFEval (89.8%). It is licensed for commercial use, which makes it an option for enterprise applications. It was released in 2025.
Timeline
Announced: Sep 5, 2025
Released: Sep 5, 2025
Specifications
Training Tokens: 15.5T
License & Family
License: MIT
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
Benchmarks: 29
Average Score: 64.1%
Best Score: 97.4%
High Performers (80%+): 8
Top Categories
general: 69.9%
agents: 67.3%
math: 65.0%
factuality: 61.9%
reasoning: 60.1%
All Benchmark Results for Kimi K2-Instruct-0905
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Score (0-1) | Score (%) | Source |
MATH-500 | math | text | 0.97 | 97.4% | Self-reported |
MMLU-Redux | factuality | text | 0.93 | 92.7% | Self-reported |
IFEval | code | text | 0.90 | 89.8% | Self-reported |
MMLU | general | text | 0.90 | 89.5% | Self-reported |
AutoLogi | reasoning | text | 0.90 | 89.5% | Self-reported |
ZebraLogic | reasoning | text | 0.89 | 89.0% | Self-reported |
MultiPL-E | code | text | 0.86 | 85.7% | Self-reported |
MMLU-Pro | general | text | 0.81 | 81.1% | Self-reported |
AceBench | agents | text | 0.77 | 76.5% | Self-reported |
LiveBench | general | text | 0.76 | 76.4% | Self-reported |
Showing 1 to 10 of 29 benchmarks
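The summary figures above can be partly cross-checked against the ten rows shown here. The following is a minimal sketch, not the site's own method: it assumes the column order benchmark, category, modality, normalized score, percentage, source, and the names `ROWS` and `parse_rows` are hypothetical. Because only 10 of the 29 benchmarks are listed, the mean it computes covers the listed rows only and is not expected to match the 64.1% overall average.

```python
# The ten rows listed on this page, copied verbatim (pipe-delimited).
ROWS = """\
MATH-500 | math | text | 0.97 | 97.4% | Self-reported |
MMLU-Redux | factuality | text | 0.93 | 92.7% | Self-reported |
IFEval | code | text | 0.90 | 89.8% | Self-reported |
MMLU | general | text | 0.90 | 89.5% | Self-reported |
AutoLogi | reasoning | text | 0.90 | 89.5% | Self-reported |
ZebraLogic | reasoning | text | 0.89 | 89.0% | Self-reported |
MultiPL-E | code | text | 0.86 | 85.7% | Self-reported |
MMLU-Pro | general | text | 0.81 | 81.1% | Self-reported |
AceBench | agents | text | 0.77 | 76.5% | Self-reported |
LiveBench | general | text | 0.76 | 76.4% | Self-reported |
"""


def parse_rows(text):
    """Parse pipe-delimited rows into dicts (assumed column order)."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # The trailing pipe leaves an empty last cell, so take the first six.
        name, category, _modality, _normalized, percent, _source = cells[:6]
        rows.append({
            "benchmark": name,
            "category": category,
            "score": float(percent.rstrip("%")),
        })
    return rows


rows = parse_rows(ROWS)

# Benchmarks at or above the page's "High Performers (80%+)" threshold.
high_performers = [r["benchmark"] for r in rows if r["score"] >= 80.0]

# Mean over the listed rows only (10 of 29 benchmarks).
mean_listed = sum(r["score"] for r in rows) / len(rows)

print(f"rows parsed: {len(rows)}")
print(f"80%+ benchmarks: {len(high_performers)}")
print(f"mean of listed rows: {mean_listed:.2f}%")
```

Eight of the listed rows clear the 80% threshold, which is consistent with the "High Performers (80%+)" count of 8 in the overview. The category averages in the overview cannot be reproduced from these rows alone, since most benchmarks in each category are not shown.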