Kimi K2-Instruct-0905

Author: Moonshot AI

Zero-eval

#1 Multi-Challenge

#2 AutoLogi

#2 AceBench

About

Kimi K2-Instruct-0905 is a language model developed by Moonshot AI and released on Sep 5, 2025. It averages 64.1% across 29 benchmarks, with its highest scores on MATH-500 (97.4%), MMLU-Redux (92.7%), and IFEval (89.8%). It is distributed under the MIT license, which permits commercial use.

Timeline

Announced: Sep 5, 2025

Released: Sep 5, 2025

Specifications

Training Tokens: 15.5T

License & Family

License

MIT

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

29 benchmarks

Average Score

64.1%

Best Score

97.4%

High Performers (80%+)

Top Categories

general

69.9%

agents

67.3%

math

65.0%

factuality

61.9%

reasoning

60.1%

All Benchmark Results for Kimi K2-Instruct-0905

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Score (%)	Source
MATH-500	math	text	0.97	97.4%	Self-reported
MMLU-Redux	factuality	text	0.93	92.7%	Self-reported
IFEval	code	text	0.90	89.8%	Self-reported
MMLU	general	text	0.90	89.5%	Self-reported
AutoLogi	reasoning	text	0.90	89.5%	Self-reported
ZebraLogic	reasoning	text	0.89	89.0%	Self-reported
MultiPL-E	code	text	0.86	85.7%	Self-reported
MMLU-Pro	general	text	0.81	81.1%	Self-reported
AceBench	agents	text	0.77	76.5%	Self-reported
LiveBench	general	text	0.76	76.4%	Self-reported

Showing 1 to 10 of 29 benchmarks
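
Only ten of the 29 results are listed here, so the 64.1% average cannot be reproduced from this page alone. The sketch below shows the arithmetic that connects the visible rows to the reported average. It assumes the average is a plain unweighted mean over all 29 benchmarks, which the page does not state; the names `VISIBLE` and `summarize` are illustrative only.

```python
from statistics import mean

# The ten results listed above: (benchmark, category, score in percent).
VISIBLE = [
    ("MATH-500", "math", 97.4),
    ("MMLU-Redux", "factuality", 92.7),
    ("IFEval", "code", 89.8),
    ("MMLU", "general", 89.5),
    ("AutoLogi", "reasoning", 89.5),
    ("ZebraLogic", "reasoning", 89.0),
    ("MultiPL-E", "code", 85.7),
    ("MMLU-Pro", "general", 81.1),
    ("AceBench", "agents", 76.5),
    ("LiveBench", "general", 76.4),
]

TOTAL_BENCHMARKS = 29    # from "29 benchmarks"
REPORTED_AVERAGE = 64.1  # from "Average Score"


def summarize(rows, total=TOTAL_BENCHMARKS, reported_avg=REPORTED_AVERAGE):
    """Summarize the visible rows and infer the mean of the unlisted ones.

    Assumption: the reported average is an unweighted mean over all
    `total` benchmarks. The page does not confirm this.
    """
    scores = [score for _, _, score in rows]
    hidden_count = total - len(scores)
    # Sum over all benchmarks implied by the reported average, minus the
    # visible sum, spread evenly over the unlisted benchmarks.
    implied_hidden_mean = (reported_avg * total - sum(scores)) / hidden_count
    return {
        "visible_mean": mean(scores),
        "visible_high_performers": sum(1 for s in scores if s >= 80.0),
        "hidden_count": hidden_count,
        "implied_hidden_mean": implied_hidden_mean,
    }


if __name__ == "__main__":
    for key, value in summarize(VISIBLE).items():
        print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

Under that assumption, the ten listed scores average about 86.8%, eight of them are at or above 80%, and the 19 unlisted benchmarks would need to average roughly 52.2% to produce the reported 64.1%. The last figure is only as precise as the rounded average allows.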
