Kimi K2 Instruct
Zero-eval
#1 MATH-500
#1 GSM8k
#1 CBNSL
+23 more
by Moonshot AI
About
Kimi K2 Instruct is a language model developed by Moonshot AI. It achieves strong overall performance, averaging 66.7% across 38 benchmarks, and ranks first on MATH-500 (97.4%), GSM8k (97.3%), and CBNSL (95.6%). It supports a 144K-token context window for handling large documents and is currently available through one API provider. Released in 2025, it is Moonshot AI's latest model.
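Since the model is served through an API provider, a request typically follows the OpenAI-style chat-completions format. The sketch below only builds the request payload; the model identifier "kimi-k2-instruct" is an assumption for illustration, not confirmed by this page — check your provider's model list for the exact name.

```python
# Hedged sketch: construct an OpenAI-style chat-completions payload for
# Kimi K2 Instruct. The model identifier is an assumed placeholder.
import json

def build_chat_request(prompt: str, model: str = "kimi-k2-instruct") -> dict:
    """Build an OpenAI-compatible chat-completions payload (illustrative only)."""
    return {
        "model": model,  # assumed identifier; verify with your API provider
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }

payload = build_chat_request("Summarize the MATH-500 benchmark in one sentence.")
print(json.dumps(payload, indent=2))
```

The payload can then be POSTed to the provider's chat-completions endpoint with any HTTP client.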
Pricing Range
Input (per 1M tokens): $0.57
Output (per 1M tokens): $2.29
Providers: 1
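Because pricing is quoted per million tokens, the cost of a single request is straightforward to estimate. A minimal sketch, using the input and output prices listed above:

```python
# Estimate per-request cost from the listed per-1M-token prices.
INPUT_PRICE_PER_M = 0.57   # USD per 1M input tokens (from the pricing table)
OUTPUT_PRICE_PER_M = 2.29  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 2K-token completion
print(f"${request_cost(100_000, 2_000):.4f}")  # → $0.0616
```

At these rates, even a prompt near the 144K context limit costs well under a dollar per request.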
Timeline
Announced: Jan 1, 2025
Released: Jan 1, 2025
Specifications
Training Tokens: 15.5T
License & Family
License
Modified MIT License
Base Model: Kimi K2 Base
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
38 benchmarks
Average Score
66.7%
Best Score
97.4%
High Performers (80%+): 12
Performance Metrics
Max Context Window: 144.0K
Avg Throughput: 45.0 tok/s
Avg Latency: 1ms
Top Categories
reasoning
89.0%
math
86.6%
code
79.5%
roleplay
76.4%
general
62.0%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
MATH-500
Rank #1 of 22
#1 Kimi K2 Instruct
97.4%
#2 DeepSeek-R1
97.3%
#3 Llama 3.1 Nemotron Ultra 253B v1
97.0%
#4 Llama-3.3 Nemotron Super 49B v1
96.6%
GSM8k
Rank #1 of 46
#1 Kimi K2 Instruct
97.3%
#2 o1
97.1%
#3 GPT-4.5
97.0%
#4 Llama 3.1 405B Instruct
96.8%
CBNSL
Rank #1 of 1
#1 Kimi K2 Instruct
95.6%
HumanEval
Rank #3 of 62
#1 Claude 3.5 Sonnet
93.7%
#2 GPT-5
93.4%
#3 Kimi K2 Instruct
93.3%
#4 Qwen2.5-Coder 32B Instruct
92.7%
#5 o1-mini
92.4%
#6 Claude 3.5 Sonnet
92.0%
MMLU-Redux
Rank #4 of 13
#1 DeepSeek-R1-0528
93.4%
#2 Qwen3-235B-A22B-Instruct-2507
93.1%
#3 DeepSeek-R1
92.9%
#4 Kimi K2 Instruct
92.7%
#5 DeepSeek-V3
89.1%
#6 Qwen3 235B A22B
87.4%
#7 Qwen2.5 72B Instruct
86.8%
All Benchmark Results for Kimi K2 Instruct
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
MATH-500 | math | text | 0.97 | 97.4% | Self-reported
GSM8k | math | text | 0.97 | 97.3% | Self-reported
CBNSL | general | text | 0.96 | 95.6% | Self-reported
HumanEval | code | text | 0.93 | 93.3% | Self-reported
MMLU-Redux | general | text | 0.93 | 92.7% | Self-reported
IFEval | code | text | 0.90 | 89.8% | Self-reported
MMLU | general | text | 0.90 | 89.5% | Self-reported
AutoLogi | general | text | 0.90 | 89.5% | Self-reported
ZebraLogic | reasoning | text | 0.89 | 89.0% | Self-reported
MultiPL-E | general | text | 0.86 | 85.7% | Self-reported
Showing 1 to 10 of 38 benchmarks