
Grok-3 Mini
Multimodal
Zero-eval
#1 AIME 2024
#1 LiveCodeBench
by xAI
About
Grok-3 Mini is a multimodal language model developed by xAI. It posts an average score of 87.8% across 4 benchmarks, performing especially well on AIME 2024 (95.8%), AIME 2025 (90.8%), and GPQA (84.0%). Its strongest category is general tasks, with an average of 90.2%. The model supports a 136K-token context window for handling large documents and is available through one API provider. As a multimodal model, it can process text, images, and other input formats. Released in 2025, it represents xAI's latest advancement in AI technology.
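If the listed provider exposes an OpenAI-compatible chat completions endpoint, a request could look like the sketch below; the base URL, environment variable name, and model id "grok-3-mini" are assumptions, not details taken from this page.

# Minimal sketch of calling Grok-3 Mini over an OpenAI-compatible API.
# The endpoint URL, env var name, and model id are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],   # hypothetical environment variable
    base_url="https://api.x.ai/v1",      # assumed provider endpoint
)

response = client.chat.completions.create(
    model="grok-3-mini",                 # assumed model identifier
    messages=[{"role": "user", "content": "Solve: what is 17 * 24?"}],
)
print(response.choices[0].message.content)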
Pricing Range
Input (per 1M tokens): $0.30
Output (per 1M tokens): $0.50
Providers: 1
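At these rates, per-request cost is simply token counts multiplied by the per-million prices. A minimal sketch, using hypothetical token counts:

# Cost estimate from the listed rates: $0.30 per 1M input tokens,
# $0.50 per 1M output tokens. The token counts below are made-up examples.
INPUT_PRICE_PER_M = 0.30
OUTPUT_PRICE_PER_M = 0.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-1M-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 2K-token completion.
print(f"${request_cost(100_000, 2_000):.4f}")  # $0.0310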
Timeline
Announced: Feb 17, 2025
Released: Feb 17, 2025
Knowledge Cutoff: Nov 17, 2024
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
4 benchmarks
Average Score
87.8%
Best Score
95.8%
High Performers (80%+)
4
Performance Metrics
Max Context Window
136.0K
Avg Throughput
100.0 tok/s
Avg Latency
1ms
Top Categories
general
90.2%
code
80.4%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
AIME 2024
Rank #1 of 41
#1 Grok-3 Mini
95.8%
#2 o4-mini
93.4%
#3 Grok-3
93.3%
#4 Gemini 2.5 Pro
92.0%
AIME 2025
Rank #7 of 36
#4 o4-mini
92.7%
#5 Grok-4
91.7%
#6 GPT-5 mini
91.1%
#7 Grok-3 Mini
90.8%
#8 Gemini 2.5 Pro Preview 06-05
88.0%
#9 DeepSeek-R1-0528
87.5%
#10 o3
86.4%
GPQA
Rank #7 of 115
#4 GPT-5
85.7%
#5 Claude 3.7 Sonnet
84.8%
#6 Grok-3
84.6%
#7 Grok-3 Mini
84.0%
#8 o3
83.3%
#9 Gemini 2.5 Pro
83.0%
#10 Gemini 2.5 Flash
82.8%
LiveCodeBench
Rank #1 of 44
#1 Grok-3 Mini
80.4%
#2 Grok-4 Heavy
79.4%
#3 Grok-3
79.4%
#4 Grok-4
79.0%
All Benchmark Results for Grok-3 Mini
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Normalized Score | Score | Source
AIME 2024 | general | text | 0.96 | 95.8% | Self-reported
AIME 2025 | general | text | 0.91 | 90.8% | Self-reported
GPQA | general | text | 0.84 | 84.0% | Self-reported
LiveCodeBench | code | text | 0.80 | 80.4% | Self-reported
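The summary figures above (average 87.8%, best 95.8%, general 90.2%, code 80.4%) follow directly from these four scores; a short sketch of that arithmetic:

# Recompute the summary statistics shown above from the per-benchmark scores.
scores = {
    "AIME 2024":     ("general", 95.8),
    "AIME 2025":     ("general", 90.8),
    "GPQA":          ("general", 84.0),
    "LiveCodeBench": ("code",    80.4),
}

values = [score for _, score in scores.values()]
print(f"Average Score: {sum(values) / len(values):.1f}%")  # 87.8%
print(f"Best Score: {max(values):.1f}%")                   # 95.8%

for category in ("general", "code"):
    cat_scores = [s for c, s in scores.values() if c == category]
    print(f"{category}: {sum(cat_scores) / len(cat_scores):.1f}%")  # general 90.2%, code 80.4%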
Resources