GPT-4.1 mini

Name: GPT-4.1 mini
Price: 0.40 USD per 1M input tokens
Average benchmark score: 48.4% (29 benchmarks)
Author: OpenAI

Multimodal

Zero-eval

#2 CharXiv-D

#2 Graphwalks BFS >128k

#2 Graphwalks parents >128k

About

GPT-4.1 mini is a multimodal language model developed by OpenAI and released on April 14, 2025. It has reported results on 29 benchmarks, with its strongest scores on CharXiv-D (88.4%), MMLU (87.5%), and IFEval (84.1%). Its 1.1M-token context window accommodates long documents and extended multi-turn conversations. As a multimodal model, it accepts image input in addition to text. The model is available through 2 API providers.

Pricing Range

Input (per 1M tokens): $0.40 - $0.40

Output (per 1M tokens): $1.60 - $1.60

Providers: 2
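
As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` and the example token counts are hypothetical; only the two prices come from the listing, and actual provider billing may differ.

```python
# Per-million-token prices taken from the listing (USD).
INPUT_PRICE_PER_M = 0.40
OUTPUT_PRICE_PER_M = 1.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 200k-token prompt with a 5k-token reply (made-up sizes).
cost = estimate_cost(input_tokens=200_000, output_tokens=5_000)
print(f"${cost:.4f}")
```

Because output tokens cost four times as much as input tokens here, long generations affect the bill more than long prompts of the same size.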

Timeline

Announced: Apr 14, 2025

Released: Apr 14, 2025

Knowledge Cutoff: May 31, 2024

Specifications

Capabilities

Multimodal

License & Family

License: Proprietary

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

Benchmarks: 29

Average Score: 48.4%

Best Score: 88.4%

High Performers (80%+)

Performance Metrics

Max Context Window: 1.1M tokens

Avg Throughput: 150.0 tok/s

Avg Latency: 5 ms
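
The throughput and latency figures can be combined into a back-of-the-envelope response-time estimate. The sketch below assumes the listed average latency is a fixed delay before the first token and that tokens then arrive at the average throughput; the listing does not state how either figure was measured, and the function name is hypothetical.

```python
# Figures taken from the listing; how they were measured is not stated.
AVG_THROUGHPUT_TOK_PER_S = 150.0
AVG_LATENCY_S = 0.005  # 5 ms


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time for a reply (hypothetical helper).

    Assumes a fixed startup delay followed by steady token generation.
    """
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


# Example: a 600-token reply (made-up size).
print(round(estimate_response_seconds(600), 3))
```

Under these assumptions generation time dominates: the startup delay is negligible next to the seconds spent producing a few hundred tokens.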

Top Categories

code: 84.1%

math: 73.1%

vision: 72.7%

general: 45.9%

agents: 45.9%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

CharXiv-D

Rank #2 of 5

#1 GPT-4.5: 90.0%
#2 GPT-4.1 mini: 88.4%
#3 GPT-4.1: 87.9%
#4 GPT-4o: 85.3%
#5 GPT-4.1 nano: 73.9%

MMLU

Rank #14 of 78

#11 Kimi K2 Base: 87.8%
#12 Qwen3 235B A22B: 87.8%
#13 DeepSeek-V3: 88.5%
#14 GPT-4.1 mini: 87.5%
#15 Grok-2: 87.5%
#16 Kimi-k1.5: 87.4%
#17 Llama 3.1 405B Instruct: 87.3%

IFEval

Rank #21 of 37

#18 Qwen2.5 72B Instruct: 84.1%
#19 Phi 4 Reasoning Plus: 84.9%
#20 DeepSeek-V3: 86.1%
#21 GPT-4.1 mini: 84.1%
#22 QwQ-32B: 83.9%
#23 Phi 4 Reasoning: 83.4%
#24 DeepSeek-R1: 83.3%

MMMLU

Rank #10 of 13

#7 GPT-4o: 81.4%
#8 GPT-4.5: 85.1%
#9 Claude 3.7 Sonnet: 86.1%
#10 GPT-4.1 mini: 78.5%
#11 Phi-3.5-MoE-instruct: 69.9%
#12 GPT-4.1 nano: 66.9%
#13 Phi-3.5-mini-instruct: 55.4%

MathVista

Rank #5 of 35

#2 Llama 4 Maverick: 73.7%
#3 Kimi-k1.5: 74.9%
#4 o4-mini: 84.3%
#5 GPT-4.1 mini: 73.1%
#6 GPT-4.5: 72.3%
#7 GPT-4.1: 72.2%
#8 o1: 71.8%

All Benchmark Results for GPT-4.1 mini

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
CharXiv-D	general	text	0.88	88.4%	Self-reported
MMLU	general	text	0.88	87.5%	Self-reported
IFEval	code	text	0.84	84.1%	Self-reported
MMMLU	general	text	0.79	78.5%	Self-reported
MathVista	math	text	0.73	73.1%	Self-reported
MMMU	vision	multimodal	0.73	72.7%	Self-reported
Multi-IF	general	text	0.67	67.0%	Self-reported
GPQA	general	text	0.65	65.0%	Self-reported
Graphwalks parents <128k	general	text	0.60	60.5%	Self-reported
CharXiv-R	general	text	0.57	56.8%	Self-reported

Showing 1 to 10 of 29 benchmarks
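
To show how the listed rows relate to the summary figures, the sketch below averages the ten visible scores overall and per category. It covers only the rows shown here, so its overall mean is not expected to match the 48.4% average reported across all 29 benchmarks. The function name and the data layout are illustrative choices, not part of the listing.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, normalized score in %) for the ten rows shown.
rows = [
    ("CharXiv-D", "general", 88.4),
    ("MMLU", "general", 87.5),
    ("IFEval", "code", 84.1),
    ("MMMLU", "general", 78.5),
    ("MathVista", "math", 73.1),
    ("MMMU", "vision", 72.7),
    ("Multi-IF", "general", 67.0),
    ("GPQA", "general", 65.0),
    ("Graphwalks parents <128k", "general", 60.5),
    ("CharXiv-R", "general", 56.8),
]


def mean_by_category(rows):
    """Group scores by category and return each category's mean (1 decimal)."""
    buckets = defaultdict(list)
    for _name, category, pct in rows:
        buckets[category].append(pct)
    return {category: round(mean(values), 1) for category, values in buckets.items()}


by_category = mean_by_category(rows)
overall = round(mean(pct for _name, _category, pct in rows), 1)

print(by_category)
print(overall)
```

The code, math, and vision categories each have a single visible row, and their means equal the values listed under Top Categories. The general category differs, presumably because most of its benchmarks fall among the 19 rows not shown.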
