
GPT-4o mini

Multimodal
Zero-eval

by OpenAI

About

GPT-4o mini is a multimodal language model developed by OpenAI. It achieves strong performance, with an average score of 63.5% across 9 benchmarks, and does particularly well on HumanEval (87.2%), MGSM (87.0%), and MMLU (82.0%). It supports a 128K-token context window (with up to 16.4K output tokens) for handling large documents, and is available through one API provider. As a multimodal model, it can process both text and image inputs. Released in July 2024, it is OpenAI's smaller, lower-cost counterpart to GPT-4o.
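As a quick illustration of the multimodal input format, the sketch below builds a Chat Completions request body that mixes text and an image, following OpenAI's documented content-part shape. The prompt text and image URL are placeholder values, and sending the request (with an SDK or HTTP client) is left out.

```python
# Sketch: a multimodal Chat Completions request body for gpt-4o-mini.
# The prompt and image URL are placeholders, not values from this page.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            "content": [
                # A text part and an image part in a single user message.
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.png"},
                },
            ],
        }
    ],
}
```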

Pricing Range
Input (per 1M tokens): $0.15
Output (per 1M tokens): $0.60
Providers: 1
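At these rates, pricing a request is simple arithmetic; the helper below is a small sketch (the function name and example token counts are illustrative, not from this page).

```python
# Listed rates for GPT-4o mini, in USD per 1M tokens.
INPUT_PRICE_PER_1M = 0.15
OUTPUT_PRICE_PER_1M = 0.60

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M

# e.g. 10K input tokens and 1K output tokens:
cost = request_cost(10_000, 1_000)  # 0.0015 + 0.0006 = 0.0021 USD
```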
Timeline
Announced: Jul 18, 2024
Released: Jul 18, 2024
Knowledge Cutoff: Oct 1, 2023
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

9 benchmarks
Average Score
63.5%
Best Score
87.2%
High Performers (80%+)
3
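The headline figures above can be reproduced from the nine self-reported scores in the "All Benchmark Results" section:

```python
# The nine self-reported benchmark scores from this page (%).
scores = {
    "HumanEval": 87.2, "MGSM": 87.0, "MMLU": 82.0,
    "DROP": 79.7, "MATH": 70.2, "MMMU": 59.4,
    "MathVista": 56.7, "GPQA": 40.2, "SWE-Bench Verified": 8.7,
}
average = sum(scores.values()) / len(scores)             # 63.455... -> 63.5%
best = max(scores.values())                              # 87.2%
high_performers = sum(s >= 80 for s in scores.values())  # 3 benchmarks at 80%+
```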

Performance Metrics

Max Context Window
144.4K (128K input + 16.4K max output)
Avg Throughput
92.0 tok/s
Avg Latency
1ms
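Throughput and latency combine into a rough response-time estimate; the formula below is a back-of-the-envelope sketch using the listed averages, not a provider guarantee.

```python
AVG_THROUGHPUT = 92.0  # tokens per second, as listed
AVG_LATENCY_S = 0.001  # 1 ms, as listed

def estimated_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: initial latency plus generation time."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT

# e.g. a 1,000-token response takes roughly 10.9 seconds
```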

Top Categories

code
87.2%
math
71.3%
vision
59.4%
general
52.7%
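Each category score is the plain mean of that category's benchmark results, which can be checked against the per-benchmark scores on this page (the grouping below follows the categories listed under "All Benchmark Results"):

```python
# Benchmark scores (%) grouped by the categories used on this page.
by_category = {
    "code":    [87.2],                   # HumanEval
    "math":    [87.0, 70.2, 56.7],       # MGSM, MATH, MathVista
    "vision":  [59.4],                   # MMMU
    "general": [82.0, 79.7, 40.2, 8.7],  # MMLU, DROP, GPQA, SWE-Bench Verified
}
averages = {cat: sum(v) / len(v) for cat, v in by_category.items()}
# math averages 71.3; general averages 52.65, displayed as 52.7 on this page
```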
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

HumanEval

Rank #25 of 62
#22 Claude 3.5 Haiku
88.1%
#23 GPT-4.5
88.0%
#24 Gemma 3 27B
87.8%
#25 GPT-4o mini
87.2%
#26 GPT-4 Turbo
87.1%
#27 Qwen2.5 72B Instruct
86.6%
#28 Qwen2 72B Instruct
86.0%

MGSM

Rank #13 of 31
#10 o1
89.3%
#11 GPT-4 Turbo
88.5%
#12 Gemini 1.5 Pro
87.5%
#13 GPT-4o mini
87.0%
#14 Llama 3.2 90B Instruct
86.9%
#15 Claude 3.5 Haiku
85.6%
#16 Qwen3 235B A22B
83.5%

MMLU

Rank #35 of 78
#32 Llama 3.1 70B Instruct
83.6%
#33 Qwen2.5 32B Instruct
83.3%
#34 Qwen2 72B Instruct
82.3%
#35 GPT-4o mini
82.0%
#36 Grok-1.5
81.3%
#37 Jamba 1.5 Large
81.2%
#38 Mistral Small 3.1 24B Base
81.0%

DROP

Rank #13 of 28
#10 Claude 3 Opus
83.1%
#11 GPT-4
80.9%
#12 Nova Lite
80.2%
#13 GPT-4o mini
79.7%
#14 Llama 3.1 70B Instruct
79.6%
#15 Nova Micro
79.3%
#16 Claude 3 Sonnet
78.9%

MATH

Rank #32 of 63
#29 Qwen2.5-Omni-7B
71.5%
#30 Claude 3.5 Sonnet
71.1%
#31 Mistral Small 3 24B Instruct
70.6%
#32 GPT-4o mini
70.2%
#33 Kimi K2 Base
70.2%
#34 Mistral Small 3.2 24B Instruct
69.4%
#35 Claude 3.5 Haiku
69.4%
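A rank like "#25 of 62" is easier to compare across benchmarks as a percentile. The conversion below is a sketch using one common convention (share of ranked models placed below this one); the page itself does not define a convention.

```python
def percentile(rank: int, total: int) -> float:
    """Percent of ranked models placed below this rank (higher is better)."""
    return (total - rank) / total * 100

# (rank, total) pairs from the ranking section above.
ranks = {
    "HumanEval": (25, 62),
    "MGSM": (13, 31),
    "MMLU": (35, 78),
    "DROP": (13, 28),
    "MATH": (32, 63),
}
percentiles = {b: round(percentile(r, t), 1) for b, (r, t) in ranks.items()}
# e.g. HumanEval: 59.7 (above roughly 60% of ranked models)
```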
All Benchmark Results for GPT-4o mini
Complete list of benchmark scores with detailed information
Benchmark            Category  Modality    Raw score  Normalized  Source
HumanEval            code      text        0.87       87.2%       Self-reported
MGSM                 math      text        0.87       87.0%       Self-reported
MMLU                 general   text        0.82       82.0%       Self-reported
DROP                 general   text        0.80       79.7%       Self-reported
MATH                 math      text        0.70       70.2%       Self-reported
MMMU                 vision    multimodal  0.59       59.4%       Self-reported
MathVista            math      text        0.57       56.7%       Self-reported
GPQA                 general   text        0.40       40.2%       Self-reported
SWE-Bench Verified   general   text        0.09       8.7%        Self-reported