
GPT-5
by OpenAI
Multimodal | Zero-eval
#1 SWE-Lancer (IC-Diamond subset) | #1 COLLIE | #1 Tau2 telecom | +30 more
About
GPT-5 is a multimodal language model developed by OpenAI. It achieves an average score of 70.1% across 35 benchmarks, and it excels in particular on SWE-Lancer (IC-Diamond subset) (100.0%), COLLIE (99.0%), and Tau2 telecom (96.7%). It supports a 528K-token context window for handling large documents and is available through 2 API providers. As a multimodal model, it can process and understand text, images, and other input formats. Released in 2025, it represents OpenAI's latest advancement in AI technology.
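As a quick illustration of how a multimodal model like this is typically called through one of its API providers, here is a minimal sketch using the OpenAI Python SDK. The model identifier "gpt-5" and the image URL are assumptions for illustration, not values confirmed by this card; check the provider's model list before use.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Multimodal request: text plus an image URL in a single user message.
response = client.chat.completions.create(
    model="gpt-5",  # assumed model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"},  # placeholder URL
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```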
Pricing Range
Input (per 1M): $1.25 - $1.25
Output (per 1M): $10.00 - $10.00
Providers: 2
Timeline
Announced: Aug 7, 2025
Released: Aug 7, 2025
Knowledge Cutoff: Sep 30, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance (35 benchmarks)
Average Score: 70.1%
Best Score: 100.0%
High Performers (80%+): 19
Performance Metrics
Max Context Window: 528.0K
Avg Throughput: 100.0 tok/s
Avg Latency: 2 ms
Top Categories
code: 93.4%
long_context: 90.2%
vision: 79.9%
general: 74.3%
math: 55.5%
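The category figures above are, presumably, per-category means of the normalized benchmark scores listed later on this page. A minimal sketch of that aggregation, using a hand-picked subset of rows from the results table below (so the printed means will not match the full 35-benchmark figures):

```python
from collections import defaultdict

# Illustrative (benchmark, category, normalized score) triples taken from
# the results table below; the full card aggregates 35 benchmarks this way.
results = [
    ("HumanEval", "code", 93.4),
    ("OpenAI-MRCR: 2 needle 128k", "long_context", 95.2),
    ("BrowseComp Long Context 128k", "long_context", 90.0),
    ("MMLU", "general", 92.5),
    ("AIME 2025", "general", 94.6),
]

by_category = defaultdict(list)
for name, category, score in results:
    by_category[category].append(score)

for category, scores in by_category.items():
    print(f"{category}: {sum(scores) / len(scores):.1f}%")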
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
SWE-Lancer (IC-Diamond subset) (Rank #1 of 4)
#1 GPT-5: 100.0%
#2 GPT-4.5: 17.4%
#3 GPT-4o: 12.4%
#4 o3-mini: 7.4%

COLLIE (Rank #1 of 7)
#1 GPT-5: 99.0%
#2 o3-mini: 98.7%
#3 GPT-4.5: 72.3%
#4 GPT-4.1: 65.8%

Tau2 telecom (Rank #1 of 3)
#1 GPT-5: 96.7%
#2 Kimi K2 Instruct: 65.8%
#3 o3: 58.2%

OpenAI-MRCR: 2 needle 128k (Rank #1 of 5)
#1 GPT-5: 95.2%
#2 GPT-4.1: 57.2%
#3 GPT-4.1 mini: 47.2%
#4 GPT-4.1 nano: 36.6%

AIME 2025 (Rank #2 of 36)
#1 Grok-4 Heavy: 100.0%
#2 GPT-5: 94.6%
#3 Grok-3: 93.3%
#4 o4-mini: 92.7%
#5 Grok-4: 91.7%
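The per-benchmark ranks above presumably follow from sorting each benchmark's reported scores in descending order. A minimal sketch using the Tau2 telecom figures from this section:

```python
# Given per-model scores on one benchmark, ranks come from a descending sort.
scores = {"GPT-5": 96.7, "Kimi K2 Instruct": 65.8, "o3": 58.2}

ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for rank, (model, score) in enumerate(ranked, start=1):
    print(f"#{rank} {model}: {score}%")
```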
All Benchmark Results for GPT-5
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
SWE-Lancer (IC-Diamond subset) | general | text | 1.00 | 100.0% | Self-reported
COLLIE | general | text | 0.99 | 99.0% | Self-reported
Tau2 telecom | general | text | 0.97 | 96.7% | Self-reported
OpenAI-MRCR: 2 needle 128k | long_context | text | 0.95 | 95.2% | Self-reported
AIME 2025 | general | text | 0.95 | 94.6% | Self-reported
HumanEval | code | text | 0.93 | 93.4% | Self-reported
HMMT 2025 | general | text | 0.93 | 93.3% | Self-reported
MMLU | general | text | 0.93 | 92.5% | Self-reported
BrowseComp Long Context 128k | long_context | text | 0.90 | 90.0% | Self-reported
BrowseComp Long Context 256k | long_context | text | 0.89 | 88.8% | Self-reported
Showing 1 to 10 of 35 benchmarks
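The Normalized column is simply the raw 0-1 score scaled to a percentage (raw values in the table are rounded to two decimals, so the percentage column carries slightly more precision), and the card's 70.1% headline Average Score is the mean over all 35 normalized scores. A minimal sketch over just the ten rows shown, whose mean is naturally much higher than the full-card average:

```python
# Raw scores are on a 0-1 scale; the "Normalized" column is raw * 100,
# presented as a percentage.
raw_scores = [1.00, 0.99, 0.97, 0.95, 0.95, 0.93, 0.93, 0.93, 0.90, 0.89]
normalized = [r * 100 for r in raw_scores]

# The card's headline Average Score (70.1%) is this mean taken over all
# 35 benchmarks; over only the top-10 rows shown it is well above that.
print(f"Mean of shown rows: {sum(normalized) / len(normalized):.1f}%")
```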