
GPT-4 Turbo
Zero-eval
by OpenAI
About
GPT-4 Turbo is a language model developed by OpenAI. It achieves strong performance, with an average score of 78.1% across 6 benchmarks, and does particularly well on MGSM (88.5%), HumanEval (87.1%), and MMLU (86.5%). It supports a 132K-token context window for handling large documents and is available through 2 API providers. Released in April 2024, it represented OpenAI's latest GPT-4-series model at the time of release.
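As a quick check, the 78.1% figure is simply the mean of the six normalized scores listed in the results table at the bottom of this page; a minimal sketch:

```python
# The 78.1% average is the plain mean of the six normalized benchmark
# scores reported in the results table on this page.
scores = {
    "MGSM": 88.5,
    "HumanEval": 87.1,
    "MMLU": 86.5,
    "DROP": 86.0,
    "MATH": 72.6,
    "GPQA": 48.0,
}
average = sum(scores.values()) / len(scores)
print(f"Average score: {average:.1f}%")  # Average score: 78.1%
```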
Pricing Range
Input (per 1M)
$10.00 - $10.00
Output (per 1M)
$30.00 - $30.00
Providers
2
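At these rates, per-request cost is simple arithmetic. A minimal sketch; the token counts below are illustrative and not taken from this page:

```python
# Estimate the cost of a single request at the listed rates:
# $10.00 per 1M input tokens, $30.00 per 1M output tokens.
INPUT_PRICE_PER_M = 10.00
OUTPUT_PRICE_PER_M = 30.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request (token counts are illustrative)."""
    return (input_tokens * INPUT_PRICE_PER_M +
            output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 50K-token prompt with a 1K-token completion.
print(f"${request_cost(50_000, 1_000):.2f}")  # $0.53
```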
Timeline
Announced
Apr 9, 2024
Released
Apr 9, 2024
Knowledge Cutoff
Dec 31, 2023
Specifications
License & Family
License
Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
6 benchmarks
Average Score
78.1%
Best Score
88.5%
High Performers (80%+)
4
Performance Metrics
Max Context Window
132.1K
Avg Throughput
98.5 tok/s
Avg Latency
1ms
Top Categories
code
87.1%
math
80.5%
general
73.5%
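The category figures above are per-category means of the individual benchmark scores from the results table at the bottom of this page (HumanEval for code; MGSM and MATH for math; MMLU, DROP, and GPQA for general). A minimal sketch of that grouping:

```python
from collections import defaultdict

# Each benchmark with its category and normalized score, as listed
# in the results table at the bottom of this page.
benchmarks = [
    ("MGSM", "math", 88.5),
    ("HumanEval", "code", 87.1),
    ("MMLU", "general", 86.5),
    ("DROP", "general", 86.0),
    ("MATH", "math", 72.6),
    ("GPQA", "general", 48.0),
]

by_category = defaultdict(list)
for _, category, score in benchmarks:
    by_category[category].append(score)

for category, category_scores in by_category.items():
    print(f"{category}: {sum(category_scores) / len(category_scores):.1f}%")
# math: 80.5%, code: 87.1%, general: 73.5%
```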
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
MGSM
Rank #11 of 31
#8 o1
89.3%
#9 GPT-4o
90.5%
#10 Llama 4 Scout
90.6%
#11 GPT-4 Turbo
88.5%
#12 Gemini 1.5 Pro
87.5%
#13 GPT-4o mini
87.0%
#14 Llama 3.2 90B Instruct
86.9%
HumanEval
Rank #26 of 62
#23 GPT-4o mini
87.2%
#24 Gemma 3 27B
87.8%
#25 GPT-4.5
88.0%
#26 GPT-4 Turbo
87.1%
#27 Qwen2.5 72B Instruct
86.6%
#28 Qwen2 72B Instruct
86.0%
#29 Grok-2 mini
85.7%
MMLU
Rank #20 of 78
#17 Claude 3 Opus
86.8%
#18 o3-mini
86.9%
#19 Llama 3.1 405B Instruct
87.3%
#20 GPT-4 Turbo
86.5%
#21 GPT-4
86.4%
#22 Grok-2 mini
86.2%
#23 Llama 3.2 90B Instruct
86.0%
DROP
Rank #5 of 28
#2 Claude 3.5 Sonnet
87.1%
#3 Claude 3.5 Sonnet
87.1%
#4 DeepSeek-V3
91.6%
#5 GPT-4 Turbo
86.0%
#6 Nova Pro
85.4%
#7 Llama 3.1 405B Instruct
84.8%
#8 GPT-4o
83.4%
MATH
Rank #27 of 63
#24 Grok-2 mini
73.0%
#25 Nova Lite
73.3%
#26 Llama 3.1 405B Instruct
73.8%
#27 GPT-4 Turbo
72.6%
#28 Qwen3 235B A22B
71.8%
#29 Qwen2.5-Omni-7B
71.5%
#30 Claude 3.5 Sonnet
71.1%
All Benchmark Results for GPT-4 Turbo
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized Score | Source
MGSM | math | text | 0.89 | 88.5% | Self-reported
HumanEval | code | text | 0.87 | 87.1% | Self-reported
MMLU | general | text | 0.86 | 86.5% | Self-reported
DROP | general | text | 0.86 | 86.0% | Self-reported
MATH | math | text | 0.73 | 72.6% | Self-reported
GPQA | general | text | 0.48 | 48.0% | Self-reported
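If you want to reuse these results programmatically, the pipe-delimited rows above parse cleanly into records. A minimal sketch, assuming the column layout shown in the table; the field names are this sketch's own labels, not an official schema:

```python
# Parse the pipe-delimited benchmark rows into dictionaries.
rows = """
MGSM | math | text | 0.89 | 88.5% | Self-reported
HumanEval | code | text | 0.87 | 87.1% | Self-reported
MMLU | general | text | 0.86 | 86.5% | Self-reported
DROP | general | text | 0.86 | 86.0% | Self-reported
MATH | math | text | 0.73 | 72.6% | Self-reported
GPQA | general | text | 0.48 | 48.0% | Self-reported
""".strip().splitlines()

results = []
for row in rows:
    name, category, modality, raw, normalized, source = (
        field.strip() for field in row.split("|")
    )
    results.append({
        "benchmark": name,
        "category": category,
        "modality": modality,
        "raw_score": float(raw),
        "normalized_pct": float(normalized.rstrip("%")),
        "source": source,
    })

print(results[0])
# {'benchmark': 'MGSM', 'category': 'math', 'modality': 'text',
#  'raw_score': 0.89, 'normalized_pct': 88.5, 'source': 'Self-reported'}
```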
Resources