GPT-3.5 Turbo
by OpenAI
About
GPT-3.5 Turbo is a language model developed by OpenAI. It shows competitive results across 8 benchmarks, with notable strengths in DROP (70.2%), MMLU (69.8%), and HumanEval (68.0%). The model is available through 2 API providers.
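One of those providers is OpenAI's own API. As a minimal sketch (assuming the official openai Python SDK v1.x, an OPENAI_API_KEY set in the environment, and a placeholder prompt), a chat completion call against this model looks like:

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Placeholder prompt; any chat-formatted request works the same way.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the DROP benchmark in one sentence."}],
)
print(response.choices[0].message.content)
```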
Pricing Range
Input (per 1M tokens): $0.50
Output (per 1M tokens): $1.50
Providers: 2
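At these rates, the cost of a request is straightforward arithmetic: tokens divided by 1,000,000, times the per-million price. A small sketch using the listed prices (the token counts are invented for illustration):

```python
INPUT_USD_PER_M = 0.50   # listed input price per 1M tokens
OUTPUT_USD_PER_M = 1.50  # listed output price per 1M tokens

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request at the listed per-million rates."""
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_USD_PER_M

# Example: a 100,000-token prompt with a 20,000-token completion
print(f"${request_cost_usd(100_000, 20_000):.2f}")  # $0.08
```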
Timeline
Announced: Mar 21, 2023
Released: Mar 21, 2023
Knowledge Cutoff: Sep 30, 2021
Specifications
License & Family
License: Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance (8 benchmarks)
Average Score: 42.3%
Best Score: 70.2%
High Performers (80%+): 0

Performance Metrics
Max Context Window: 20.5K
Avg Throughput: 95.0 tok/s
Avg Latency: 1ms

Top Categories
code: 68.0%
general: 56.9%
math: 33.1%
vision: 0.0%
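These aggregates follow directly from the eight normalized scores listed under "All Benchmark Results" at the bottom of this page. A quick check in Python (scores transcribed from that table):

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, normalized score %) from "All Benchmark Results" below
SCORES = [
    ("DROP", "general", 70.2), ("MMLU", "general", 69.8),
    ("HumanEval", "code", 68.0), ("MGSM", "math", 56.3),
    ("MATH", "math", 43.1), ("GPQA", "general", 30.8),
    ("MMMU", "vision", 0.0), ("MathVista", "math", 0.0),
]

print(f"Average Score: {mean(s for *_, s in SCORES):.1f}%")  # 42.3%
print(f"Best Score: {max(s for *_, s in SCORES):.1f}%")      # 70.2%

by_category = defaultdict(list)
for _, category, score in SCORES:
    by_category[category].append(score)
for category, scores in sorted(by_category.items(), key=lambda kv: -mean(kv[1])):
    print(f"{category}: {mean(scores):.1f}%")  # code 68.0, general 56.9, math 33.1, vision 0.0
```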
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DROP
Rank #20 of 28
#17 Claude 3 Haiku: 78.4%
#18 Phi 4: 75.5%
#19 Gemini 1.5 Pro: 74.9%
#20 GPT-3.5 Turbo: 70.2%
#21 Gemma 3n E4B: 60.8%
#22 Gemma 3n E4B Instructed LiteRT Preview: 60.8%
#23 Llama 3.1 8B Instruct: 59.5%
MMLU
Rank #61 of 78
#58 Gemini 1.0 Pro: 71.8%
#59 Gemma 2 9B: 71.3%
#60 Qwen2 7B Instruct: 70.5%
#61 GPT-3.5 Turbo: 69.8%
#62 Jamba 1.5 Mini: 69.7%
#63 Llama 3.1 8B Instruct: 69.4%
#64 Pixtral-12B: 69.2%
HumanEval
Rank #54 of 62
#51 Pixtral-12B: 72.0%
#52 Gemma 3 4B: 71.3%
#53 Phi-3.5-MoE-instruct: 70.7%
#54 GPT-3.5 Turbo: 68.0%
#55 GPT-4: 67.0%
#56 Gemma 3n E2B Instructed LiteRT (Preview): 66.5%
#57 Gemma 3n E2B Instructed: 66.5%
MGSM
Rank #28 of 31
#25 Gemma 3n E4B Instructed LiteRT Preview: 60.7%
#26 Phi-3.5-MoE-instruct: 58.7%
#27 Llama 3.2 3B Instruct: 58.2%
#28 GPT-3.5 Turbo: 56.3%
#29 Gemma 3n E2B Instructed LiteRT (Preview): 53.1%
#30 Gemma 3n E2B Instructed: 53.1%
#31 Phi-3.5-mini-instruct: 47.9%
MATH
Rank #57 of 63
#54 Gemma 3 1B: 48.0%
#55 Qwen2.5-Coder 7B Instruct: 46.6%
#56 Mistral Small 3 24B Base: 46.0%
#57 GPT-3.5 Turbo: 43.1%
#58 Claude 3 Sonnet: 43.1%
#59 Gemma 2 27B: 42.3%
#60 GPT-4: 42.0%
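The rank shown for each benchmark is standard competition ranking over the full leaderboard (which has more entries than the excerpts above). A sketch of the idea in Python, using only the seven DROP scores excerpted above as stand-in data:

```python
def competition_rank(scores: dict[str, float], model: str) -> int:
    # Competition ranking: 1 + the number of models with a strictly higher score.
    return 1 + sum(1 for s in scores.values() if s > scores[model])

# Stand-in data: the DROP excerpt above. The full DROP leaderboard has 28
# entries, which is why GPT-3.5 Turbo ranks #20 there rather than #4 here.
drop_excerpt = {
    "Claude 3 Haiku": 78.4, "Phi 4": 75.5, "Gemini 1.5 Pro": 74.9,
    "GPT-3.5 Turbo": 70.2, "Gemma 3n E4B": 60.8,
    "Gemma 3n E4B Instructed LiteRT Preview": 60.8,
    "Llama 3.1 8B Instruct": 59.5,
}
print(competition_rank(drop_excerpt, "GPT-3.5 Turbo"))  # 4 within this sample
```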
All Benchmark Results for GPT-3.5 Turbo
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Status
DROP | general | text | 0.70 | 70.2% | Unverified
MMLU | general | text | 0.70 | 69.8% | Unverified
HumanEval | code | text | 0.68 | 68.0% | Unverified
MGSM | math | text | 0.56 | 56.3% | Unverified
MATH | math | text | 0.43 | 43.1% | Unverified
GPQA | general | text | 0.31 | 30.8% | Unverified
MMMU | vision | multimodal | 0.00 | 0.0% | Unverified
MathVista | math | text | 0.00 | 0.0% | Unverified