GPT OSS 120B

Text-only
Zero-eval

by OpenAI

About

GPT OSS 120B is an open-weight language model developed by OpenAI and released under the Apache 2.0 license. It achieves an average score of 63.1% across the 2 benchmarks tracked here, with its stronger result on GPQA (71.5%); its MMLU score (54.8%) ranks near the bottom of the field. It supports a 161K token context window for handling large documents and is available through 1 API provider. It is a text-only model: it accepts and produces text, not images or audio. Its permissive license makes it suitable for commercial and enterprise applications. Released in 2025, it is OpenAI's first open-weight model release since GPT-2.
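Since the model is exposed through an OpenAI-compatible API, a chat request is an ordinary JSON payload. The sketch below only builds the payload; the model identifier "gpt-oss-120b" and the endpoint shape are assumptions to verify against your provider's documentation.

```python
import json

def build_chat_request(prompt: str, model: str = "gpt-oss-120b",
                       max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style /v1/chat/completions payload.

    NOTE: the default model name "gpt-oss-120b" is an assumption;
    check your provider's model list for the exact identifier.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize this contract in three bullets.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's chat-completions endpoint with an API key in the Authorization header.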

Pricing Range
Input (per 1M tokens): $0.15
Output (per 1M tokens): $0.60
Providers: 1
Timeline
Announced: Aug 5, 2025
Released: Aug 5, 2025
Specifications
Capabilities
Text
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

2 benchmarks
Average Score
63.1%
Best Score
71.5%
High Performers (80%+)
0

Performance Metrics

Max Context Window
161.0K
Avg Throughput
500.0 tok/s
Avg Latency
1ms
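The throughput figure translates directly into an expected generation time. A rough sketch, assuming throughput stays flat at the listed average and that latency is a fixed per-request overhead:

```python
AVG_THROUGHPUT_TOK_S = 500.0  # average throughput from the metrics above
AVG_LATENCY_S = 0.001         # 1 ms average latency, as listed

def estimated_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: fixed latency plus tokens / throughput."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_S

# e.g. generating a 1,000-token response:
print(round(estimated_seconds(1_000), 3))  # → 2.001
```

Real-world numbers vary with provider load, prompt length, and batching, so treat this as a back-of-the-envelope estimate.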

Top Categories

general
63.1%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

GPQA

Rank #25 of 115
#22 o1-preview: 73.3%
#23 DeepSeek R1 Zero: 73.3%
#24 Gemini 2.0 Flash Thinking: 74.2%
#25 GPT OSS 120B: 71.5%
#26 DeepSeek-R1: 71.5%
#27 GPT-5 nano: 71.2%
#28 Magistral Medium: 70.8%

MMLU

Rank #78 of 78
#75 Gemma 3n E2B Instructed: 60.1%
#76 Gemma 3n E2B Instructed LiteRT (Preview): 60.1%
#77 IBM Granite 4.0 Tiny Preview: 60.4%
#78 GPT OSS 120B: 54.8%
All Benchmark Results for GPT OSS 120B
Complete list of benchmark scores with detailed information
GPQA (general, text): raw score 0.71, normalized 71.5%, self-reported
MMLU (general, text): raw score 0.55, normalized 54.8%, self-reported
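The 63.1% overall figure at the top of the page is just the mean of the two normalized scores above. As a quick check:

```python
# Normalized benchmark scores from the results table above
scores = {"GPQA": 71.5, "MMLU": 54.8}

average = sum(scores.values()) / len(scores)
best = max(scores.values())

print(f"average: {average:.1f}%")  # → average: 63.1%
print(f"best: {best:.1f}%")        # → best: 71.5%
```

With only two benchmarks, a single weak score drags the average heavily, which is why the low MMLU result pulls the overall figure well below the GPQA score.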