

GPT-5 Codex

#2 on SWE-Bench Verified

by OpenAI

About

GPT-5 Codex is a language model developed by OpenAI. Its single reported benchmark result is a 74.5% score on SWE-Bench Verified. Released in 2025, it represents OpenAI's latest advancement in AI technology.

Timeline
Announced: Sep 15, 2025
Released: Sep 15, 2025
Knowledge Cutoff: Sep 30, 2024
Specifications
License & Family
License: Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

1 benchmark reported
Average Score: 74.5%
Best Score: 74.5%
High Performers (80%+): 0
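
These overview figures are simple aggregates over the model's normalized benchmark scores. A minimal sketch of how they could be derived, assuming scores are already normalized to a 0-100% scale (the data structure below is illustrative, not the site's actual schema):

```python
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    name: str         # e.g. "SWE-Bench Verified"
    category: str     # e.g. "general"
    score_pct: float  # normalized score on a 0-100 scale

def summarize(results: list[BenchmarkResult]) -> dict:
    """Aggregate normalized scores into the overview stats shown above."""
    scores = [r.score_pct for r in results]
    return {
        "benchmarks": len(scores),
        "average_score": sum(scores) / len(scores),
        "best_score": max(scores),
        "high_performers_80_plus": sum(1 for s in scores if s >= 80.0),
    }

# GPT-5 Codex has a single reported result, so the average equals the best score.
print(summarize([BenchmarkResult("SWE-Bench Verified", "general", 74.5)]))
# {'benchmarks': 1, 'average_score': 74.5, 'best_score': 74.5, 'high_performers_80_plus': 0}
```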

Top Categories

general: 74.5%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

SWE-Bench Verified

Rank #2 of 33
#1 GPT-5: 74.9%
#2 GPT-5 Codex: 74.5%
#3 Claude Opus 4.1: 74.5%
#4 Claude Sonnet 4: 72.7%
#5 Claude Opus 4: 72.5%
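
The rank shown is simply the model's position in a score-sorted list for that benchmark. A minimal sketch, assuming tied models keep their listing order (the tie-breaking rule between GPT-5 Codex and Claude Opus 4.1, both at 74.5%, is not stated on this page):

```python
def rank_leaderboard(entries: list[tuple[str, float]]) -> list[tuple[int, str, float]]:
    """Sort (model, score) pairs by score descending and assign 1-based ranks.
    Python's sort is stable, so tied models keep their input order."""
    ordered = sorted(entries, key=lambda e: e[1], reverse=True)
    return [(i + 1, name, score) for i, (name, score) in enumerate(ordered)]

top5 = rank_leaderboard([
    ("GPT-5", 74.9),
    ("GPT-5 Codex", 74.5),
    ("Claude Opus 4.1", 74.5),
    ("Claude Sonnet 4", 72.7),
    ("Claude Opus 4", 72.5),
])
for rank, name, score in top5:
    print(f"#{rank} {name}: {score}%")
```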
All Benchmark Results for GPT-5 Codex
Complete list of benchmark scores with detailed information
SWE-Bench Verified
SWE-bench-verified is a human-validated subset of the original SWE-bench featuring 500 carefully verified samples for evaluating AI models' software engineering capabilities. This rigorous benchmark tests models' ability to generate patches that resolve real GitHub issues, focusing on bug fixing, code generation, and software development tasks with improved reliability and reduced evaluation noise.
Category: general
Modality: text
Score: 0.74 (normalized: 74.5%)
Source: Self-reported
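
For context, a SWE-bench-style evaluation of one instance amounts to: check out the repository at the issue's base commit, apply the model-generated patch, then confirm that the previously failing tests now pass while the regression tests still pass. The sketch below only illustrates that loop; it is not the official SWE-bench harness, which runs each instance inside a prepared Docker image, and the pytest invocation is an assumption that holds for most but not all SWE-bench repositories.

```python
import subprocess
from pathlib import Path

def resolves_issue(repo_dir: Path, base_commit: str, model_patch: str,
                   fail_to_pass: list[str], pass_to_pass: list[str]) -> bool:
    """Illustrative SWE-bench-style check: an instance counts as resolved if the
    patch applies, the previously failing tests pass, and regression tests still pass."""
    subprocess.run(["git", "checkout", "-f", base_commit], cwd=repo_dir, check=True)

    # An unappliable patch counts as unresolved.
    applied = subprocess.run(["git", "apply", "-"], input=model_patch,
                             text=True, cwd=repo_dir)
    if applied.returncode != 0:
        return False

    def passes(test_ids: list[str]) -> bool:
        if not test_ids:
            return True
        return subprocess.run(["python", "-m", "pytest", "-q", *test_ids],
                              cwd=repo_dir).returncode == 0

    return passes(fail_to_pass) and passes(pass_to_pass)
```

The "Verified" subset's 500 instances were human-validated so that the associated tests genuinely reflect the underlying issue, which is what the description above means by improved reliability and reduced evaluation noise.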