SWE-bench Verified (with parallel test-time compute)
Verified
Agentic coding
text
About
SWE-bench with parallel test-time compute optimization
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 100
Performance Overview
Score distribution and top performers
Score Distribution
1 model
Top Score
82.0%
Average Score
82.0%
High Performers (80%+)
1
Top Organizations
#1 Anthropic
1 model
82.0%
Leaderboard
1 model ranked by performance on SWE-bench Verified (with parallel test-time compute)
Release Date | License | Score |
---|---|---|
Oct 22, 2024 | Proprietary | 82.0% |