CFEval

code

text

About

CFEval is a benchmark for evaluating code generation and problem-solving capabilities.

Evaluation Stats

Total Models: 2

Organizations: 1

Verified Results: 0

Self-Reported: 2

Benchmark Details

Max Score: 10000

Language

Performance Overview

Score distribution and top performers

2 models

Top Score

21.3%

Average Score

21.0%

High Performers (80%+)

#1 Alibaba Cloud / Qwen Team

2 models

21.0%

Leaderboard

2 models ranked by performance on CFEval

Model	Organization	Date	License	Score
Qwen3-235B-A22B-Thinking-2507	Alibaba Cloud / Qwen Team	Jul 25, 2025	Apache 2.0	21.3%
Qwen3-Next-80B-A3B-Thinking	Alibaba Cloud / Qwen Team	Jan 10, 2025	Apache 2.0	20.7%
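
The summary figures in the Performance Overview can be recomputed from the two leaderboard rows. The sketch below is illustrative only: the `summarize` helper and its field names are hypothetical and not part of CFEval or this site, and it assumes the "80%+" cutoff applies to the same percentage scores shown in the table.

```python
# Leaderboard rows copied from the table above: (model, score in percent).
leaderboard = [
    ("Qwen3-235B-A22B-Thinking-2507", 21.3),
    ("Qwen3-Next-80B-A3B-Thinking", 20.7),
]

# Assumption: the "High Performers (80%+)" cutoff is applied to these percentages.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(rows, threshold=HIGH_PERFORMER_THRESHOLD):
    """Hypothetical helper: derive the overview stats from (model, score) rows."""
    scores = [score for _, score in rows]
    return {
        "total_models": len(scores),
        "top_score": max(scores),
        # Rounded to one decimal place, matching how the page displays scores.
        "average_score": round(sum(scores) / len(scores), 1),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(leaderboard)
print(summary)
```

Under these assumptions the sketch reproduces the 21.3% top score and 21.0% average listed above, and no model reaches the 80% cutoff.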