
Qwen3 32B
Zero-eval
#1 CodeForces
#1 MultiLF
#2 Arena Hard
by Alibaba
About
Qwen3 32B is a language model developed by Alibaba. It achieves strong performance, with an average score of 75.3% across 9 benchmarks, and excels particularly in CodeForces (95.2%), Arena Hard (93.8%), and AIME 2024 (81.4%). The model is most specialized in code tasks, with an average performance of 80.4%. It supports a 256K-token context window for handling large documents and is available through 3 API providers. It is licensed under Apache 2.0, permitting commercial use and making it suitable for enterprise applications. Released in 2025, it represents Alibaba's latest advancement in AI technology.
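Since the model is served through hosted APIs rather than only as downloadable weights, a minimal access sketch may help. This assumes an OpenAI-compatible chat endpoint; the base URL, API key, and model identifier are placeholders, not values listed on this page, so substitute whatever your chosen provider documents.

```python
# Minimal sketch: querying Qwen3 32B through an OpenAI-compatible endpoint.
# base_url, api_key, and the model id are placeholders (assumptions), not
# values taken from this page -- use the ones your provider documents.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR_PROVIDER/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",               # placeholder credential
)

response = client.chat.completions.create(
    model="qwen3-32b",  # provider-specific model id (assumed)
    messages=[
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```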
Pricing Range
Input (per 1M tokens): $0.10 - $0.40
Output (per 1M tokens): $0.30 - $0.80
Providers: 3
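Per-request cost follows directly from the per-1M-token pricing above. The sketch below is illustrative arithmetic only; the example token counts are hypothetical, and actual billing depends on which of the three providers you use.

```python
# Rough cost estimate from the per-1M-token pricing range shown above.
INPUT_PRICE_RANGE = (0.10, 0.40)    # USD per 1M input tokens
OUTPUT_PRICE_RANGE = (0.30, 0.80)   # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cheapest, most expensive) USD cost across the provider range."""
    low = input_tokens / 1e6 * INPUT_PRICE_RANGE[0] + output_tokens / 1e6 * OUTPUT_PRICE_RANGE[0]
    high = input_tokens / 1e6 * INPUT_PRICE_RANGE[1] + output_tokens / 1e6 * OUTPUT_PRICE_RANGE[1]
    return low, high

# Example: a 20K-token prompt with a 2K-token completion (hypothetical sizes).
low, high = request_cost(20_000, 2_000)
print(f"${low:.4f} - ${high:.4f} per request")
```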
Timeline
Announced: Apr 29, 2025
Released: Apr 29, 2025
Specifications
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
9 benchmarks
Average Score
75.3%
Best Score
95.2%
High Performers (80%+)
3
Performance Metrics
Max Context Window
256.0K
Avg Throughput
129.0 tok/s
Avg Latency
1 ms
Top Categories
code
80.4%
roleplay
74.9%
general
73.6%
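The Avg Throughput (129.0 tok/s) and Avg Latency (1 ms) figures under Performance Metrics above allow a back-of-envelope response-time estimate. This sketch assumes latency approximates time-to-first-token and throughput a steady decode rate; both are averages across providers, so real timings will vary.

```python
# Back-of-envelope response time from the averages above:
# 129.0 tok/s throughput, 1 ms latency (assumed to be time-to-first-token).
AVG_THROUGHPUT_TOK_S = 129.0
AVG_LATENCY_S = 0.001

def estimated_response_time(output_tokens: int) -> float:
    """Seconds to stream output_tokens at the average decode rate."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_S

for n in (100, 500, 2000):  # hypothetical completion lengths
    print(f"{n:5d} tokens -> ~{estimated_response_time(n):.1f} s")
```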
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
CodeForces
Rank #1 of 1
#1 Qwen3 32B
95.2%
Arena Hard
Rank #2 of 22
#1 Qwen3 235B A22B
95.6%
#2 Qwen3 32B
93.8%
#3 DeepSeek-R1
92.3%
#4 Qwen3 30B A3B
91.0%
#5 Llama-3.3 Nemotron Super 49B v1
88.3%
AIME 2024
Rank #15 of 41
#12 DeepSeek R1 Distill Qwen 32B
83.3%
#13 DeepSeek R1 Distill Qwen 7B
83.3%
#14 Qwen3 235B A22B
85.7%
#15 Qwen3 32B
81.4%
#16 Phi 4 Reasoning Plus
81.3%
#17 Granite 3.3 8B Instruct
81.2%
#18 Granite 3.3 8B Base
81.2%
LiveBench
Rank #4 of 12
#1 Kimi K2 Instruct
76.4%
#2 Qwen3 235B A22B
77.1%
#3 o3-mini
84.6%
#4 Qwen3 32B
74.9%
#5 Qwen3 30B A3B
74.3%
#6 QwQ-32B
73.1%
#7 o1
67.0%
MultiLF
Rank #1 of 2
#1 Qwen3 32B
73.0%
#2 Qwen3 235B A22B
71.9%
All Benchmark Results for Qwen3 32B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized Score | Source
CodeForces | code | text | 1977.00 | 95.2% | Self-reported
Arena Hard | general | text | 0.94 | 93.8% | Self-reported
AIME 2024 | general | text | 0.81 | 81.4% | Self-reported
LiveBench | roleplay | text | 0.75 | 74.9% | Self-reported
MultiLF | general | text | 0.73 | 73.0% | Self-reported
AIME 2025 | general | text | 0.73 | 72.9% | Self-reported
BFCL | general | text | 0.70 | 70.3% | Self-reported
LiveCodeBench | code | text | 0.66 | 65.7% | Self-reported
Aider | general | text | 0.50 | 50.2% | Self-reported
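The overview figures (75.3% average, 3 high performers at 80%+, and the category averages of 80.4% code, 74.9% roleplay, and 73.6% general) can be reproduced from the normalized scores in this table. The sketch below recomputes them, using the category assignments from the Category column.

```python
# Recompute the overview statistics from the normalized scores in the table above.
from collections import defaultdict

scores = {
    "CodeForces": ("code", 95.2),
    "Arena Hard": ("general", 93.8),
    "AIME 2024": ("general", 81.4),
    "LiveBench": ("roleplay", 74.9),
    "MultiLF": ("general", 73.0),
    "AIME 2025": ("general", 72.9),
    "BFCL": ("general", 70.3),
    "LiveCodeBench": ("code", 65.7),
    "Aider": ("general", 50.2),
}

overall = sum(s for _, s in scores.values()) / len(scores)
print(f"Average score: {overall:.1f}%")  # 75.3%

high_performers = [name for name, (_, s) in scores.items() if s >= 80.0]
print(f"High performers (80%+): {len(high_performers)}")  # 3

by_category = defaultdict(list)
for _, (category, s) in scores.items():
    by_category[category].append(s)
for category, vals in by_category.items():
    print(f"{category}: {sum(vals) / len(vals):.1f}%")  # code 80.4%, general 73.6%, roleplay 74.9%
```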