QwQ-32B-Preview

Zero-eval

by Alibaba

About

QwQ-32B-Preview is a language model developed by Alibaba and released on Nov 28, 2024. It averages 63.9% across 4 benchmarks. Its strongest result is MATH-500 (90.6%), followed by GPQA (65.2%), with AIME 2024 and LiveCodeBench at 50.0% each. The model is available through 4 API providers. It is released under the Apache 2.0 license, which permits commercial use.

Pricing Range
Input (per 1M tokens): $0.15 - $1.20
Output (per 1M tokens): $0.20 - $1.20
Providers: 4
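As a rough illustration of how the listed price range translates into a bill, the sketch below bounds the cost of a workload using the lowest and highest listed prices. The function name `estimate_cost_range` and the example token counts are hypothetical, and the bounds assume the cheapest input and output prices could come from the same provider, which the page does not confirm.

```python
# Listed price range in USD per 1M tokens, taken from the pricing table.
INPUT_PRICE_RANGE = (0.15, 1.20)
OUTPUT_PRICE_RANGE = (0.20, 1.20)


def estimate_cost_range(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (lowest, highest) estimated cost in USD for one workload.

    Assumption: the low bound pairs the cheapest input and output prices,
    even though they may belong to different providers.
    """
    millions_in = input_tokens / 1_000_000
    millions_out = output_tokens / 1_000_000
    low = millions_in * INPUT_PRICE_RANGE[0] + millions_out * OUTPUT_PRICE_RANGE[0]
    high = millions_in * INPUT_PRICE_RANGE[1] + millions_out * OUTPUT_PRICE_RANGE[1]
    return low, high


# Hypothetical workload: 2M input tokens and 500K output tokens.
low, high = estimate_cost_range(2_000_000, 500_000)
print(f"${low:.2f} to ${high:.2f}")  # prints "$0.40 to $3.00"
```

Because this is a reasoning model that tends to produce long outputs, the output-token term is likely to dominate in practice.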
Timeline
Announced: Nov 28, 2024
Released: Nov 28, 2024
Knowledge Cutoff: Nov 28, 2024
Specifications
License & Family
License: Apache 2.0
Base Model: Qwen2.5 32B Instruct
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 4
Average Score: 63.9%
Best Score: 90.6%
High Performers (80%+): 1
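The summary figures above can be reproduced from the four benchmark scores listed on this page. The sketch below is one way to do that. It assumes a plain unweighted mean of the displayed (already rounded) percentages, which gives 63.95 and is shown on the page as 63.9%. How the page itself rounds or weights the scores is not stated.

```python
from statistics import mean

# Scores in percent, as listed for QwQ-32B-Preview on this page.
scores = {
    "MATH-500": 90.6,
    "GPQA": 65.2,
    "AIME 2024": 50.0,
    "LiveCodeBench": 50.0,
}

HIGH_PERFORMER_THRESHOLD = 80.0  # the "80%+" cutoff used in the summary

average = mean(scores.values())
best_name = max(scores, key=scores.get)
high_performers = [name for name, s in scores.items() if s >= HIGH_PERFORMER_THRESHOLD]

print(f"benchmarks: {len(scores)}")
print(f"average: {average:.2f}%")
print(f"best: {best_name} ({scores[best_name]}%)")
print(f"high performers: {len(high_performers)}")
```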

Performance Metrics

Max Context Window: 65.5K tokens
Avg Throughput: 67.3 tok/s
Avg Latency: 1 ms
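To give a feel for what these averages mean for a single request, the sketch below estimates wall-clock generation time as latency plus output tokens divided by throughput. This is a simplification under stated assumptions: the page does not define the metrics, so "latency" is treated here as time to first token and "throughput" as output tokens per second, and the function name `estimate_generation_seconds` is hypothetical.

```python
AVG_THROUGHPUT_TOK_PER_S = 67.3  # average throughput listed on this page
AVG_LATENCY_S = 0.001            # 1 ms average latency listed on this page


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate seconds to generate `output_tokens` tokens.

    Assumption: total time = latency + tokens / throughput, with both
    values held constant at the page's listed averages.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


# Hypothetical request producing 1,000 output tokens.
seconds = estimate_generation_seconds(1_000)
print(f"about {seconds:.1f} s")
```

Real provider performance varies with load, prompt length, and batching, so this should be read as an order-of-magnitude estimate only.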

Top Categories

math: 90.6%
general: 57.6%
code: 50.0%
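The category figures are consistent with a per-category mean of the benchmark scores, using the category labels from the full results list on this page (where AIME 2024 is filed under "general" rather than "math"). The sketch below shows that grouping. Treating each category score as an unweighted mean is an assumption, although it does reproduce the three displayed values.

```python
from collections import defaultdict

# (benchmark, category, score in percent), as listed on this page.
results = [
    ("MATH-500", "math", 90.6),
    ("GPQA", "general", 65.2),
    ("AIME 2024", "general", 50.0),
    ("LiveCodeBench", "code", 50.0),
]

# Collect the scores belonging to each category.
by_category = defaultdict(list)
for _benchmark, category, score in results:
    by_category[category].append(score)

# Unweighted mean per category (assumed aggregation method).
category_means = {cat: sum(vals) / len(vals) for cat, vals in by_category.items()}

# Print categories from highest to lowest mean.
for cat, value in sorted(category_means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat}: {value:.1f}%")
```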
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

MATH-500

Rank #16 of 22
#13 QwQ-32B: 90.6%
#14 DeepSeek R1 Distill Qwen 7B: 92.8%
#15 DeepSeek R1 Distill Qwen 14B: 93.9%
#16 QwQ-32B-Preview: 90.6%
#17 DeepSeek-V3: 90.2%
#18 o1-mini: 90.0%
#19 DeepSeek R1 Distill Llama 8B: 89.1%

GPQA

Rank #40 of 115
#37 Phi 4 Reasoning: 65.8%
#38 Qwen3 30B A3B: 65.8%
#39 GPT-4.1: 66.3%
#40 QwQ-32B-Preview: 65.2%
#41 DeepSeek R1 Distill Llama 70B: 65.2%
#42 QwQ-32B: 65.2%
#43 GPT-4.1 mini: 65.0%

AIME 2024

Rank #34 of 41
#31 DeepSeek R1 Distill Qwen 1.5B: 52.7%
#32 DeepSeek-V3 0324: 59.4%
#33 Kimi K2 Instruct: 69.6%
#34 QwQ-32B-Preview: 50.0%
#35 GPT-4.1 mini: 49.6%
#36 GPT-4.1: 48.1%
#37 o1-preview: 42.0%

LiveCodeBench

Rank #22 of 44
#19 DeepSeek R1 Zero: 50.0%
#20 Magistral Medium: 50.3%
#21 Magistral Small 2506: 51.3%
#22 QwQ-32B-Preview: 50.0%
#23 DeepSeek-V3 0324: 49.2%
#24 Llama 4 Maverick: 43.4%
#25 DeepSeek R1 Distill Llama 8B: 39.6%
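Because each benchmark has a different number of ranked models, raw ranks are hard to compare across benchmarks. The sketch below converts each "rank of total" pair from this page into the share of listed models that rank below QwQ-32B-Preview. The metric `(total - rank) / total` is an illustrative choice, not something the page defines.

```python
# (rank, total models ranked) for QwQ-32B-Preview, as listed on this page.
rankings = {
    "MATH-500": (16, 22),
    "GPQA": (40, 115),
    "AIME 2024": (34, 41),
    "LiveCodeBench": (22, 44),
}


def share_ranked_below(rank: int, total: int) -> float:
    """Fraction of ranked models placed below the given rank (illustrative metric)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) / total


shares = {name: share_ranked_below(rank, total) for name, (rank, total) in rankings.items()}

# Print benchmarks from strongest to weakest relative position.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ahead of {share:.0%} of ranked models")
```

By this measure the model's relative position is strongest on GPQA and weakest on AIME 2024, even though MATH-500 is its highest absolute score.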
All Benchmark Results for QwQ-32B-Preview
Complete list of benchmark scores with detailed information
Benchmark      Category  Modality  Score (0-1)  Score (%)  Source
MATH-500       math      text      0.91         90.6%      Self-reported
GPQA           general   text      0.65         65.2%      Self-reported
AIME 2024      general   text      0.50         50.0%      Self-reported
LiveCodeBench  code      text      0.50         50.0%      Self-reported