DeepSeek R1 Distill Qwen 14B

by DeepSeek

About

DeepSeek R1 Distill Qwen 14B is a 14B-parameter language model from DeepSeek, created by fine-tuning Qwen 2.5 14B on reasoning data generated by DeepSeek-R1. It averages 71.5% across 4 benchmarks, with its strongest results on MATH-500 (93.9%) and AIME 2024 (80.0%); GPQA (59.1%) and LiveCodeBench (53.1%) trail behind. Released in January 2025 under the MIT license, it is available for commercial use, making it suitable for enterprise applications.
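For reference, a minimal inference sketch using the Hugging Face transformers library and the published checkpoint deepseek-ai/DeepSeek-R1-Distill-Qwen-14B. The sampling settings are illustrative (DeepSeek's model card recommends temperatures around 0.6), not the configuration behind the scores below, and a GPU with roughly 28 GB of memory is assumed for bf16 weights:

```python
# Minimal sketch: run DeepSeek R1 Distill Qwen 14B via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The distilled R1 models emit chain-of-thought (inside <think> tags)
# before the final answer, so leave generous room for new tokens.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=1024, do_sample=True, temperature=0.6
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```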

Timeline
Announced: Jan 20, 2025
Released: Jan 20, 2025

Specifications
Training Tokens: 14.8T

License & Family
License: MIT
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance (4 benchmarks)
Average Score: 71.5%
Best Score: 93.9%
High Performers (80%+): 2

Top Categories
math: 93.9%
general: 69.5%
code: 53.1%
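The summary figures above are straightforward arithmetic over the four benchmark scores; a quick Python sketch reproduces them (benchmark names and category labels are taken from this page):

```python
# Reproduce the overview statistics from the four benchmark scores.
scores = {
    "MATH-500": (93.9, "math"),
    "AIME 2024": (80.0, "general"),
    "GPQA": (59.1, "general"),
    "LiveCodeBench": (53.1, "code"),
}

values = [s for s, _ in scores.values()]
print(f"Average score: {sum(values) / len(values):.1f}%")        # 71.5%
print(f"Best score: {max(values):.1f}%")                         # 93.9%
print(f"High performers (80%+): {sum(v >= 80 for v in values)}") # 2

# Category averages match the Top Categories breakdown.
for cat in ("math", "general", "code"):
    vs = [s for s, c in scores.values() if c == cat]
    print(f"{cat}: {sum(vs) / len(vs):.1f}%")  # math 93.9, general 69.5, code 53.1
```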
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

MATH-500

Rank #13 of 22
#10 DeepSeek R1 Distill Llama 70B: 94.5%
#11 DeepSeek R1 Distill Qwen 32B: 94.3%
#12 DeepSeek-V3 0324: 94.0%
#13 DeepSeek R1 Distill Qwen 14B: 93.9%
#14 DeepSeek R1 Distill Qwen 7B: 92.8%
#15 QwQ-32B: 90.6%
#16 QwQ-32B-Preview: 90.6%

AIME 2024

Rank #20 of 41
#17 Granite 3.3 8B Base: 81.2%
#18 Granite 3.3 8B Instruct: 81.2%
#19 Qwen3 30B A3B: 80.4%
#20 DeepSeek R1 Distill Qwen 14B: 80.0%
#21 DeepSeek R1 Distill Llama 8B: 80.0%
#22 Claude 3.7 Sonnet: 80.0%
#23 DeepSeek-R1: 79.8%

GPQA

Rank #50 of 115
#47 o1-mini: 60.0%
#48 Claude 3.5 Sonnet: 59.4%
#49 DeepSeek-V3: 59.1%
#50 DeepSeek R1 Distill Qwen 14B: 59.1%
#51 Gemini 1.5 Pro: 59.1%
#52 Llama 4 Scout: 57.2%
#53 Phi 4: 56.1%

LiveCodeBench

Rank #18 of 44
#15 Qwen2.5 72B Instruct: 55.5%
#16 Phi 4 Reasoning: 53.8%
#17 Phi 4 Reasoning Plus: 53.1%
#18 DeepSeek R1 Distill Qwen 14B: 53.1%
#19 Magistral Small 2506: 51.3%
#20 Magistral Medium: 50.3%
#21 DeepSeek R1 Zero: 50.0%
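Rank here appears to be the model's 1-based position after sorting all evaluated models on a benchmark by score, descending, with ties broken by list order. A minimal sketch of that computation, using the MATH-500 neighborhood above as sample data (not the full 22-model list):

```python
# Derive a model's rank on a benchmark by sorting scores descending.
# Sample data: the MATH-500 neighborhood shown above.
math500 = [
    ("DeepSeek R1 Distill Llama 70B", 94.5),
    ("DeepSeek R1 Distill Qwen 32B", 94.3),
    ("DeepSeek-V3 0324", 94.0),
    ("DeepSeek R1 Distill Qwen 14B", 93.9),
    ("DeepSeek R1 Distill Qwen 7B", 92.8),
    ("QwQ-32B", 90.6),
    ("QwQ-32B-Preview", 90.6),
]

def rank_of(model: str, entries: list[tuple[str, float]]) -> int:
    """1-based rank after a stable sort by score, highest first."""
    ordered = sorted(entries, key=lambda e: e[1], reverse=True)
    return 1 + [name for name, _ in ordered].index(model)

print(rank_of("DeepSeek R1 Distill Qwen 14B", math500))  # 4 in this slice (#13 overall)
```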
All Benchmark Results for DeepSeek R1 Distill Qwen 14B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
MATH-500 | math | text | 0.94 | 93.9% | Self-reported
AIME 2024 | general | text | 0.80 | 80.0% | Self-reported
GPQA | general | text | 0.59 | 59.1% | Self-reported
LiveCodeBench | code | text | 0.53 | 53.1% | Self-reported
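The Normalized column is just the raw 0-1 score expressed as a percentage. A small sketch of the conversion; the three-decimal raw values below are inferred from the displayed percentages, since the table rounds raw scores to two decimals:

```python
# Normalized score = raw 0-1 score expressed as a percentage.
raw_scores = {
    "MATH-500": 0.939,       # displayed as 0.94 at two decimal places
    "AIME 2024": 0.800,
    "GPQA": 0.591,
    "LiveCodeBench": 0.531,
}

for name, raw in raw_scores.items():
    print(f"{name}: raw {raw:.2f} -> normalized {raw * 100:.1f}%")
```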