DeepSeek R1 Distill Qwen 1.5B

by DeepSeek

About

DeepSeek R1 Distill Qwen 1.5B is a 1.5B-parameter language model from DeepSeek, distilled from DeepSeek-R1's reasoning outputs onto a Qwen base model and released in January 2025. Across the 4 benchmarks tracked here it is strongest on math reasoning, scoring 83.9% on MATH-500 and 52.7% on AIME 2024, while GPQA (33.8%) and LiveCodeBench (16.9%) remain weak points typical of a model this small. It is released under the MIT license, which permits commercial use.
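For readers who want to try the model, here is a minimal sketch of loading it with Hugging Face transformers. The checkpoint id deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B and the generation settings are assumptions, not taken from this page.

```python
# Minimal sketch: load and query the model via transformers.
# Assumes the Hugging Face checkpoint "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
# and a recent transformers + torch install (assumptions, not from this page).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Distilled R1 models emit a chain of thought before the final answer,
# so leave generous room in max_new_tokens.
prompt = "Solve: what is the sum of the first 10 positive integers?"
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```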

Timeline
Announced: Jan 20, 2025
Released: Jan 20, 2025

Specifications
Training Tokens: 14.8T
License: MIT
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance (4 benchmarks)
Average Score: 46.8%
Best Score: 83.9%
High Performers (80%+): 1

Top Categories
math: 83.9%
general: 43.3%
code: 16.9%
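The summary numbers above follow directly from the four scores on this page; a quick sanity check in Python:

```python
# Sanity-checking the summary stats from the four scores on this page.
scores = {
    "MATH-500": (83.9, "math"),
    "AIME 2024": (52.7, "general"),
    "GPQA": (33.8, "general"),
    "LiveCodeBench": (16.9, "code"),
}

values = [v for v, _ in scores.values()]
print(f"Average score: {sum(values) / len(values):.1f}%")          # 46.8%
print(f"Best score: {max(values):.1f}%")                           # 83.9%
print(f"High performers (80%+): {sum(v >= 80 for v in values)}")   # 1

# Per-category means; note the page rounds the general mean (43.25) up to 43.3%.
for cat in ("math", "general", "code"):
    cat_vals = [v for v, c in scores.values() if c == cat]
    print(f"{cat}: {sum(cat_vals) / len(cat_vals)}%")
```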
Ranking Across Benchmarks
Position relative to other models on each benchmark (scores normalized to 0-100%); a percentile sketch follows the lists below.

MATH-500

Rank #20 of 22
#17 DeepSeek R1 Distill Llama 8B: 89.1%
#18 o1-mini: 90.0%
#19 DeepSeek-V3: 90.2%
#20 DeepSeek R1 Distill Qwen 1.5B: 83.9%
#21 Granite 3.3 8B Base: 69.0%
#22 Granite 3.3 8B Instruct: 69.0%

AIME 2024

Rank #33 of 41
#30 DeepSeek-V3 0324: 59.4%
#31 Kimi K2 Instruct: 69.6%
#32 Magistral Small 2506: 70.7%
#33 DeepSeek R1 Distill Qwen 1.5B: 52.7%
#34 QwQ-32B-Preview: 50.0%
#35 GPT-4.1 mini: 49.6%
#36 GPT-4.1: 48.1%

GPQA

Rank #97 of 115
#94 Mistral Small 3 24B Base: 34.4%
#95 GPT-4: 35.7%
#96 Grok-1.5: 35.9%
#97 DeepSeek R1 Distill Qwen 1.5B: 33.8%
#98 Claude 3 Haiku: 33.3%
#99 Llama 3.2 3B Instruct: 32.8%
#100 Llama 3.2 11B Instruct: 32.8%

LiveCodeBench

Rank #38 of 44
#35 Qwen2.5-Coder 7B Instruct: 18.2%
#36 Gemma 3 12B: 24.6%
#37 Qwen2 7B Instruct: 26.6%
#38 DeepSeek R1 Distill Qwen 1.5B: 16.9%
#39 Gemma 3n E4B Instructed: 13.2%
#40 Gemma 3n E2B Instructed LiteRT (Preview): 13.2%
#41 Gemma 3n E4B Instructed LiteRT Preview: 13.2%
All Benchmark Results for DeepSeek R1 Distill Qwen 1.5B
Complete list of benchmark scores with detailed information

Benchmark     | Category | Modality | Raw Score | Normalized | Source
MATH-500      | math     | text     | 0.84      | 83.9%      | Self-reported
AIME 2024     | general  | text     | 0.53      | 52.7%      | Self-reported
GPQA          | general  | text     | 0.34      | 33.8%      | Self-reported
LiveCodeBench | code     | text     | 0.17      | 16.9%      | Self-reported
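For programmatic use, the same table can be expressed as a small data structure. The two-decimal raw scores above look like roundings of the percentages, so this sketch derives the raw values from the percentages instead:

```python
# The results table as structured records. Raw values here are taken
# from the page's percentages (0.839, not the rounded 0.84, etc.).
results = [
    {"benchmark": "MATH-500",      "category": "math",    "raw": 0.839},
    {"benchmark": "AIME 2024",     "category": "general", "raw": 0.527},
    {"benchmark": "GPQA",          "category": "general", "raw": 0.338},
    {"benchmark": "LiveCodeBench", "category": "code",    "raw": 0.169},
]
for r in results:
    # All four scores on this page are self-reported, text-modality runs.
    print(f'{r["benchmark"]}: {100 * r["raw"]:.1f}% ({r["category"]})')
```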