
DeepSeek R1 Distill Qwen 1.5B
by DeepSeek
About
DeepSeek R1 Distill Qwen 1.5B is a 1.5B-parameter language model from DeepSeek, produced by distilling DeepSeek-R1 reasoning outputs onto a Qwen 1.5B base. Across the 4 benchmarks reported here it averages 46.8%, with its strongest result on MATH-500 (83.9%), followed by AIME 2024 (52.7%) and GPQA (33.8%). It is released under the MIT license, which permits commercial use. Announced and released on January 20, 2025, it is the smallest member of DeepSeek's R1 distillation family.
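Because the weights are MIT-licensed, local inference is straightforward. Below is a minimal sketch using Hugging Face Transformers; the repository id deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B is an assumption on my part (it is not stated on this page), so verify it before use.

```python
# Minimal local-inference sketch for DeepSeek R1 Distill Qwen 1.5B.
# Assumption: the checkpoint is hosted on Hugging Face as
# "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B" (not confirmed by this page).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# The distilled model is tuned for step-by-step reasoning, so a math-style
# prompt is a natural smoke test.
messages = [{"role": "user", "content": "Solve: what is 15% of 240?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```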
Timeline
Announced: Jan 20, 2025
Released: Jan 20, 2025
Specifications
Training Tokens: 14.8T
License & Family
License: MIT
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance (4 benchmarks)
Average Score: 46.8%
Best Score: 83.9%
High Performers (80%+): 1
Top Categories
math: 83.9%
general: 43.3%
code: 16.9%
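The overview numbers above are plain arithmetic over the four reported benchmarks. The sketch below reproduces the average, best score, high-performer count, and per-category means from the scores listed on this page.

```python
# Reproduce the overview statistics from the four benchmark scores on this page.
from collections import defaultdict

scores = {
    "MATH-500":      ("math",    83.9),
    "AIME 2024":     ("general", 52.7),
    "GPQA":          ("general", 33.8),
    "LiveCodeBench": ("code",    16.9),
}

values = [v for _, v in scores.values()]
print(f"Average Score: {sum(values) / len(values):.1f}%")          # 46.8%
print(f"Best Score: {max(values):.1f}%")                           # 83.9%
print(f"High Performers (80%+): {sum(v >= 80 for v in values)}")   # 1

# Per-category means: math 83.9, code 16.9, general (52.7 + 33.8) / 2 = 43.25,
# which the page displays rounded up to 43.3%.
by_category = defaultdict(list)
for category, value in scores.values():
    by_category[category].append(value)
for category, vals in by_category.items():
    print(f"{category}: {sum(vals) / len(vals)}%")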
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
MATH-500
Rank #20 of 22
#17 DeepSeek R1 Distill Llama 8B: 89.1%
#18 o1-mini: 90.0%
#19 DeepSeek-V3: 90.2%
#20 DeepSeek R1 Distill Qwen 1.5B: 83.9%
#21 Granite 3.3 8B Base: 69.0%
#22 Granite 3.3 8B Instruct: 69.0%
AIME 2024
Rank #33 of 41
#30 DeepSeek-V3 0324: 59.4%
#31 Kimi K2 Instruct: 69.6%
#32 Magistral Small 2506: 70.7%
#33 DeepSeek R1 Distill Qwen 1.5B: 52.7%
#34 QwQ-32B-Preview: 50.0%
#35 GPT-4.1 mini: 49.6%
#36 GPT-4.1: 48.1%
GPQA
Rank #97 of 115
#94 Mistral Small 3 24B Base: 34.4%
#95 GPT-4: 35.7%
#96 Grok-1.5: 35.9%
#97 DeepSeek R1 Distill Qwen 1.5B: 33.8%
#98 Claude 3 Haiku: 33.3%
#99 Llama 3.2 3B Instruct: 32.8%
#100 Llama 3.2 11B Instruct: 32.8%
LiveCodeBench
Rank #38 of 44
#35 Qwen2.5-Coder 7B Instruct: 18.2%
#36 Gemma 3 12B: 24.6%
#37 Qwen2 7B Instruct: 26.6%
#38 DeepSeek R1 Distill Qwen 1.5B: 16.9%
#39 Gemma 3n E4B Instructed: 13.2%
#40 Gemma 3n E2B Instructed LiteRT (Preview): 13.2%
#41 Gemma 3n E4B Instructed LiteRT Preview: 13.2%
All Benchmark Results for DeepSeek R1 Distill Qwen 1.5B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
MATH-500 | math | text | 0.84 | 83.9% | Self-reported
AIME 2024 | general | text | 0.53 | 52.7% | Self-reported
GPQA | general | text | 0.34 | 33.8% | Self-reported
LiveCodeBench | code | text | 0.17 | 16.9% | Self-reported
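Each row pairs a raw score on a 0-1 scale with its normalized 0-100% value. The short sketch below parses the pipe-delimited rows and checks that relationship; the field names are my own labels for the columns, not names given on this page.

```python
# Parse the pipe-delimited results above and confirm that the normalized
# percentage is the raw 0-1 score scaled by 100 (up to the two-decimal
# rounding of the raw column). Field names are assumed labels.
rows = """
MATH-500      | math    | text | 0.84 | 83.9% | Self-reported
AIME 2024     | general | text | 0.53 | 52.7% | Self-reported
GPQA          | general | text | 0.34 | 33.8% | Self-reported
LiveCodeBench | code    | text | 0.17 | 16.9% | Self-reported
""".strip().splitlines()

for row in rows:
    benchmark, category, modality, raw, pct, source = [f.strip() for f in row.split("|")]
    raw_value = float(raw)
    pct_value = float(pct.rstrip("%"))
    # Allow a small tolerance because the raw column is rounded to two decimals.
    assert abs(raw_value * 100 - pct_value) <= 0.5, (benchmark, raw_value, pct_value)
    print(f"{benchmark}: raw {raw_value:.2f} -> {pct_value}% ({category}/{modality}, {source})")
```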