
DeepSeek R1 Distill Llama 70B
by DeepSeek
About
DeepSeek R1 Distill Llama 70B is a language model developed by DeepSeek, created by distilling DeepSeek R1's reasoning capabilities into a Llama-based 70B model. It achieves strong overall performance, with an average score of 76.0% across 4 benchmarks, and is strongest on MATH-500 (94.5%) and AIME 2024 (86.7%); its GPQA (65.2%) and LiveCodeBench (57.5%) results are more modest. It supports a 256K-token context window for handling large documents and is available through 1 API provider. Its MIT license permits commercial use, making it suitable for enterprise applications. It was announced and released on January 20, 2025.
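A minimal sketch of querying the model through an OpenAI-compatible endpoint, which most hosted providers expose; the base URL and the exact model identifier below are placeholders and vary by provider.

```python
# Minimal sketch: calling DeepSeek R1 Distill Llama 70B through an
# OpenAI-compatible chat endpoint. The base_url and model string are
# hypothetical placeholders; substitute your provider's actual values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",  # identifier varies by provider
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=2048,  # R1-style models emit long reasoning traces
)
print(response.choices[0].message.content)
```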
Pricing Range
Input (per 1M): $0.10 - $0.10
Output (per 1M): $0.40 - $0.40
Providers: 1
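With a single provider listed, the range collapses to a point price. A quick back-of-the-envelope cost check at the rates above:

```python
# Back-of-the-envelope request cost at the listed rates:
# $0.10 per 1M input tokens, $0.40 per 1M output tokens.
INPUT_USD_PER_M = 0.10
OUTPUT_USD_PER_M = 0.40

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 1,500-token reply:
print(f"${request_cost(2_000, 1_500):.6f}")  # $0.000800
```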
Timeline
Announced: Jan 20, 2025
Released: Jan 20, 2025
Specifications
Training Tokens: 14.8T
License & Family
License
MIT
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
4 benchmarks
Average Score
76.0%
Best Score
94.5%
High Performers (80%+)
2
Performance Metrics
Max Context Window
256.0K
Avg Throughput
37.0 tok/s
Avg Latency
1ms
Top Categories
math
94.5%
general
76.0%
code
57.5%
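Given the averaged metrics above, a rough wall-clock estimate for a completion follows from output length divided by throughput; the 1 ms latency figure is taken at face value here, though real time-to-first-token will typically be higher.

```python
# Rough response-time estimate from the averaged metrics above:
# ~37.0 tok/s decode throughput, 1 ms reported latency.
THROUGHPUT_TOK_PER_S = 37.0
LATENCY_S = 0.001

def response_time_s(output_tokens: int) -> float:
    """Approximate seconds until a completion of the given length finishes."""
    return LATENCY_S + output_tokens / THROUGHPUT_TOK_PER_S

print(f"{response_time_s(1_500):.1f} s")  # ~40.5 s for a 1,500-token reply
```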
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
MATH-500
Rank #10 of 22
#7 DeepSeek R1 Zero
95.9%
#8 Llama 3.1 Nemotron Nano 8B V1
95.4%
#9 Phi 4 Mini Reasoning
94.6%
#10 DeepSeek R1 Distill Llama 70B
94.5%
#11 DeepSeek R1 Distill Qwen 32B
94.3%
#12 DeepSeek-V3 0324
94.0%
#13 DeepSeek R1 Distill Qwen 14B
93.9%
AIME 2024
Rank #9 of 41
#6 DeepSeek-R1-0528
91.4%
#7 Gemini 2.5 Flash
88.0%
#8 o3-mini
87.3%
#9 DeepSeek R1 Distill Llama 70B
86.7%
#10 DeepSeek R1 Zero
86.7%
#11 o1-pro
86.0%
#12 Qwen3 235B A22B
85.7%
GPQA
Rank #41 of 115
#38 Phi 4 Reasoning
65.8%
#39 Qwen3 30B A3B
65.8%
#40 QwQ-32B-Preview
65.2%
#41 DeepSeek R1 Distill Llama 70B
65.2%
#42 QwQ-32B
65.2%
#43 GPT-4.1 mini
65.0%
#44 Gemini 2.5 Flash-Lite
64.6%
LiveCodeBench
Rank #13 of 44
#10 Qwen3 32B
65.7%
#11 QwQ-32B
63.4%
#12 Qwen3 30B A3B
62.6%
#13 DeepSeek R1 Distill Llama 70B
57.5%
#14 DeepSeek R1 Distill Qwen 32B
57.2%
#15 Qwen2.5 72B Instruct
55.5%
#16 Phi 4 Reasoning
53.8%
All Benchmark Results for DeepSeek R1 Distill Llama 70B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
MATH-500 | math | text | 0.94 | 94.5% | Self-reported
AIME 2024 | general | text | 0.87 | 86.7% | Self-reported
GPQA | general | text | 0.65 | 65.2% | Self-reported
LiveCodeBench | code | text | 0.57 | 57.5% | Self-reported
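The card's headline numbers can be reproduced from these four rows: the overall average is the unweighted mean of the normalized scores, and each Top Categories value is the mean within its category. A small sketch (note that "general" averages 75.95%, which the card displays as 76.0%):

```python
# Reproducing the card's summary statistics from the table above.
from collections import defaultdict
from statistics import mean

rows = {
    "MATH-500":      ("math",    94.5),
    "AIME 2024":     ("general", 86.7),
    "GPQA":          ("general", 65.2),
    "LiveCodeBench": ("code",    57.5),
}

pcts = [pct for _, pct in rows.values()]
print(f"Average score: {mean(pcts):.1f}%")  # 76.0%
print(f"Best score:    {max(pcts):.1f}%")   # 94.5%

by_category = defaultdict(list)
for category, pct in rows.values():
    by_category[category].append(pct)
for category, vals in by_category.items():
    # math 94.50, general 75.95 (card shows 76.0), code 57.50
    print(f"{category}: {mean(vals):.2f}%")
```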