DeepSeek R1 Zero

Name: DeepSeek R1 Zero
Average score: 76.5% (4 benchmarks)
Author: DeepSeek

About

DeepSeek R1 Zero is a language model developed by DeepSeek and based on DeepSeek-V3. It averages 76.5% across 4 benchmarks, with its strongest results on MATH-500 (95.9%), AIME 2024 (86.7%), and GPQA (73.3%); its LiveCodeBench score is 50.0%. It is released under the MIT license, which permits commercial use. It was announced and released on Jan 20, 2025.

Timeline

Announced: Jan 20, 2025

Released: Jan 20, 2025

Specifications

Training Tokens: 14.8T

License & Family

License: MIT

Base Model: DeepSeek-V3

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

Benchmarks: 4

Average Score: 76.5%

Best Score: 95.9%

High Performers (80%+): 2 (MATH-500, AIME 2024)

Top Categories

math: 95.9%

general: 80.0%

code: 50.0%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

MATH-500

Rank #7 of 22

#4 Kimi-k1.5: 96.2%
#5 Claude 3.7 Sonnet: 96.2%
#6 Llama-3.3 Nemotron Super 49B v1: 96.6%
#7 DeepSeek R1 Zero: 95.9%
#8 Llama 3.1 Nemotron Nano 8B V1: 95.4%
#9 Phi 4 Mini Reasoning: 94.6%
#10 DeepSeek R1 Distill Llama 70B: 94.5%

AIME 2024

Rank #10 of 41

#7 DeepSeek R1 Distill Llama 70B: 86.7%
#8 o3-mini: 87.3%
#9 Gemini 2.5 Flash: 88.0%
#10 DeepSeek R1 Zero: 86.7%
#11 o1-pro: 86.0%
#12 Qwen3 235B A22B: 85.7%
#13 DeepSeek R1 Distill Qwen 7B: 83.3%

GPQA

Rank #23 of 115

#20 Gemini 2.0 Flash Thinking: 74.2%
#21 Kimi K2 Instruct: 75.1%
#22 Claude Sonnet 4: 75.4%
#23 DeepSeek R1 Zero: 73.3%
#24 o1-preview: 73.3%
#25 GPT OSS 120B: 71.5%
#26 DeepSeek-R1: 71.5%

LiveCodeBench

Rank #21 of 44

#18 Magistral Medium: 50.3%
#19 Magistral Small 2506: 51.3%
#20 DeepSeek R1 Distill Qwen 14B: 53.1%
#21 DeepSeek R1 Zero: 50.0%
#22 QwQ-32B-Preview: 50.0%
#23 DeepSeek-V3 0324: 49.2%
#24 Llama 4 Maverick: 43.4%

All Benchmark Results for DeepSeek R1 Zero

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
MATH-500	math	text	0.96	95.9%	Self-reported
AIME 2024	general	text	0.87	86.7%	Self-reported
GPQA	general	text	0.73	73.3%	Self-reported
LiveCodeBench	code	text	0.50	50.0%	Self-reported
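
The summary figures reported earlier on this page (average score, best score, the 80%+ group, and the per-category values) appear to follow directly from the four scores in this table. The sketch below reproduces them under two assumptions that the page does not state: that each category value is the plain mean of its benchmarks, and that results are rounded half-up to one decimal place. The names `RESULTS`, `summarize`, and `_mean` are hypothetical and used only for illustration.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# Scores (percent) and categories copied from the results table.
# Decimal avoids binary floating-point surprises when rounding 76.475.
RESULTS = [
    ("MATH-500", "math", Decimal("95.9")),
    ("AIME 2024", "general", Decimal("86.7")),
    ("GPQA", "general", Decimal("73.3")),
    ("LiveCodeBench", "code", Decimal("50.0")),
]


def _mean(values):
    """Arithmetic mean, rounded half-up to one decimal (assumed rounding rule)."""
    total = sum(values, Decimal("0"))
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def summarize(results, threshold=Decimal("80.0")):
    """Derive the overview figures from (name, category, score) rows."""
    scores = [score for _, _, score in results]
    by_category = defaultdict(list)
    for _, category, score in results:
        by_category[category].append(score)
    return {
        "average": _mean(scores),
        "best": max(scores),
        "high_performers": [name for name, _, s in results if s >= threshold],
        "category_average": {c: _mean(v) for c, v in by_category.items()},
    }


summary = summarize(RESULTS)
print(summary)
```

Under these assumptions the computed average is 76.5 and the "general" category comes out at 80.0, which matches the overview section; if the page used a different aggregation or rounding rule, the figures could differ in the last digit.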
