DeepSeek-V3
Zero-eval
#1 HumanEval-Mul
#1 Aider-Polyglot Edit
#1 LongBench v2
+3 more
by DeepSeek
About
DeepSeek-V3 is a language model developed by DeepSeek. It achieves strong performance, averaging 67.2% across 20 benchmarks, with particularly high scores on DROP (91.6%), CLUEWSC (90.9%), and MATH-500 (90.2%). It supports a 262K-token context window for handling large documents and is available through 1 API provider. It was announced and released in December 2024.
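To get a feel for what a 262K-token window holds, the sketch below uses the common ~4-characters-per-token heuristic (an approximation; actual tokenizer counts vary by language and content) to check whether an input fits. The exact 262,144-token figure is an assumption inferred from the listed 262.1K.

```python
# Rough check of whether a document fits DeepSeek-V3's context window.
# Assumes ~4 characters per token (a common heuristic; real tokenizers vary)
# and a 262,144-token window (assumed from the listed 262.1K).
CONTEXT_WINDOW = 262_144
CHARS_PER_TOKEN = 4  # heuristic, not the model's actual tokenizer

def fits_in_context(text_chars: int, reserved_output: int = 4_096) -> bool:
    """Return True if an input of `text_chars` characters likely fits,
    leaving `reserved_output` tokens of headroom for the reply."""
    estimated_tokens = text_chars / CHARS_PER_TOKEN
    return estimated_tokens + reserved_output <= CONTEXT_WINDOW

print(fits_in_context(100_000))    # ~25K tokens: fits comfortably
print(fits_in_context(2_000_000))  # ~500K tokens: too large
```

A real integration should count tokens with the provider's tokenizer rather than this character heuristic.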
Pricing Range
Input (per 1M tokens): $0.27
Output (per 1M tokens): $1.10
Providers: 1
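At the listed rates ($0.27 per 1M input tokens, $1.10 per 1M output tokens), estimating a request's cost is simple arithmetic; a minimal sketch:

```python
# Cost estimate at the listed rates: $0.27 per 1M input tokens,
# $1.10 per 1M output tokens.
INPUT_PER_M = 0.27
OUTPUT_PER_M = 1.10

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# 1M tokens in and 1M out costs $0.27 + $1.10 = $1.37.
print(f"${request_cost(1_000_000, 1_000_000):.2f}")
```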
Timeline
Announced: Dec 25, 2024
Released: Dec 25, 2024
Specifications
Training Tokens: 14.8T
License & Family
License
MIT + Model License (Commercial use allowed)
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
20 benchmarks
Average Score
67.2%
Best Score
91.6%
High Performers (80%+)
8
Performance Metrics
Max Context Window
262.1K
Avg Throughput
100.0 tok/s
Avg Latency
1ms
Top Categories
math
90.2%
code
73.2%
general
65.1%
long_context
48.7%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DROP
Rank #2 of 28
#1 DeepSeek-R1
92.2%
#2 DeepSeek-V3
91.6%
#3 Claude 3.5 Sonnet
87.1%
#4 Claude 3.5 Sonnet
87.1%
#5 GPT-4 Turbo
86.0%
CLUEWSC
Rank #3 of 3
#1 DeepSeek-R1
92.8%
#2 Kimi-k1.5
91.4%
#3 DeepSeek-V3
90.9%
MATH-500
Rank #17 of 22
#14 DeepSeek R1 Distill Qwen 7B
92.8%
#15 QwQ-32B-Preview
90.6%
#16 QwQ-32B
90.6%
#17 DeepSeek-V3
90.2%
#18 o1-mini
90.0%
#19 DeepSeek R1 Distill Llama 8B
89.1%
#20 DeepSeek R1 Distill Qwen 1.5B
83.9%
MMLU-Redux
Rank #5 of 13
#2 Qwen3-235B-A22B-Instruct-2507
93.1%
#3 DeepSeek-R1
92.9%
#4 Kimi K2 Instruct
92.7%
#5 DeepSeek-V3
89.1%
#6 Qwen3 235B A22B
87.4%
#7 Qwen2.5 72B Instruct
86.8%
#8 Qwen2.5 32B Instruct
83.9%
MMLU
Rank #11 of 78
#8 GPT-4.1
90.2%
#9 Kimi K2 Instruct
89.5%
#10 GPT-4o
88.7%
#11 DeepSeek-V3
88.5%
#12 Qwen3 235B A22B
87.8%
#13 Kimi K2 Base
87.8%
#14 GPT-4.1 mini
87.5%
All Benchmark Results for DeepSeek-V3
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Normalized | Score | Source
DROP | general | text | 0.92 | 91.6% | Self-reported
CLUEWSC | general | text | 0.91 | 90.9% | Self-reported
MATH-500 | math | text | 0.90 | 90.2% | Self-reported
MMLU-Redux | general | text | 0.89 | 89.1% | Self-reported
MMLU | general | text | 0.89 | 88.5% | Self-reported
C-Eval | code | text | 0.86 | 86.5% | Self-reported
IFEval | code | text | 0.86 | 86.1% | Self-reported
HumanEval-Mul | code | text | 0.83 | 82.6% | Self-reported
Aider-Polyglot Edit | general | text | 0.80 | 79.7% | Self-reported
MMLU-Pro | general | text | 0.76 | 75.9% | Self-reported
Showing 1 to 10 of 20 benchmarks
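As a quick sanity check on the table above, the snippet below averages the ten scores shown; it comes out around 86.1%, well above the 67.2% overall average, because this page lists only the top 10 of the model's 20 benchmarks.

```python
# Average of the ten benchmark scores shown on this page (top 10 of 20).
scores = {
    "DROP": 91.6, "CLUEWSC": 90.9, "MATH-500": 90.2,
    "MMLU-Redux": 89.1, "MMLU": 88.5, "C-Eval": 86.5,
    "IFEval": 86.1, "HumanEval-Mul": 82.6,
    "Aider-Polyglot Edit": 79.7, "MMLU-Pro": 75.9,
}
top10_avg = sum(scores.values()) / len(scores)
print(f"{top10_avg:.2f}%")  # higher than the 67.2% average over all 20 benchmarks
```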