DeepSeek VL2

Name: DeepSeek VL2
Price: $9.50 per 1M input tokens
Average score: 70.9% (14 benchmarks)
Author: DeepSeek

Multimodal

Zero-eval

#1 MME

#2 MMT-Bench

#3 MMBench-V1.1

About

DeepSeek VL2 is a multimodal language model developed by DeepSeek. It averages 70.9% across 14 benchmarks, with its best scores on DocVQA (93.3%), ChartQA (86.0%), and TextVQA (84.2%). Vision is its strongest category, with an average of 76.7%. It supports a context window of about 259K tokens and is available through 1 API provider. As a multimodal model, it processes text, images, and other input formats. It was announced and released on Dec 13, 2024.

Pricing Range

Input (per 1M tokens): $9.50 - $9.50

Output (per 1M tokens): $4800.00 - $4800.00

Providers: 1
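
To show how the per-1M-token prices above turn into a per-request cost, here is a minimal sketch. The helper name `estimate_cost_usd` is hypothetical, and the sketch assumes simple linear pricing with no minimum charge, caching discount, or image surcharge, none of which the listing specifies. The prices are copied from the listing as-is.

```python
# Prices copied from the listing (USD per 1M tokens).
INPUT_USD_PER_M = 9.50
OUTPUT_USD_PER_M = 4800.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: assumes cost scales linearly with token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example request: 10,000 input tokens and 500 output tokens.
    print(f"{estimate_cost_usd(10_000, 500):.4f} USD")
```

At these listed prices the output side dominates the cost of a request, so it helps to estimate both terms separately before sending large batches.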

Timeline

Announced: Dec 13, 2024

Released: Dec 13, 2024

Specifications

Capabilities

Multimodal

License & Family

License: deepseek

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

14 benchmarks

Average Score: 70.9%

Best Score: 93.3%

High Performers (80%+)

Performance Metrics

Max Context Window: 258.6K

Avg Throughput: 22.0 tok/s

Avg Latency: 1 ms
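
As a rough illustration of what the throughput and latency figures imply for response time, the sketch below adds the listed latency to the time needed to generate a given number of tokens. It assumes that "Avg Latency" means delay before the first token and that "Avg Throughput" means output tokens per second; the page does not define either term, and the helper name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the listing.
AVG_THROUGHPUT_TOK_PER_S = 22.0
AVG_LATENCY_S = 0.001  # 1 ms


def estimate_response_seconds(output_tokens: int) -> float:
    """Hypothetical estimate: fixed latency plus steady-rate generation time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.1f} s")
```

Under these assumptions, generation time is almost entirely determined by throughput; the 1 ms latency is negligible for any non-trivial output length.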

Top Categories

vision: 76.7%

general: 69.9%

roleplay: 63.6%

math: 62.8%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

DocVQA

Rank #11 of 26

#8	Nova Pro	93.5%
#9	Grok-2	93.6%
#10	Llama 4 Scout	94.4%
#11	DeepSeek VL2	93.3%
#12	Pixtral Large	93.3%
#13	Grok-2 mini	93.2%
#14	Phi-4-multimodal-instruct	93.2%

ChartQA

Rank #11 of 24

#8	Nova Lite	86.8%
#9	Qwen2.5 VL 7B Instruct	87.3%
#10	Mistral Small 3.2 24B Instruct	87.4%
#11	DeepSeek VL2	86.0%
#12	GPT-4o	85.7%
#13	Llama 3.2 90B Instruct	85.5%
#14	Qwen2.5-Omni-7B	85.3%

TextVQA

Rank #4 of 15

#1	Qwen2.5-Omni-7B	84.4%
#2	Qwen2.5 VL 7B Instruct	84.9%
#3	Qwen2-VL-72B-Instruct	85.5%
#4	DeepSeek VL2	84.2%
#5	DeepSeek VL2 Small	83.4%
#6	Nova Pro	81.5%
#7	DeepSeek VL2 Tiny	80.7%

AI2D

Rank #13 of 17

#10	Phi-4-multimodal-instruct	82.3%
#11	Qwen2.5-Omni-7B	83.2%
#12	Gemma 3 12B	84.2%
#13	DeepSeek VL2	81.4%
#14	DeepSeek VL2 Small	80.0%
#15	Phi-3.5-vision-instruct	78.1%
#16	Gemma 3 4B	74.8%

OCRBench

Rank #6 of 7

#3	DeepSeek VL2 Small	83.4%
#4	Phi-4-multimodal-instruct	84.4%
#5	Qwen2.5 VL 7B Instruct	86.4%
#6	DeepSeek VL2	81.1%
#7	DeepSeek VL2 Tiny	80.9%

All Benchmark Results for DeepSeek VL2

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
DocVQA	vision	multimodal	0.93	93.3%	Self-reported
ChartQA	general	multimodal	0.86	86.0%	Self-reported
TextVQA	vision	multimodal	0.84	84.2%	Self-reported
AI2D	general	text	0.81	81.4%	Self-reported
OCRBench	general	text	0.81	81.1%	Self-reported
MMBench	general	text	0.80	79.6%	Self-reported
MMBench-V1.1	general	text	0.79	79.2%	Self-reported
InfoVQA	vision	multimodal	0.78	78.1%	Self-reported
RealWorldQA	general	text	0.68	68.4%	Self-reported
MMT-Bench	roleplay	text	0.64	63.6%	Self-reported

Showing 1 to 10 of 14 benchmarks
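
For readers who want to work with these rows directly, here is a minimal sketch that groups the ten visible results by category and averages them. The data is copied from the table above; the names `VISIBLE_ROWS` and `category_means` are hypothetical. Because 4 of the 14 benchmarks are not shown, these averages cover the visible rows only and should not be expected to reproduce the page's category averages (for example, 76.7% for vision).

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, normalized score in %) for the 10 visible rows only.
VISIBLE_ROWS = [
    ("DocVQA", "vision", 93.3),
    ("ChartQA", "general", 86.0),
    ("TextVQA", "vision", 84.2),
    ("AI2D", "general", 81.4),
    ("OCRBench", "general", 81.1),
    ("MMBench", "general", 79.6),
    ("MMBench-V1.1", "general", 79.2),
    ("InfoVQA", "vision", 78.1),
    ("RealWorldQA", "general", 68.4),
    ("MMT-Bench", "roleplay", 63.6),
]


def category_means(rows):
    """Group normalized scores by category and average each group."""
    grouped = defaultdict(list)
    for _name, category, score in rows:
        grouped[category].append(score)
    return {category: mean(scores) for category, scores in grouped.items()}


if __name__ == "__main__":
    for category, avg in sorted(category_means(VISIBLE_ROWS).items()):
        print(f"{category}: {avg:.1f}%")
```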
