
DeepSeek VL2
Multimodal
Zero-eval
#1 MME
#2 MMT-Bench
#3 MMBench-V1.1
by DeepSeek
About
DeepSeek VL2 is a multimodal language model developed by DeepSeek. It averages 70.9% across 14 benchmarks, with its strongest results on DocVQA (93.3%), ChartQA (86.0%), and TextVQA (84.2%). Vision is its strongest category, with an average score of 76.7%. It supports a context window of roughly 259K tokens for handling large documents and is available through one API provider. As a multimodal model, it can process both text and images. It was released in December 2024.
Pricing Range
Input (per 1M): $9.50 - $9.50
Output (per 1M): $4800.00 - $4800.00
Providers: 1
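To make the pricing range concrete, the following is a minimal sketch of a per-request cost estimate. It assumes the listed figures are USD per 1M tokens and takes them exactly as shown on this page; the function name `estimate_cost` and the token counts in the usage line are hypothetical.

```python
from decimal import Decimal

# Prices as listed on this page, assumed to be USD per 1M tokens.
INPUT_PRICE_PER_1M = Decimal("9.50")
OUTPUT_PRICE_PER_1M = Decimal("4800.00")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    million = Decimal(1_000_000)
    cost = (
        input_tokens * INPUT_PRICE_PER_1M
        + output_tokens * OUTPUT_PRICE_PER_1M
    ) / million
    # Round to four decimal places for display.
    return cost.quantize(Decimal("0.0001"))


# Hypothetical request: 2,000 input tokens and 500 output tokens.
print(estimate_cost(2_000, 500))  # prints 2.4190
```

`Decimal` is used instead of `float` so that the currency arithmetic stays exact.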
Timeline
Announced: Dec 13, 2024
Released: Dec 13, 2024
Specifications
Capabilities
Multimodal
License & Family
License
deepseek
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
14 benchmarks
Average Score
70.9%
Best Score
93.3%
High Performers (80%+)
5
Performance Metrics
Max Context Window
258.6K
Avg Throughput
22.0 tok/s
Avg Latency
1ms
Top Categories
vision
76.7%
general
69.9%
roleplay
63.6%
math
62.8%
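As a rough illustration of what the throughput and latency figures listed under Performance Metrics imply, the sketch below estimates how long a response of a given length takes to generate. It assumes a simple linear model (a fixed latency plus output tokens divided by average throughput) and that the listed latency is the delay before generation starts; neither assumption is confirmed by this page, and `estimate_generation_seconds` is a hypothetical helper.

```python
# Figures as listed on this page.
AVG_THROUGHPUT_TOK_PER_S = 22.0  # average throughput in tokens per second
AVG_LATENCY_S = 0.001            # average latency of 1 ms, in seconds


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time for a response under a linear model."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


# Hypothetical response length of 440 tokens.
print(f"{estimate_generation_seconds(440):.1f} s")  # prints 20.0 s
```

Real response times vary with provider load and prompt length, so this is only a back-of-the-envelope figure.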
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DocVQA
Rank #11 of 26
#8 Nova Pro: 93.5%
#9 Grok-2: 93.6%
#10 Llama 4 Scout: 94.4%
#11 DeepSeek VL2: 93.3%
#12 Pixtral Large: 93.3%
#13 Grok-2 mini: 93.2%
#14 Phi-4-multimodal-instruct: 93.2%
ChartQA
Rank #11 of 24
#8 Nova Lite: 86.8%
#9 Qwen2.5 VL 7B Instruct: 87.3%
#10 Mistral Small 3.2 24B Instruct: 87.4%
#11 DeepSeek VL2: 86.0%
#12 GPT-4o: 85.7%
#13 Llama 3.2 90B Instruct: 85.5%
#14 Qwen2.5-Omni-7B: 85.3%
TextVQA
Rank #4 of 15
#1 Qwen2.5-Omni-7B: 84.4%
#2 Qwen2.5 VL 7B Instruct: 84.9%
#3 Qwen2-VL-72B-Instruct: 85.5%
#4 DeepSeek VL2: 84.2%
#5 DeepSeek VL2 Small: 83.4%
#6 Nova Pro: 81.5%
#7 DeepSeek VL2 Tiny: 80.7%
AI2D
Rank #13 of 17
#10 Phi-4-multimodal-instruct: 82.3%
#11 Qwen2.5-Omni-7B: 83.2%
#12 Gemma 3 12B: 84.2%
#13 DeepSeek VL2: 81.4%
#14 DeepSeek VL2 Small: 80.0%
#15 Phi-3.5-vision-instruct: 78.1%
#16 Gemma 3 4B: 74.8%
OCRBench
Rank #6 of 7
#3 DeepSeek VL2 Small: 83.4%
#4 Phi-4-multimodal-instruct: 84.4%
#5 Qwen2.5 VL 7B Instruct: 86.4%
#6 DeepSeek VL2: 81.1%
#7 DeepSeek VL2 Tiny: 80.9%
All Benchmark Results for DeepSeek VL2
Complete list of benchmark scores with detailed information
Benchmark | Description | Category | Modality | Score | Normalized | Source
DocVQA | DocVQA benchmark | vision | multimodal | 0.93 | 93.3% | Self-reported
ChartQA | ChartQA benchmark | general | multimodal | 0.86 | 86.0% | Self-reported
TextVQA | TextVQA benchmark | vision | multimodal | 0.84 | 84.2% | Self-reported
AI2D | AI2D benchmark | general | text | 0.81 | 81.4% | Self-reported
OCRBench | OCRBench benchmark | general | text | 0.81 | 81.1% | Self-reported
MMBench | MMBench benchmark | general | text | 0.80 | 79.6% | Self-reported
MMBench-V1.1 | MMBench-V1.1 benchmark | general | text | 0.79 | 79.2% | Self-reported
InfoVQA | InfoVQA benchmark | vision | multimodal | 0.78 | 78.1% | Self-reported
RealWorldQA | RealWorldQA benchmark | general | text | 0.68 | 68.4% | Self-reported
MMT-Bench | MMT-Bench benchmark | roleplay | text | 0.64 | 63.6% | Self-reported
Showing 1 to 10 of 14 benchmarks
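The summary figures can be cross-checked against the ten rows listed above. The sketch below is a minimal check, using only the normalized scores shown on this page: it counts the benchmarks at or above 80% and computes the mean of the listed rows. The variable names are illustrative.

```python
from statistics import mean

# Normalized scores (percent) for the ten benchmarks listed on this page.
scores = {
    "DocVQA": 93.3,
    "ChartQA": 86.0,
    "TextVQA": 84.2,
    "AI2D": 81.4,
    "OCRBench": 81.1,
    "MMBench": 79.6,
    "MMBench-V1.1": 79.2,
    "InfoVQA": 78.1,
    "RealWorldQA": 68.4,
    "MMT-Bench": 63.6,
}

# Benchmarks that meet the "High Performers (80%+)" threshold.
high_performers = [name for name, score in scores.items() if score >= 80.0]

# Mean over the ten listed rows only, rounded to one decimal place.
listed_mean = round(mean(scores.values()), 1)

print(len(high_performers))  # prints 5
print(listed_mean)           # prints 79.5
```

The count of five matches the "High Performers (80%+)" figure. The mean of the ten listed rows (79.5%) is higher than the reported 70.9% average because that average covers all 14 benchmarks, four of which are not shown here.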