
Pixtral-12B
by Mistral AI

Multimodal
Zero-eval
#1 MM IF-Eval
#2 VQAv2
#2 MM-MT-Bench

About

Pixtral-12B is a multimodal language model developed by Mistral AI. It achieves an average score of 66.9% across 12 benchmarks, with its strongest results on DocVQA (90.7%), ChartQA (81.8%), and VQAv2 (78.6%). Its best category is general tasks, at an average of 75.5%. The model supports a 136K-token context window for handling large documents and is currently available through one API provider. As a multimodal model, it can process and understand both text and images. It is released under the Apache 2.0 license, which permits commercial use and makes it suitable for enterprise applications. Pixtral-12B was announced and released in September 2024.
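The page lists one API provider for Pixtral-12B. As a rough illustration of a multimodal request, the sketch below uses Mistral's own Python client (the mistralai package) with the hosted model id pixtral-12b-2409; the client, model id, and request shape are assumptions based on Mistral's public SDK, not details taken from this page.

import os
from mistralai import Mistral

# Minimal sketch: send a combined text + image prompt to Pixtral-12B.
# Assumes the mistralai v1 Python client and the hosted model id
# "pixtral-12b-2409"; neither is specified on this page.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="pixtral-12b-2409",  # assumed model id
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the chart in this image."},
                # Image input passed as a URL content part (assumed request shape).
                {"type": "image_url", "image_url": "https://example.com/chart.png"},
            ],
        }
    ],
)
print(response.choices[0].message.content)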

Pricing Range
Input (per 1M tokens): $0.15
Output (per 1M tokens): $0.15
Providers: 1
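With input and output both priced at $0.15 per 1M tokens, the cost of a request is simply the token counts scaled by that rate. A minimal sketch (the token counts below are illustrative, not from this page):

PRICE_PER_M_INPUT = 0.15   # USD per 1M input tokens (from the pricing table)
PRICE_PER_M_OUTPUT = 0.15  # USD per 1M output tokens (from the pricing table)

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request at the listed rates."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT + \
           (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# Example: a 120K-token document plus a 1K-token answer (illustrative numbers).
print(round(estimate_cost_usd(120_000, 1_000), 4))  # ~0.0182 USD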
Timeline
Announced: Sep 17, 2024
Released: Sep 17, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance (12 benchmarks)

Average Score: 66.9%
Best Score: 90.7%
High Performers (80%+): 2

Performance Metrics

Max Context Window: 136.2K tokens
Avg Throughput: 0.1 tok/s
Avg Latency: 1 ms

Top Categories

general: 75.5%
vision: 73.9%
roleplay: 68.7%
code: 62.0%
math: 53.0%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

DocVQA

Rank #18 of 26
#15 DeepSeek VL2 Small: 92.3%
#16 Nova Lite: 92.4%
#17 GPT-4o: 92.8%
#18 Pixtral-12B: 90.7%
#19 Llama 3.2 90B Instruct: 90.1%
#20 DeepSeek VL2 Tiny: 88.9%
#21 Llama 3.2 11B Instruct: 88.4%

ChartQA

Rank #17 of 24
#14 Llama 3.2 11B Instruct: 83.4%
#15 DeepSeek VL2 Small: 84.5%
#16 Qwen2.5-Omni-7B: 85.3%
#17 Pixtral-12B: 81.8%
#18 Phi-3.5-vision-instruct: 81.8%
#19 Phi-4-multimodal-instruct: 81.4%
#20 DeepSeek VL2 Tiny: 81.0%

VQAv2

Rank #2 of 3
#1 Pixtral Large: 80.9%
#2 Pixtral-12B: 78.6%
#3 Llama 3.2 90B Instruct: 78.1%

MT-Bench

Rank #10 of 11
#7 Llama 3.1 Nemotron Nano 8B V1: 81.0%
#8 Ministral 8B Instruct: 83.0%
#9 Mistral Small 3 24B Instruct: 83.5%
#10 Pixtral-12B: 76.8%
#11 Llama 3.1 Nemotron 70B Instruct: 9.0%

HumanEval

Rank #51 of 62
#48 Llama 3.1 8B Instruct: 72.6%
#49 Claude 3 Sonnet: 73.0%
#50 Grok-1.5: 74.1%
#51 Pixtral-12B: 72.0%
#52 Gemma 3 4B: 71.3%
#53 Phi-3.5-MoE-instruct: 70.7%
#54 GPT-3.5 Turbo: 68.0%
All Benchmark Results for Pixtral-12B
Complete list of benchmark scores with detailed information
DocVQA (vision, multimodal): raw 0.91, normalized 90.7%, self-reported
ChartQA (general, multimodal): raw 0.82, normalized 81.8%, self-reported
VQAv2 (vision, multimodal): raw 0.79, normalized 78.6%, self-reported
MT-Bench (roleplay, text): raw 76.80, normalized 76.8%, self-reported
HumanEval (code, text): raw 0.72, normalized 72.0%, self-reported
MMLU (general, text): raw 0.69, normalized 69.2%, self-reported
IFEval (code, text): raw 0.61, normalized 61.3%, self-reported
MM-MT-Bench (roleplay, text): raw 60.50, normalized 60.5%, self-reported
MathVista (math, text): raw 0.58, normalized 58.0%, self-reported
MM IF-Eval (code, text): raw 0.53, normalized 52.7%, self-reported
Showing 10 of 12 benchmarks.
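The normalized column above maps every raw score onto a common 0-100% scale: values reported as fractions are multiplied by 100, while values already on a 0-100 scale (such as MT-Bench's 76.80) are kept as-is. A minimal sketch of that convention, inferred from the listed pairs rather than documented anywhere on the page:

def normalize(raw: float) -> float:
    """Map a raw benchmark score onto the 0-100% scale used above.
    Assumes raw values are reported either as fractions (0-1) or as
    percentages/points (0-100); this is an inferred convention."""
    return raw * 100.0 if raw <= 1.0 else raw

print(normalize(0.907))  # DocVQA -> 90.7 (the table rounds the raw value to 0.91)
print(normalize(76.80))  # MT-Bench -> 76.8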