Llama 4 Scout

Multimodal · Zero-eval · #2 TydiQA

by Meta

About

Llama 4 Scout is a multimodal language model developed by Meta. It achieves an average score of 67.3% across 12 benchmarks, with its strongest results on DocVQA (94.4%), MGSM (90.6%), and ChartQA (88.8%). It is particularly strong on vision tasks, averaging 81.9% in that category. With a 20.0M-token context window, it can handle extensive documents and long multi-turn conversations. As a multimodal model, it can process text, images, and other input formats, and it is available through 6 API providers. It was announced and released in April 2025.

Pricing Range
Input (per 1M tokens): $0.08 - $0.18
Output (per 1M tokens): $0.30 - $0.60
Providers: 6
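As an illustration, the pricing range above can be turned into a per-request cost estimate from token counts. This is a minimal sketch: the two rate pairs below are simply the endpoints of the quoted range, and actual billing varies by provider.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_m: float, output_per_m: float) -> float:
    """Estimate API cost in USD for one request.

    Rates are quoted per 1M tokens, as on this page.
    """
    return (input_tokens / 1_000_000) * input_per_m \
         + (output_tokens / 1_000_000) * output_per_m

# Cheapest quoted rates: $0.08 in / $0.30 out per 1M tokens.
low = estimate_cost(10_000, 1_000, 0.08, 0.30)
# Most expensive quoted rates: $0.18 in / $0.60 out per 1M tokens.
high = estimate_cost(10_000, 1_000, 0.18, 0.60)
print(f"${low:.4f} - ${high:.4f} per request")  # → $0.0011 - $0.0024 per request
```

A 10k-token prompt with a 1k-token reply therefore costs roughly a tenth to a quarter of a cent, depending on the provider.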
Timeline
Announced: Apr 5, 2025
Released: Apr 5, 2025
Specifications
Training Tokens: 40.0T
Capabilities
Multimodal
License & Family
License
Llama 4 Community License Agreement
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 12
Average Score: 67.3%
Best Score: 94.4%
High Performers (80%+): 3

Performance Metrics

Max Context Window: 20.0M tokens
Avg Throughput: 214.1 tok/s
Avg Latency: 1 ms
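A rough response-time estimate can be derived from these two figures. This is a sketch under the simplifying assumption that decoding is throughput-bound with a fixed up-front latency; real timings vary by provider, prompt length, and load.

```python
def estimate_response_seconds(output_tokens: int,
                              throughput_tok_s: float = 214.1,
                              latency_s: float = 0.001) -> float:
    """Approximate wall-clock time: fixed latency plus decode time.

    Defaults come from the averages quoted on this page.
    """
    return latency_s + output_tokens / throughput_tok_s

# e.g. a 500-token answer at the page's average throughput
print(f"{estimate_response_seconds(500):.2f} s")  # → 2.34 s
```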

Top Categories

vision: 81.9%
math: 70.5%
general: 66.3%
code: 50.3%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

DocVQA

Rank #8 of 26
#5 Mistral Small 3.2 24B Instruct: 94.9%
#6 Qwen2.5 VL 32B Instruct: 94.8%
#7 Llama 4 Maverick: 94.4%
#8 Llama 4 Scout: 94.4%
#9 Grok-2: 93.6%
#10 Nova Pro: 93.5%
#11 DeepSeek VL2: 93.3%

MGSM

Rank #8 of 31
#5 Llama 3.3 70B Instruct: 91.1%
#6 o1-preview: 90.8%
#7 Claude 3 Opus: 90.7%
#8 Llama 4 Scout: 90.6%
#9 GPT-4o: 90.5%
#10 o1: 89.3%
#11 GPT-4 Turbo: 88.5%

ChartQA

Rank #5 of 24
#2 Llama 4 Maverick: 90.0%
#3 Qwen2.5 VL 72B Instruct: 89.5%
#4 Nova Pro: 89.2%
#5 Llama 4 Scout: 88.8%
#6 Qwen2-VL-72B-Instruct: 88.3%
#7 Pixtral Large: 88.1%
#8 Mistral Small 3.2 24B Instruct: 87.4%

MMLU

Rank #47 of 78
#44 Llama 3.1 Nemotron 70B Instruct: 80.2%
#45 GPT-4.1 nano: 80.1%
#46 Qwen2.5 14B Instruct: 79.7%
#47 Llama 4 Scout: 79.6%
#48 Claude 3 Sonnet: 79.0%
#49 Gemini 1.5 Flash: 78.9%
#50 Phi-3.5-MoE-instruct: 78.9%

MMLU-Pro

Rank #16 of 60
#13 Grok-2: 75.5%
#14 GPT-4o: 74.7%
#15 Phi 4 Reasoning: 74.3%
#16 Llama 4 Scout: 74.3%
#17 Llama 3.1 405B Instruct: 73.3%
#18 GPT-4o: 72.6%
#19 Grok-2 mini: 72.0%
All Benchmark Results for Llama 4 Scout
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Score | Source
DocVQA | vision | multimodal | 94.4% | Self-reported
MGSM | math | text | 90.6% | Self-reported
ChartQA | general | multimodal | 88.8% | Self-reported
MMLU | general | text | 79.6% | Self-reported
MMLU-Pro | general | text | 74.3% | Self-reported
MathVista | math | text | 70.7% | Self-reported
MMMU | vision | multimodal | 69.4% | Self-reported
MBPP | code | text | 67.8% | Self-reported
GPQA | general | text | 57.2% | Self-reported
MATH | math | text | 50.3% | Self-reported
Showing 1 to 10 of 12 benchmarks
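The category averages in "Top Categories" above can be reproduced from the per-benchmark scores. This sketch uses only the 10 benchmarks visible on this page, so it matches the stated vision and math averages exactly; the stated general and code averages also include the 2 benchmarks not shown here.

```python
# (benchmark, category, score %) for the 10 benchmarks listed above
scores = [
    ("DocVQA", "vision", 94.4), ("MGSM", "math", 90.6),
    ("ChartQA", "general", 88.8), ("MMLU", "general", 79.6),
    ("MMLU-Pro", "general", 74.3), ("MathVista", "math", 70.7),
    ("MMMU", "vision", 69.4), ("MBPP", "code", 67.8),
    ("GPQA", "general", 57.2), ("MATH", "math", 50.3),
]

def category_average(category: str) -> float:
    """Mean score (rounded to one decimal) over benchmarks in a category."""
    vals = [s for _, cat, s in scores if cat == category]
    return round(sum(vals) / len(vals), 1)

print(category_average("vision"))  # → 81.9, matching the page's vision average
print(category_average("math"))    # → 70.5, matching the page's math average
```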