
Llama 4 Scout
Multimodal
Zero-eval
#2 TydiQA
by Meta
About
Llama 4 Scout is a multimodal language model developed by Meta. It achieves strong overall performance, with an average score of 67.3% across 12 benchmarks, and does especially well on DocVQA (94.4%), MGSM (90.6%), and ChartQA (88.8%). Its strongest category is vision, where it averages 81.9%. With a 20.0M token context window, it can handle extensive documents and complex multi-turn conversations, and it is available through 6 API providers. As a multimodal model, it can process text, images, and other input formats. Released in 2025, it represents Meta's latest advancement in AI technology.
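Because the model is multimodal and served by multiple API providers, a typical request mixes text and image inputs. The snippet below is a minimal sketch against an OpenAI-compatible chat-completions endpoint; the base URL, API key, and the model identifier meta-llama/llama-4-scout are placeholder assumptions that vary by provider.

```python
# Minimal sketch of a multimodal request to Llama 4 Scout through an
# OpenAI-compatible endpoint. The base_url, api_key, and model id below are
# placeholder assumptions; substitute the values from your chosen provider.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical provider endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # hypothetical model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the chart in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```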
Pricing Range
Input (per 1M tokens): $0.08 - $0.18
Output (per 1M tokens): $0.30 - $0.60
Providers: 6
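To put the per-1M-token pricing in concrete terms, the sketch below estimates the cost range of a single request. The prices are the ranges listed above; the token counts are illustrative assumptions, and actual rates depend on the provider.

```python
# Estimate the cost range of one request from the per-1M-token price ranges above.
# The token counts used in the example are illustrative assumptions.
INPUT_PRICE_RANGE = (0.08, 0.18)    # USD per 1M input tokens
OUTPUT_PRICE_RANGE = (0.30, 0.60)   # USD per 1M output tokens

def cost_range(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    low = (input_tokens * INPUT_PRICE_RANGE[0] + output_tokens * OUTPUT_PRICE_RANGE[0]) / 1_000_000
    high = (input_tokens * INPUT_PRICE_RANGE[1] + output_tokens * OUTPUT_PRICE_RANGE[1]) / 1_000_000
    return low, high

# Example: a 50k-token document summarized into 1k tokens of output.
low, high = cost_range(50_000, 1_000)
print(f"${low:.4f} - ${high:.4f} per request")  # roughly $0.0043 - $0.0096
```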
Timeline
Announced: Apr 5, 2025
Released: Apr 5, 2025
Specifications
Training Tokens: 40.0T
Capabilities
Multimodal
License & Family
License
Llama 4 Community License Agreement
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
12 benchmarks
Average Score: 67.3%
Best Score: 94.4%
High Performers (80%+): 3
Performance Metrics
Max Context Window: 20.0M
Avg Throughput: 214.1 tok/s
Avg Latency: 1ms
Top Categories
vision: 81.9%
math: 70.5%
general: 66.3%
code: 50.3%
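The category breakdown above is consistent with an unweighted mean of the normalized benchmark scores listed at the bottom of this page. With only the 10 benchmarks shown there, the vision (81.9%) and math (70.5%) figures reproduce exactly, while the general, code, and overall averages also depend on the two unlisted benchmarks. A minimal sketch of that calculation, assuming a simple arithmetic mean:

```python
# Recompute category and overall averages as unweighted means of the normalized
# benchmark scores listed at the bottom of this page. Only 10 of the 12
# benchmarks appear there, so categories that include an unlisted benchmark
# (and the overall average) will not match the page's figures exactly.
from statistics import mean

scores = {
    "DocVQA":    ("vision",  94.4),
    "MGSM":      ("math",    90.6),
    "ChartQA":   ("general", 88.8),
    "MMLU":      ("general", 79.6),
    "MMLU-Pro":  ("general", 74.3),
    "MathVista": ("math",    70.7),
    "MMMU":      ("vision",  69.4),
    "MBPP":      ("code",    67.8),
    "GPQA":      ("general", 57.2),
    "MATH":      ("math",    50.3),
}

by_category: dict[str, list[float]] = {}
for category, score in scores.values():
    by_category.setdefault(category, []).append(score)

for category, values in sorted(by_category.items()):
    print(f"{category}: {mean(values):.1f}%")          # vision -> 81.9, math -> 70.5
print(f"overall (10 listed benchmarks): {mean(v for _, v in scores.values()):.1f}%")
```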
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DocVQA
Rank #8 of 26
#5 Llama 4 Maverick: 94.4%
#6 Qwen2.5 VL 32B Instruct: 94.8%
#7 Mistral Small 3.2 24B Instruct: 94.9%
#8 Llama 4 Scout: 94.4%
#9 Grok-2: 93.6%
#10 Nova Pro: 93.5%
#11 DeepSeek VL2: 93.3%
MGSM
Rank #8 of 31
#5 Claude 3 Opus: 90.7%
#6 o1-preview: 90.8%
#7 Llama 3.3 70B Instruct: 91.1%
#8 Llama 4 Scout: 90.6%
#9 GPT-4o: 90.5%
#10 o1: 89.3%
#11 GPT-4 Turbo: 88.5%
ChartQA
Rank #5 of 24
#2 Nova Pro: 89.2%
#3 Qwen2.5 VL 72B Instruct: 89.5%
#4 Llama 4 Maverick: 90.0%
#5 Llama 4 Scout: 88.8%
#6 Qwen2-VL-72B-Instruct: 88.3%
#7 Pixtral Large: 88.1%
#8 Mistral Small 3.2 24B Instruct: 87.4%
MMLU
Rank #47 of 78
#44 Qwen2.5 14B Instruct: 79.7%
#45 GPT-4.1 nano: 80.1%
#46 Llama 3.1 Nemotron 70B Instruct: 80.2%
#47 Llama 4 Scout: 79.6%
#48 Claude 3 Sonnet: 79.0%
#49 Gemini 1.5 Flash: 78.9%
#50 Phi-3.5-MoE-instruct: 78.9%
MMLU-Pro
Rank #16 of 60
#13 Phi 4 Reasoning: 74.3%
#14 GPT-4o: 74.7%
#15 Grok-2: 75.5%
#16 Llama 4 Scout: 74.3%
#17 Llama 3.1 405B Instruct: 73.3%
#18 GPT-4o: 72.6%
#19 Grok-2 mini: 72.0%
All Benchmark Results for Llama 4 Scout
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
DocVQA | vision | multimodal | 0.94 | 94.4% | Self-reported
MGSM | math | text | 0.91 | 90.6% | Self-reported
ChartQA | general | multimodal | 0.89 | 88.8% | Self-reported
MMLU | general | text | 0.80 | 79.6% | Self-reported
MMLU-Pro | general | text | 0.74 | 74.3% | Self-reported
MathVista | math | text | 0.71 | 70.7% | Self-reported
MMMU | vision | multimodal | 0.69 | 69.4% | Self-reported
MBPP | code | text | 0.68 | 67.8% | Self-reported
GPQA | general | text | 0.57 | 57.2% | Self-reported
MATH | math | text | 0.50 | 50.3% | Self-reported
Showing 10 of 12 benchmarks
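If you want to reuse these results programmatically, the following sketch parses pipe-delimited rows in the format of the table above into records and sanity-checks that the normalized value is simply the raw score expressed as a percentage. The column order and parsing logic are assumptions based on how the table reads here, and the sample rows are copied from it.

```python
# Parse pipe-delimited benchmark rows (benchmark | category | modality |
# raw score | normalized | source) into structured records and check that the
# normalized 0-100% value matches the raw score expressed as a percentage.
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    benchmark: str
    category: str
    modality: str
    raw_score: float
    normalized: float  # percent
    source: str

ROWS = """\
DocVQA | vision | multimodal | 0.94 | 94.4% | Self-reported
MBPP | code | text | 0.68 | 67.8% | Self-reported
MATH | math | text | 0.50 | 50.3% | Self-reported"""

def parse_row(line: str) -> BenchmarkResult:
    name, category, modality, raw, normalized, source = [f.strip() for f in line.split("|")]
    return BenchmarkResult(name, category, modality, float(raw),
                           float(normalized.rstrip("%")), source)

results = [parse_row(line) for line in ROWS.splitlines()]
for r in results:
    # Sanity check: normalized value is the raw score as a percentage (rounded).
    assert abs(r.raw_score * 100 - r.normalized) < 1.0
    print(f"{r.benchmark}: {r.normalized:.1f}% ({r.category}, {r.modality})")
```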