
Llama 4 Maverick
Multimodal
Zero-eval
#1 MGSM
#1 TydiQA
#2 ChartQA
+1 more
by Meta
About
Llama 4 Maverick is a multimodal language model developed by Meta. It averages 71.8% across 13 benchmarks, with its strongest results on DocVQA (94.4%), MGSM (92.3%), and ChartQA (90.0%), and it is strongest overall in vision tasks, averaging 75.8%. Its 2.0M-token context window can handle extensive documents and complex multi-turn conversations. The model is available through 7 API providers and, as a multimodal model, can process and understand text, images, and other input formats. Released in 2025, it represents Meta's latest advancement in AI technology.
Pricing Range
Input (per 1M): $0.17 – $0.63
Output (per 1M): $0.60 – $1.79
Providers: 7
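Per-1M-token pricing means a request's cost is simply tokens / 1,000,000 × rate, computed separately for input and output. A minimal sketch using the low and high ends of the quoted range (the token counts are illustrative; any specific provider's rate is an assumption somewhere in between):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one request, given per-1M-token rates."""
    return input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate

# Cheapest and most expensive quoted rates for Llama 4 Maverick:
low = request_cost(10_000, 2_000, 0.17, 0.60)   # lowest-cost provider
high = request_cost(10_000, 2_000, 0.63, 1.79)  # highest-cost provider
print(f"${low:.4f} to ${high:.4f} per request")
```

For a 10K-input / 2K-output request this works out to roughly $0.0029 at the cheapest provider and $0.0099 at the most expensive, a ~3.4× spread.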
Timeline
Announced: Apr 5, 2025
Released: Apr 5, 2025
Specifications
Training Tokens: 22.0T
Capabilities
Multimodal
License & Family
License
Llama 4 Community License Agreement
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
13 benchmarks
Average Score: 71.8%
Best Score: 94.4%
High Performers (80%+): 5
Performance Metrics
Max Context Window: 2.0M
Avg Throughput: 193.4 tok/s
Avg Latency: 1 ms
Top Categories
vision: 75.8%
math: 75.7%
general: 71.5%
code: 60.5%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DocVQA
Rank #7 of 26
#4 Claude 3.5 Sonnet
95.2%
#5 Mistral Small 3.2 24B Instruct
94.9%
#6 Qwen2.5 VL 32B Instruct
94.8%
#7 Llama 4 Maverick
94.4%
#8 Llama 4 Scout
94.4%
#9 Grok-2
93.6%
#10 Nova Pro
93.5%
MGSM
Rank #1 of 31
#1 Llama 4 Maverick
92.3%
#2 o3-mini
92.0%
#3 Claude 3.5 Sonnet
91.6%
#4 Claude 3.5 Sonnet
91.6%
ChartQA
Rank #2 of 24
#1 Claude 3.5 Sonnet
90.8%
#2 Llama 4 Maverick
90.0%
#3 Qwen2.5 VL 72B Instruct
89.5%
#4 Nova Pro
89.2%
#5 Llama 4 Scout
88.8%
MMLU
Rank #28 of 78
#25 Nova Pro
85.9%
#26 Gemini 1.5 Pro
85.9%
#27 GPT-4o
85.7%
#28 Llama 4 Maverick
85.5%
#29 o1-mini
85.2%
#30 Phi 4
84.8%
#31 Mistral Large 2
84.0%
MMLU-Pro
Rank #6 of 60
#3 Qwen3-235B-A22B-Instruct-2507
83.0%
#4 DeepSeek-V3 0324
81.2%
#5 Kimi K2 Instruct
81.1%
#6 Llama 4 Maverick
80.5%
#7 Claude 3.5 Sonnet
77.6%
#8 Gemini 2.0 Flash
76.4%
#9 Claude 3.5 Sonnet
76.1%
All Benchmark Results for Llama 4 Maverick
Complete list of benchmark scores with detailed information
| Benchmark | Category | Modality | Raw Score | Normalized | Source |
|---|---|---|---|---|---|
| DocVQA | vision | multimodal | 0.94 | 94.4% | Self-reported |
| MGSM | math | text | 0.92 | 92.3% | Self-reported |
| ChartQA | general | multimodal | 0.90 | 90.0% | Self-reported |
| MMLU | general | text | 0.85 | 85.5% | Self-reported |
| MMLU-Pro | general | text | 0.81 | 80.5% | Self-reported |
| MBPP | code | text | 0.78 | 77.6% | Self-reported |
| MathVista | math | text | 0.74 | 73.7% | Self-reported |
| MMMU | vision | multimodal | 0.73 | 73.4% | Self-reported |
| GPQA | general | text | 0.70 | 69.8% | Self-reported |
| MATH | math | text | 0.61 | 61.2% | Self-reported |
Showing 1 to 10 of 13 benchmarks
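The per-category averages reported on this page are simple means of the normalized scores within each category. A minimal sketch of that aggregation over the ten rows shown above (note: three of the 13 benchmarks are not listed here, so only the math category, whose three benchmarks are all visible, reproduces the page's figure exactly; the other partial means will differ):

```python
from collections import defaultdict

# (benchmark, category, normalized score %) for the ten rows shown;
# the remaining three benchmarks are omitted on this page.
rows = [
    ("DocVQA", "vision", 94.4), ("MGSM", "math", 92.3),
    ("ChartQA", "general", 90.0), ("MMLU", "general", 85.5),
    ("MMLU-Pro", "general", 80.5), ("MBPP", "code", 77.6),
    ("MathVista", "math", 73.7), ("MMMU", "vision", 73.4),
    ("GPQA", "general", 69.8), ("MATH", "math", 61.2),
]

by_cat = defaultdict(list)
for _, cat, score in rows:
    by_cat[cat].append(score)

# Mean normalized score per category; math comes out to 75.7%,
# matching the page since all three math benchmarks are listed.
averages = {cat: sum(s) / len(s) for cat, s in by_cat.items()}
for cat, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{cat}: {avg:.1f}%")
```

The same grouping over all 13 rows would reproduce the vision (75.8%), general (71.5%), and code (60.5%) figures as well.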