Magistral Medium

Multimodal

by Mistral AI

About

Magistral Medium is a multimodal language model developed by Mistral AI. It posts competitive results across the 6 benchmarks reported here, with notable strengths on AIME 2024 (73.6%), GPQA (70.8%), and AIME 2025 (64.9%). As a multimodal model, it can process text, images, and other input formats. Its Apache 2.0 license permits commercial use, making it suitable for enterprise applications. It was released in June 2025.

Timeline
Announced: Jun 10, 2025
Released: Jun 10, 2025
Knowledge Cutoff: Jun 1, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

6 benchmarks
Average Score
52.6%
Best Score
73.6%
High Performers (80%+)
0

Top Categories

general
53.1%
code
50.3%
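The summary figures above are plain arithmetic over the six self-reported scores listed at the bottom of this card. A minimal sketch (benchmark names and values copied from the card; the category split assumes LiveCodeBench is the card's only "code" benchmark and the rest are "general"):

```python
# Scores (percent) as listed in the card's full results table.
scores = {
    "AIME 2024": 73.6,
    "GPQA": 70.8,
    "AIME 2025": 64.9,
    "LiveCodeBench": 50.3,
    "Aider-Polyglot": 47.1,
    "Humanity's Last Exam": 9.0,
}
# Assumed category assignment: only LiveCodeBench is tagged "code".
categories = {"LiveCodeBench": "code"}

average = round(sum(scores.values()) / len(scores), 1)        # 52.6
best = max(scores.values())                                   # 73.6
high_performers = sum(1 for s in scores.values() if s >= 80)  # 0

general = [s for b, s in scores.items() if categories.get(b, "general") == "general"]
code = [s for b, s in scores.items() if categories.get(b) == "code"]
general_avg = round(sum(general) / len(general), 1)           # 53.1
code_avg = round(sum(code) / len(code), 1)                    # 50.3
```

Note that the "code" category average equals the LiveCodeBench score alone, since the card files Aider-Polyglot under "general".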
Ranking Across Benchmarks
Position relative to other models on each benchmark

AIME 2024

Rank #28 of 41
#25 o1: 74.3%
#26 Phi 4 Reasoning: 75.3%
#27 Kimi-k1.5: 77.5%
#28 Magistral Medium: 73.6%
#29 Gemini 2.0 Flash Thinking: 73.3%
#30 Magistral Small 2506: 70.7%
#31 Kimi K2 Instruct: 69.6%

GPQA

Rank #28 of 115
#25 GPT-5 nano: 71.2%
#26 DeepSeek-R1: 71.5%
#27 GPT OSS 120B: 71.5%
#28 Magistral Medium: 70.8%
#29 GPT-4o: 70.1%
#30 Llama 4 Maverick: 69.8%
#31 GPT-4.5: 69.5%

AIME 2025

Rank #22 of 36
#19 Claude Sonnet 4: 70.5%
#20 Qwen3 30B A3B: 70.9%
#21 Gemini 2.5 Flash: 72.0%
#22 Magistral Medium: 64.9%
#23 Phi 4 Reasoning: 62.9%
#24 Magistral Small 2506: 62.8%
#25 Llama-3.3 Nemotron Super 49B v1: 58.4%

LiveCodeBench

Rank #20 of 44
#17 Magistral Small 2506: 51.3%
#18 DeepSeek R1 Distill Qwen 14B: 53.1%
#19 Phi 4 Reasoning Plus: 53.1%
#20 Magistral Medium: 50.3%
#21 DeepSeek R1 Zero: 50.0%
#22 QwQ-32B-Preview: 50.0%
#23 DeepSeek-V3 0324: 49.2%

Aider-Polyglot

Rank #14 of 18
#11 DeepSeek-V3: 49.6%
#12 GPT-4.1: 51.6%
#13 DeepSeek-R1: 53.3%
#14 Magistral Medium: 47.1%
#15 GPT-4.1 mini: 34.7%
#16 GPT-4o: 30.7%
#17 Gemini 2.5 Flash-Lite: 26.7%
All Benchmark Results for Magistral Medium
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
AIME 2024 | general | text | 0.74 | 73.6% | Self-reported
GPQA | general | text | 0.71 | 70.8% | Self-reported
AIME 2025 | general | text | 0.65 | 64.9% | Self-reported
LiveCodeBench | code | text | 0.50 | 50.3% | Self-reported
Aider-Polyglot | general | text | 0.47 | 47.1% | Self-reported
Humanity's Last Exam | general | text | 0.09 | 9.0% | Self-reported