Magistral Medium

Multimodal

by Mistral AI

About

Magistral Medium is a multimodal language model developed by Mistral AI. It posts competitive results across the 6 benchmarks reported here, with notable strengths on AIME 2024 (73.6%), GPQA (70.8%), and AIME 2025 (64.9%). As a multimodal model, it can process text, images, and other input formats. Its Apache 2.0 license permits commercial use, making it suitable for enterprise applications. It was released in June 2025.

Timeline
Announced: Jun 10, 2025
Released: Jun 10, 2025
Knowledge Cutoff: Jun 1, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

6 benchmarks
Average Score
52.6%
Best Score
73.6%
High Performers (80%+)
0

Top Categories

general
53.1%
code
50.3%
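The summary figures above are plain arithmetic over the six self-reported scores listed at the bottom of this card. A minimal sketch (benchmark names and values copied from the card; the category split assumes LiveCodeBench is the card's only "code" benchmark and the rest are "general"):

```python
# Scores (percent) as listed in the card's full results table.
scores = {
    "AIME 2024": 73.6,
    "GPQA": 70.8,
    "AIME 2025": 64.9,
    "LiveCodeBench": 50.3,
    "Aider-Polyglot": 47.1,
    "Humanity's Last Exam": 9.0,
}
# Assumed category assignment: only LiveCodeBench is tagged "code".
categories = {"LiveCodeBench": "code"}

average = round(sum(scores.values()) / len(scores), 1)        # 52.6
best = max(scores.values())                                   # 73.6
high_performers = sum(1 for s in scores.values() if s >= 80)  # 0

general = [s for b, s in scores.items() if categories.get(b, "general") == "general"]
code = [s for b, s in scores.items() if categories.get(b) == "code"]
general_avg = round(sum(general) / len(general), 1)           # 53.1
code_avg = round(sum(code) / len(code), 1)                    # 50.3
```

Note that the "code" category average equals the LiveCodeBench score alone, since the card files Aider-Polyglot under "general".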
Ranking Across Benchmarks
Position relative to other models on each benchmark

AIME 2024

Rank #28 of 41
#25 o1: 74.3%
#26 Phi 4 Reasoning: 75.3%
#27 Kimi-k1.5: 77.5%
#28 Magistral Medium: 73.6%
#29 Gemini 2.0 Flash Thinking: 73.3%
#30 Magistral Small 2506: 70.7%
#31 Kimi K2 Instruct: 69.6%

GPQA

Rank #28 of 115
#25 GPT-5 nano: 71.2%
#26 DeepSeek-R1: 71.5%
#27 GPT OSS 120B: 71.5%
#28 Magistral Medium: 70.8%
#29 GPT-4o: 70.1%
#30 Llama 4 Maverick: 69.8%
#31 GPT-4.5: 69.5%

AIME 2025

Rank #22 of 36
#19 Claude Sonnet 4: 70.5%
#20 Qwen3 30B A3B: 70.9%
#21 Gemini 2.5 Flash: 72.0%
#22 Magistral Medium: 64.9%
#23 Phi 4 Reasoning: 62.9%
#24 Magistral Small 2506: 62.8%
#25 Llama-3.3 Nemotron Super 49B v1: 58.4%

LiveCodeBench

Rank #20 of 44
#17 Magistral Small 2506: 51.3%
#18 DeepSeek R1 Distill Qwen 14B: 53.1%
#19 Phi 4 Reasoning Plus: 53.1%
#20 Magistral Medium: 50.3%
#21 DeepSeek R1 Zero: 50.0%
#22 QwQ-32B-Preview: 50.0%
#23 DeepSeek-V3 0324: 49.2%

Aider-Polyglot

Rank #14 of 18
#11 DeepSeek-V3: 49.6%
#12 GPT-4.1: 51.6%
#13 DeepSeek-R1: 53.3%
#14 Magistral Medium: 47.1%
#15 GPT-4.1 mini: 34.7%
#16 GPT-4o: 30.7%
#17 Gemini 2.5 Flash-Lite: 26.7%
All Benchmark Results for Magistral Medium
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
AIME 2024 | general | text | 0.74 | 73.6% | Self-reported
GPQA | general | text | 0.71 | 70.8% | Self-reported
AIME 2025 | general | text | 0.65 | 64.9% | Self-reported
LiveCodeBench | code | text | 0.50 | 50.3% | Self-reported
Aider-Polyglot | general | text | 0.47 | 47.1% | Self-reported
Humanity's Last Exam | general | text | 0.09 | 9.0% | Self-reported