Mistral Small 3.2 24B Instruct

Name: Mistral Small 3.2 24B Instruct
Average score: 69.8% (15 benchmarks)
Author: Mistral AI

Tags: Multimodal, Zero-eval

Ranked #1 on: HumanEval Plus, IF, MBPP Plus, +1 more

About

Mistral Small 3.2 24B Instruct is a multimodal language model developed by Mistral AI. It averages 69.8% across 15 benchmarks. Its highest scores are on DocVQA (94.9%), AI2D (92.9%), and HumanEval Plus (92.9%). Code is its strongest category, with an average of 85.6%. As a multimodal model, it accepts images as well as text. It is released under the Apache 2.0 license, which permits commercial use. It was released on June 20, 2025.

Timeline

Announced: Jun 20, 2025

Released: Jun 20, 2025

Knowledge Cutoff: Oct 1, 2023

Specifications

Capabilities

Multimodal

License & Family

License: Apache 2.0

Base Model: Mistral Small 3.1 24B Base

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

Benchmarks: 15

Average Score: 69.8%

Best Score: 94.9%

High Performers (80%+)

Top Categories

code: 85.6%
vision: 78.7%
math: 68.3%
general: 64.6%
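The category figures above look like unweighted means of the per-benchmark scores. The following is a minimal Python sketch under that assumption, using only the ten rows listed in the results table on this page. The names `ROWS` and `category_means` are illustrative and not part of any official tooling.

```python
from collections import defaultdict

# (benchmark, category, score in percent), copied from the ten rows this page lists.
# Five of the 15 benchmarks are not listed, so some categories are incomplete.
ROWS = [
    ("DocVQA", "vision", 94.9),
    ("AI2D", "general", 92.9),
    ("HumanEval Plus", "code", 92.9),
    ("ChartQA", "general", 87.4),
    ("IF", "general", 84.8),
    ("MMLU", "general", 80.5),
    ("MBPP Plus", "code", 78.3),
    ("MATH", "math", 69.4),
    ("MMLU-Pro", "general", 69.1),
    ("MathVista", "math", 67.1),
]


def category_means(rows):
    """Return {category: unweighted mean of its percent scores}."""
    grouped = defaultdict(list)
    for _name, category, score in rows:
        grouped[category].append(score)
    return {cat: sum(scores) / len(scores) for cat, scores in grouped.items()}


means = category_means(ROWS)
for cat in sorted(means):
    print(f"{cat}: {means[cat]:.2f}%")
```

Under this assumption, code comes out at 85.6% and math at 68.25%, in line with the 85.6% and 68.3% shown above. The vision and general means computed from the listed rows alone do not match the page, which is expected because five of the 15 benchmarks are not listed.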

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

DocVQA

Rank #5 of 26

#2 Claude 3.5 Sonnet: 95.2%
#3 Qwen2.5-Omni-7B: 95.2%
#4 Qwen2.5 VL 7B Instruct: 95.7%
#5 Mistral Small 3.2 24B Instruct: 94.9%
#6 Qwen2.5 VL 32B Instruct: 94.8%
#7 Llama 4 Maverick: 94.4%
#8 Llama 4 Scout: 94.4%

AI2D

Rank #4 of 17

#1 Pixtral Large: 93.8%
#2 GPT-4o: 94.2%
#3 Claude 3.5 Sonnet: 94.7%
#4 Mistral Small 3.2 24B Instruct: 92.9%
#5 Llama 3.2 90B Instruct: 92.3%
#6 Llama 3.2 11B Instruct: 91.1%
#7 Qwen2.5 VL 72B Instruct: 88.4%

HumanEval Plus

Rank #1 of 1

#1 Mistral Small 3.2 24B Instruct: 92.9%

ChartQA

Rank #8 of 24

#5 Pixtral Large: 88.1%
#6 Qwen2-VL-72B-Instruct: 88.3%
#7 Llama 4 Scout: 88.8%
#8 Mistral Small 3.2 24B Instruct: 87.4%
#9 Qwen2.5 VL 7B Instruct: 87.3%
#10 Nova Lite: 86.8%
#11 DeepSeek VL2: 86.0%

IF

Rank #1 of 1

#1 Mistral Small 3.2 24B Instruct: 84.8%

All Benchmark Results for Mistral Small 3.2 24B Instruct

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score (0-1)	Score (%)	Source
DocVQA	vision	multimodal	0.95	94.9%	Self-reported
AI2D	general	text	0.93	92.9%	Self-reported
HumanEval Plus	code	text	0.93	92.9%	Self-reported
ChartQA	general	multimodal	0.87	87.4%	Self-reported
IF	general	text	0.85	84.8%	Self-reported
MMLU	general	text	0.81	80.5%	Self-reported
MBPP Plus	code	text	0.78	78.3%	Self-reported
MATH	math	text	0.69	69.4%	Self-reported
MMLU-Pro	general	text	0.69	69.1%	Self-reported
MathVista	math	text	0.67	67.1%	Self-reported

Showing 1 to 10 of 15 benchmarks
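
Because only 10 of the 15 scores are listed, the overall average cannot be recomputed directly. As a rough consistency check, the sketch below estimates what the five unlisted scores would have to average, assuming the reported 69.8% is an unweighted mean over all 15 percentage scores. That assumption is not confirmed by this page, and `implied_hidden_mean` is a hypothetical helper.

```python
# Percent scores of the ten benchmarks listed on this page.
VISIBLE = [94.9, 92.9, 92.9, 87.4, 84.8, 80.5, 78.3, 69.4, 69.1, 67.1]
TOTAL_BENCHMARKS = 15   # from "15 benchmarks"
REPORTED_MEAN = 69.8    # reported average score, in percent (already rounded)


def implied_hidden_mean(visible, total_count, reported_mean):
    """Mean the unlisted scores must have for the overall mean to hold."""
    hidden_count = total_count - len(visible)
    if hidden_count <= 0:
        raise ValueError("no unlisted benchmarks to estimate")
    hidden_sum = reported_mean * total_count - sum(visible)
    return hidden_sum / hidden_count


hidden = implied_hidden_mean(VISIBLE, TOTAL_BENCHMARKS, REPORTED_MEAN)
print(f"Implied mean of the 5 unlisted benchmarks: {hidden:.1f}%")
```

Under this assumption the five unlisted benchmarks would average roughly 46%. Because the reported 69.8% is itself rounded, the estimate is only approximate.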
