Mistral Small 3.1 24B Instruct

Multimodal
Zero-eval

by Mistral AI

About

Mistral Small 3.1 24B Instruct is a multimodal language model developed by Mistral AI. It achieves an average score of 64.0% across 9 benchmarks, with its strongest results on HumanEval (88.4%), MMLU (80.6%), and TriviaQA (80.5%), and a notable specialization in code tasks, where it averages 81.6%. As a multimodal model, it accepts both text and image inputs. It is licensed under Apache 2.0, which permits commercial use and makes it suitable for enterprise applications. Released in March 2025, it represents Mistral AI's latest advancement in AI technology.

Timeline
Announced: Mar 17, 2025
Released: Mar 17, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

9 benchmarks
Average Score
64.0%
Best Score
88.4%
High Performers (80%+)
3

Top Categories

code
81.6%
math
69.3%
vision
59.3%
general
56.9%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

HumanEval

Rank #16 of 62
#13 DeepSeek-V2.5: 89.0%
#14 Nova Pro: 89.0%
#15 Llama 3.1 405B Instruct: 89.0%
#16 Mistral Small 3.1 24B Instruct: 88.4%
#17 Llama 3.3 70B Instruct: 88.4%
#18 Grok-2: 88.4%
#19 Qwen2.5-Coder 7B Instruct: 88.4%

MMLU

Rank #40 of 78
#37 Jamba 1.5 Large: 81.2%
#38 Mistral Small 3.1 24B Base: 81.0%
#39 Mistral Small 3 24B Base: 80.7%
#40 Mistral Small 3.1 24B Instruct: 80.6%
#41 Mistral Small 3.2 24B Instruct: 80.5%
#42 Nova Lite: 80.5%
#43 DeepSeek-V2.5: 80.4%

TriviaQA

Rank #4 of 13
#1 Kimi K2 Base: 85.1%
#2 Gemma 2 27B: 83.7%
#3 Mistral Small 3.1 24B Base: 80.5%
#4 Mistral Small 3.1 24B Instruct: 80.5%
#5 Mistral Small 3 24B Base: 80.3%
#6 Granite 3.3 8B Base: 78.2%
#7 Gemma 2 9B: 76.6%

MBPP

Rank #16 of 31
#13 Codestral-22B: 78.2%
#14 Llama 4 Maverick: 77.6%
#15 Gemini Diffusion: 76.0%
#16 Mistral Small 3.1 24B Instruct: 74.7%
#17 Gemma 3 27B: 74.4%
#18 Qwen2.5-Omni-7B: 73.2%
#19 Gemma 3 12B: 73.0%

MATH

Rank #37 of 63
#34 Claude 3.5 Haiku: 69.4%
#35 Mistral Small 3.2 24B Instruct: 69.4%
#36 Nova Micro: 69.3%
#37 Mistral Small 3.1 24B Instruct: 69.3%
#38 Llama 3.2 90B Instruct: 68.0%
#39 Phi 4 Mini: 64.0%
#40 Llama 4 Maverick: 61.2%
All Benchmark Results for Mistral Small 3.1 24B Instruct
Complete list of benchmark scores with detailed information
Benchmark   Category   Modality     Normalized   Score    Source
HumanEval   code       text         0.88         88.4%    Self-reported
MMLU        general    text         0.81         80.6%    Self-reported
TriviaQA    general    text         0.81         80.5%    Self-reported
MBPP        code       text         0.75         74.7%    Self-reported
MATH        math       text         0.69         69.3%    Self-reported
MMLU-Pro    general    text         0.67         66.8%    Self-reported
MMMU        vision     multimodal   0.59         59.3%    Self-reported
GPQA        general    text         0.46         46.0%    Self-reported
SimpleQA    general    text         0.10         10.4%    Self-reported
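The overall average and the "Top Categories" figures are consistent with simple unweighted means of the nine self-reported scores. The sketch below (plain Python, no external data; the dictionary is transcribed from the benchmark results above) reproduces them, assuming that unweighted averaging is indeed how the page aggregates scores.

```python
# Reproduce the summary statistics from the nine self-reported
# benchmark scores. Category labels follow the benchmark results table.
scores = {
    "HumanEval": (88.4, "code"),
    "MMLU": (80.6, "general"),
    "TriviaQA": (80.5, "general"),
    "MBPP": (74.7, "code"),
    "MATH": (69.3, "math"),
    "MMLU-Pro": (66.8, "general"),
    "MMMU": (59.3, "vision"),
    "GPQA": (46.0, "general"),
    "SimpleQA": (10.4, "general"),
}

# Overall average: unweighted mean over all nine benchmarks.
overall = sum(score for score, _ in scores.values()) / len(scores)
print(f"Average score: {overall:.1f}%")  # 64.0% with these figures

# Per-category averages, as listed under "Top Categories".
by_category = {}
for score, category in scores.values():
    by_category.setdefault(category, []).append(score)

for category, vals in sorted(by_category.items()):
    print(f"{category}: {sum(vals) / len(vals):.1f}%")
```

With these inputs, the per-category means come out to code 81.6%, math 69.3%, vision 59.3%, and general 56.9%, matching the category breakdown shown earlier.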