
Mistral Small 3.1 24B Instruct
Multimodal
Zero-eval
by Mistral AI
About
Mistral Small 3.1 24B Instruct is a multimodal language model developed by Mistral AI. It achieves strong overall performance, averaging 64.0% across 9 benchmarks, with standout results on HumanEval (88.4%), MMLU (80.6%), and TriviaQA (80.5%). Its strongest category is code, where it averages 81.6%. As a multimodal model, it can process both text and image inputs. It is released under a license that permits commercial use, making it suitable for enterprise applications. Released in 2025, it represents Mistral AI's latest advancement in AI technology.
Timeline
Announced: Mar 17, 2025
Released: Mar 17, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
9 benchmarks
Average Score
64.0%
Best Score
88.4%
High Performers (80%+): 3
Top Categories
code
81.6%
math
69.3%
vision
59.3%
general
56.9%
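The category averages above can be reproduced as the unweighted mean of the normalized benchmark scores from the full benchmark list on this page. The sketch below mirrors the page's apparent aggregation methodology; it is an assumption about how the site computes these figures, not an official Mistral AI formula.

```python
# Sketch (assumed methodology): per-category averages as the unweighted
# mean of the normalized benchmark scores (in %) listed on this page.
# Category assignments are taken from the benchmark list itself.
from collections import defaultdict

scores = {
    "HumanEval": ("code", 88.4),
    "MBPP": ("code", 74.7),
    "MATH": ("math", 69.3),
    "MMMU": ("vision", 59.3),
    "MMLU": ("general", 80.6),
    "TriviaQA": ("general", 80.5),
    "MMLU-Pro": ("general", 66.8),
    "GPQA": ("general", 46.0),
    "SimpleQA": ("general", 10.4),
}

by_category = defaultdict(list)
for category, score in scores.values():
    by_category[category].append(score)

averages = {c: sum(v) / len(v) for c, v in by_category.items()}
for category, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {avg:.1f}%")
```

Run as-is, this reproduces the four category figures shown above (code 81.6%, math 69.3%, vision 59.3%, general 56.9%).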
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
HumanEval
Rank #16 of 62
#13 DeepSeek-V2.5 (89.0%)
#14 Nova Pro (89.0%)
#15 Llama 3.1 405B Instruct (89.0%)
#16 Mistral Small 3.1 24B Instruct (88.4%)
#17 Llama 3.3 70B Instruct (88.4%)
#18 Grok-2 (88.4%)
#19 Qwen2.5-Coder 7B Instruct (88.4%)
MMLU
Rank #40 of 78
#37 Jamba 1.5 Large (81.2%)
#38 Mistral Small 3.1 24B Base (81.0%)
#39 Mistral Small 3 24B Base (80.7%)
#40 Mistral Small 3.1 24B Instruct (80.6%)
#41 Mistral Small 3.2 24B Instruct (80.5%)
#42 Nova Lite (80.5%)
#43 DeepSeek-V2.5 (80.4%)
TriviaQA
Rank #4 of 13
#1 Kimi K2 Base (85.1%)
#2 Gemma 2 27B (83.7%)
#3 Mistral Small 3.1 24B Base (80.5%)
#4 Mistral Small 3.1 24B Instruct (80.5%)
#5 Mistral Small 3 24B Base (80.3%)
#6 Granite 3.3 8B Base (78.2%)
#7 Gemma 2 9B (76.6%)
MBPP
Rank #16 of 31
#13 Codestral-22B (78.2%)
#14 Llama 4 Maverick (77.6%)
#15 Gemini Diffusion (76.0%)
#16 Mistral Small 3.1 24B Instruct (74.7%)
#17 Gemma 3 27B (74.4%)
#18 Qwen2.5-Omni-7B (73.2%)
#19 Gemma 3 12B (73.0%)
MATH
Rank #37 of 63
#34 Claude 3.5 Haiku (69.4%)
#35 Mistral Small 3.2 24B Instruct (69.4%)
#36 Nova Micro (69.3%)
#37 Mistral Small 3.1 24B Instruct (69.3%)
#38 Llama 3.2 90B Instruct (68.0%)
#39 Phi 4 Mini (64.0%)
#40 Llama 4 Maverick (61.2%)
All Benchmark Results for Mistral Small 3.1 24B Instruct
Complete list of benchmark scores with detailed information
| Benchmark | Category | Modality | Score | Source |
|---|---|---|---|---|
| HumanEval | code | text | 88.4% | Self-reported |
| MMLU | general | text | 80.6% | Self-reported |
| TriviaQA | general | text | 80.5% | Self-reported |
| MBPP | code | text | 74.7% | Self-reported |
| MATH | math | text | 69.3% | Self-reported |
| MMLU-Pro | general | text | 66.8% | Self-reported |
| MMMU | vision | multimodal | 59.3% | Self-reported |
| GPQA | general | text | 46.0% | Self-reported |
| SimpleQA | general | text | 10.4% | Self-reported |
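As a sanity check, the 64.0% overall figure quoted at the top of the page is simply the unweighted mean of the nine normalized scores listed above. This sketch assumes plain averaging, which matches the reported number exactly:

```python
# Sketch (assumed aggregation): overall average as the unweighted mean
# of the nine normalized benchmark scores (in %) listed above.
scores = [88.4, 80.6, 80.5, 74.7, 69.3, 66.8, 59.3, 46.0, 10.4]
average = sum(scores) / len(scores)
print(f"Average score: {average:.1f}%")  # Average score: 64.0%
```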