
Mistral Small 3 24B Base
Text-only
Zero-eval
#1 AGIEval
by Mistral AI
About
Mistral Small 3 24B Base is a 24-billion-parameter base language model developed by Mistral AI. It achieves strong overall performance, with an average score of 67.0% across 9 benchmarks, and scores highest on ARC-C (91.3%), GSM8k (80.7%), and MMLU (80.7%). It is a text-only model, and all benchmarks reported on this page are text-based. Released under the Apache 2.0 license, it is suitable for commercial and enterprise applications. It was announced and released on January 30, 2025.
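A minimal usage sketch follows for reference. The Hugging Face repository id and the dtype/device settings are assumptions made for illustration (they do not come from this page), so verify them against Mistral AI's official model card before use.

```python
# Minimal sketch: loading the base model with Hugging Face Transformers.
# The repository id, dtype, and device settings below are assumptions,
# not taken from this page; check Mistral AI's official model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-24B-Base-2501"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let Transformers pick a suitable precision
    device_map="auto",    # requires `accelerate`; shards across available GPUs
)

# This is a base (non-instruct) model, so prompt it as plain text completion.
prompt = "Mistral Small 3 is a 24B-parameter language model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```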
Timeline
Announced: Jan 30, 2025
Released: Jan 30, 2025
Knowledge Cutoff: Oct 1, 2023
Specifications
Capabilities
Text
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance (9 benchmarks)
Average Score: 67.0%
Best Score: 91.3%
High Performers (80%+): 4

Top Categories
reasoning: 91.3%
code: 67.7%
math: 63.4%
general: 62.4%
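These category scores are consistent with taking an unweighted mean of the per-benchmark results listed under "All Benchmark Results" below; the sketch that follows reproduces them under that assumption.

```python
# Sketch: reproduce the category breakdown above as an unweighted mean of
# the per-benchmark scores from "All Benchmark Results" (assumed method).
from collections import defaultdict
from statistics import mean

results = {  # benchmark: (category, normalized score in %)
    "ARC-C":    ("reasoning", 91.3),
    "GSM8k":    ("math",      80.7),
    "MMLU":     ("general",   80.7),
    "TriviaQA": ("general",   80.3),
    "MBPP":     ("code",      69.6),
    "AGIEval":  ("code",      65.8),
    "MMLU-Pro": ("general",   54.4),
    "MATH":     ("math",      46.0),
    "GPQA":     ("general",   34.4),
}

by_category = defaultdict(list)
for category, score in results.values():
    by_category[category].append(score)

for category, scores in sorted(by_category.items(), key=lambda kv: -mean(kv[1])):
    print(f"{category}: {mean(scores):.2f}%")
# -> reasoning 91.30, code 67.70, math 63.35, general 62.45
#    (consistent with the rounded figures shown above)
```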
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
ARC-C (Rank #8 of 31)
#5 Nova Lite: 92.4%
#6 Jamba 1.5 Large: 93.0%
#7 Claude 3 Sonnet: 93.2%
#8 Mistral Small 3 24B Base: 91.3%
#9 Phi-3.5-MoE-instruct: 91.0%
#10 Nova Micro: 90.2%
#11 Claude 3 Haiku: 89.2%
GSM8k (Rank #38 of 46)
#35 Granite 3.3 8B Instruct: 80.9%
#36 Qwen2 7B Instruct: 82.3%
#37 Qwen2.5-Coder 7B Instruct: 83.9%
#38 Mistral Small 3 24B Base: 80.7%
#39 Llama 3.2 3B Instruct: 77.7%
#40 Jamba 1.5 Mini: 75.8%
#41 Gemma 2 27B: 74.0%
MMLU (Rank #39 of 78)
#36 Mistral Small 3.1 24B Base: 81.0%
#37 Jamba 1.5 Large: 81.2%
#38 Grok-1.5: 81.3%
#39 Mistral Small 3 24B Base: 80.7%
#40 Mistral Small 3.1 24B Instruct: 80.6%
#41 Mistral Small 3.2 24B Instruct: 80.5%
#42 Nova Lite: 80.5%
TriviaQA (Rank #5 of 13)
#2 Mistral Small 3.1 24B Instruct: 80.5%
#3 Mistral Small 3.1 24B Base: 80.5%
#4 Gemma 2 27B: 83.7%
#5 Mistral Small 3 24B Base: 80.3%
#6 Granite 3.3 8B Base: 78.2%
#7 Gemma 2 9B: 76.6%
#8 Mistral NeMo Instruct: 73.8%
MBPP (Rank #20 of 31)
#17 Gemma 3 12B: 73.0%
#18 Qwen2.5-Omni-7B: 73.2%
#19 Gemma 3 27B: 74.4%
#20 Mistral Small 3 24B Base: 69.6%
#21 Phi-3.5-mini-instruct: 69.6%
#22 Llama 4 Scout: 67.8%
#23 Qwen2 7B Instruct: 67.2%
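The "Rank #r of n" figures above can also be read as a rough percentile of the ranked field. The formula in the sketch below is an illustrative choice, not something this page defines.

```python
# Sketch: turn the "Rank #r of n" figures above into a rough "top X%"
# summary. The rank/total percentile formula is an illustrative choice,
# not something defined on this page.
ranks = {
    "ARC-C":    (8, 31),
    "GSM8k":    (38, 46),
    "MMLU":     (39, 78),
    "TriviaQA": (5, 13),
    "MBPP":     (20, 31),
}

for benchmark, (rank, total) in ranks.items():
    print(f"{benchmark}: rank {rank}/{total} -> top {rank / total:.0%}")
# e.g. ARC-C: rank 8/31 -> top 26%; MMLU: rank 39/78 -> top 50%
```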
All Benchmark Results for Mistral Small 3 24B Base
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
ARC-C | reasoning | text | 0.91 | 91.3% | Self-reported
GSM8k | math | text | 0.81 | 80.7% | Self-reported
MMLU | general | text | 0.81 | 80.7% | Self-reported
TriviaQA | general | text | 0.80 | 80.3% | Self-reported
MBPP | code | text | 0.70 | 69.6% | Self-reported
AGIEval | code | text | 0.66 | 65.8% | Self-reported
MMLU-Pro | general | text | 0.54 | 54.4% | Self-reported
MATH | math | text | 0.46 | 46.0% | Self-reported
GPQA | general | text | 0.34 | 34.4% | Self-reported
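The headline numbers in the overview (average 67.0%, best 91.3%, and four benchmarks at 80% or above) follow from this table; the short check below assumes the reported average is a plain unweighted mean of the nine normalized scores.

```python
# Sketch: recompute the overview statistics from the table above, assuming
# the reported average is a plain unweighted mean of the nine scores.
scores = {
    "ARC-C": 91.3, "GSM8k": 80.7, "MMLU": 80.7, "TriviaQA": 80.3, "MBPP": 69.6,
    "AGIEval": 65.8, "MMLU-Pro": 54.4, "MATH": 46.0, "GPQA": 34.4,
}

average = sum(scores.values()) / len(scores)
best = max(scores.values())
high_performers = sum(1 for s in scores.values() if s >= 80.0)

print(f"Average score: {average:.1f}%")               # 67.0%
print(f"Best score: {best:.1f}%")                     # 91.3%
print(f"High performers (80%+): {high_performers}")   # 4
```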
Resources