
Mistral NeMo Instruct
Zero-eval
#1 CommonSenseQA
#2 Natural Questions
by Mistral AI
About
Mistral NeMo Instruct is a language model developed by Mistral AI. It achieves an average score of 64.3% across 8 benchmarks, with its strongest results in HellaSwag (83.5%), Winogrande (76.8%), and TriviaQA (73.8%). The model is strongest in reasoning tasks, where it averages 80.2%. It supports a 256K token context window for handling large documents and is available through 2 API providers. Licensed under Apache 2.0, it permits commercial use and is suitable for enterprise applications. It was announced and released on July 18, 2024.
Pricing Range
Input (per 1M): $0.15
Output (per 1M): $0.15
Providers: 2
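Since both providers list the same flat rate for input and output, per-request cost is simple arithmetic. A minimal sketch (rates hardcoded from the pricing table above, not fetched from any API):

```python
# Flat rates from the pricing table: $0.15 per 1M tokens, input and output.
INPUT_PER_M = 0.15
OUTPUT_PER_M = 0.15

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
print(round(estimate_cost(2_000, 500), 6))  # 0.000375
```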
Timeline
Announced: Jul 18, 2024
Released: Jul 18, 2024
Specifications
License & Family
License: Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
8 benchmarks
Average Score: 64.3%
Best Score: 83.5%
High Performers (80%+): 1
Performance Metrics
Max Context Window: 256.0K
Avg Throughput: 21.1 tok/s
Avg Latency: 0 ms
Top Categories
reasoning: 80.2%
general: 60.8%
factuality: 50.3%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
HellaSwag
Rank #13 of 24
#10 Phi-3.5-MoE-instruct: 83.8%
#11 Qwen2.5 32B Instruct: 85.2%
#12 Llama 3.1 Nemotron 70B Instruct: 85.6%
#13 Mistral NeMo Instruct: 83.5%
#14 Qwen2.5-Coder 32B Instruct: 83.0%
#15 Gemma 2 9B: 81.9%
#16 Granite 3.3 8B Base: 80.1%
Winogrande
Rank #10 of 19
#7 Gemma 2 9B: 80.6%
#8 Qwen2.5-Coder 32B Instruct: 80.8%
#9 Phi-3.5-MoE-instruct: 81.3%
#10 Mistral NeMo Instruct: 76.8%
#11 Ministral 8B Instruct: 75.3%
#12 Granite 3.3 8B Base: 74.4%
#13 Qwen2.5-Coder 7B Instruct: 72.9%
TriviaQA
Rank #8 of 13
#5 Gemma 2 9B: 76.6%
#6 Granite 3.3 8B Base: 78.2%
#7 Mistral Small 3 24B Base: 80.3%
#8 Mistral NeMo Instruct: 73.8%
#9 Gemma 3n E4B: 70.2%
#10 Gemma 3n E4B Instructed LiteRT Preview: 70.2%
#11 Ministral 8B Instruct: 65.5%
CommonSenseQA
Rank #1 of 1
#1 Mistral NeMo Instruct: 70.4%
MMLU
Rank #66 of 78
#63 Phi-3.5-mini-instruct: 69.0%
#64 Pixtral-12B: 69.2%
#65 Llama 3.1 8B Instruct: 69.4%
#66 Mistral NeMo Instruct: 68.0%
#67 Qwen2.5-Coder 7B Instruct: 67.6%
#68 Phi 4 Mini: 67.3%
#69 Granite 3.3 8B Instruct: 65.5%
All Benchmark Results for Mistral NeMo Instruct
Complete list of benchmark scores with detailed information
| Benchmark | Category | Modality | Raw Score | Normalized | Source |
|---|---|---|---|---|---|
| HellaSwag | reasoning | text | 0.83 | 83.5% | Self-reported |
| Winogrande | reasoning | text | 0.77 | 76.8% | Self-reported |
| TriviaQA | general | text | 0.74 | 73.8% | Self-reported |
| CommonSenseQA | general | text | 0.70 | 70.4% | Self-reported |
| MMLU | general | text | 0.68 | 68.0% | Self-reported |
| OpenBookQA | general | text | 0.61 | 60.6% | Self-reported |
| TruthfulQA | factuality | text | 0.50 | 50.3% | Self-reported |
| Natural Questions | general | text | 0.31 | 31.2% | Self-reported |
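The headline averages can be reproduced from this table. A quick sketch verifying the overall score and the per-category breakdown (scores copied from the rows above):

```python
# Benchmark scores and categories copied from the table above.
scores = {
    "HellaSwag": ("reasoning", 83.5),
    "Winogrande": ("reasoning", 76.8),
    "TriviaQA": ("general", 73.8),
    "CommonSenseQA": ("general", 70.4),
    "MMLU": ("general", 68.0),
    "OpenBookQA": ("general", 60.6),
    "TruthfulQA": ("factuality", 50.3),
    "Natural Questions": ("general", 31.2),
}

# Overall average across all 8 benchmarks.
overall = sum(score for _, score in scores.values()) / len(scores)
print(f"Average Score: {overall:.1f}%")  # Average Score: 64.3%

# Per-category averages. Note the page shows reasoning as 80.2%;
# the exact mean of 83.5 and 76.8 is 80.15, so it rounds half up.
by_category: dict[str, list[float]] = {}
for category, score in scores.values():
    by_category.setdefault(category, []).append(score)
for category, values in by_category.items():
    print(f"{category}: {sum(values) / len(values):.2f}%")
```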