Mistral AI

Mistral NeMo Instruct

Zero-eval
#1 CommonSenseQA
#2 Natural Questions

by Mistral AI

About

Mistral NeMo Instruct is an instruction-tuned language model developed by Mistral AI. It averages 64.3% across the 8 benchmarks tracked here, with its strongest scores on HellaSwag (83.5%), Winogrande (76.8%), and TriviaQA (73.8%), and it is particularly strong on reasoning tasks, where it averages 80.2%. The model supports a 256K-token context window for handling large documents and is available through 2 API providers. Its Apache 2.0 license permits commercial use, making it suitable for enterprise applications. It was announced and released on July 18, 2024.
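The two API providers are not named on this page, so the snippet below is only a sketch of how a hosted deployment is typically called: it assumes an OpenAI-compatible chat-completions endpoint, and both the base URL and the model identifier are placeholders rather than values taken from this page.

from openai import OpenAI

# Sketch only: base_url and model id are hypothetical placeholders; substitute
# the values documented by whichever provider you use.
client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mistral-nemo-instruct",  # placeholder id; the exact name varies by provider
    messages=[{"role": "user", "content": "Summarize the Apache 2.0 license in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)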

Pricing Range
Input (per 1M tokens): $0.15 - $0.15
Output (per 1M tokens): $0.15 - $0.15
Providers: 2
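Because the input and output rates are identical at $0.15 per 1M tokens, the cost of a call scales linearly with the total token count. A minimal sketch of the arithmetic (illustrative only, not provider billing code):

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens, from the pricing range above
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output tokens, from the pricing range above

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    # Linear cost: token count times the per-token rate for each direction.
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 1,000-token completion costs about $0.00165.
print(f"${estimate_cost_usd(10_000, 1_000):.5f}")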
Timeline
Announced: Jul 18, 2024
Released: Jul 18, 2024
Specifications
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

8 benchmarks
Average Score
64.3%
Best Score
83.5%
High Performers (80%+)
1

Performance Metrics

Max Context Window
256.0K
Avg Throughput
21.1 tok/s
Avg Latency
0ms
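
A rough wall-clock estimate for a completion follows from the averages above: time is roughly first-token latency plus output tokens divided by throughput. The sketch below simply applies that formula to the listed figures (the 0 ms average latency is taken at face value):

AVG_THROUGHPUT_TOK_S = 21.1  # average throughput listed above, tokens per second
AVG_LATENCY_S = 0.0          # average latency listed above (0 ms)

def estimate_generation_seconds(output_tokens: int) -> float:
    # Time to first token plus steady-state decode time.
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_S

# Example: a 500-token completion takes roughly 23.7 seconds at 21.1 tok/s.
print(f"{estimate_generation_seconds(500):.1f} s")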

Top Categories

reasoning
80.2%
general
60.8%
factuality
50.3%
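
These category figures, along with the overall average, best score, and high-performer count shown above, can be reproduced from the per-benchmark scores and categories in the "All Benchmark Results" table at the bottom of this page. A short sketch of that aggregation:

from collections import defaultdict

# Self-reported scores and categories from the results table below.
scores = {
    "HellaSwag": ("reasoning", 83.5),
    "Winogrande": ("reasoning", 76.8),
    "TriviaQA": ("general", 73.8),
    "CommonSenseQA": ("general", 70.4),
    "MMLU": ("general", 68.0),
    "OpenBookQA": ("general", 60.6),
    "TruthfulQA": ("factuality", 50.3),
    "Natural Questions": ("general", 31.2),
}

overall = sum(s for _, s in scores.values()) / len(scores)
best = max(s for _, s in scores.values())
high_performers = sum(1 for _, s in scores.values() if s >= 80)
print(f"overall {overall:.1f}%, best {best}%, 80%+ count {high_performers}")
# -> overall 64.3%, best 83.5%, 80%+ count 1

by_category = defaultdict(list)
for category, score in scores.values():
    by_category[category].append(score)
for category, vals in by_category.items():
    print(f"{category}: {sum(vals) / len(vals):.2f}%")
# -> reasoning 80.15%, general 60.80%, factuality 50.30%
#    (matching the 80.2% / 60.8% / 50.3% figures above after rounding)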
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

HellaSwag

Rank #13 of 24
#10 Phi-3.5-MoE-instruct: 83.8%
#11 Qwen2.5 32B Instruct: 85.2%
#12 Llama 3.1 Nemotron 70B Instruct: 85.6%
#13 Mistral NeMo Instruct: 83.5%
#14 Qwen2.5-Coder 32B Instruct: 83.0%
#15 Gemma 2 9B: 81.9%
#16 Granite 3.3 8B Base: 80.1%

Winogrande

Rank #10 of 19
#7 Gemma 2 9B: 80.6%
#8 Qwen2.5-Coder 32B Instruct: 80.8%
#9 Phi-3.5-MoE-instruct: 81.3%
#10 Mistral NeMo Instruct: 76.8%
#11 Ministral 8B Instruct: 75.3%
#12 Granite 3.3 8B Base: 74.4%
#13 Qwen2.5-Coder 7B Instruct: 72.9%

TriviaQA

Rank #8 of 13
#5 Gemma 2 9B: 76.6%
#6 Granite 3.3 8B Base: 78.2%
#7 Mistral Small 3 24B Base: 80.3%
#8 Mistral NeMo Instruct: 73.8%
#9 Gemma 3n E4B: 70.2%
#10 Gemma 3n E4B Instructed LiteRT Preview: 70.2%
#11 Ministral 8B Instruct: 65.5%

CommonSenseQA

Rank #1 of 1
#1 Mistral NeMo Instruct: 70.4%

MMLU

Rank #66 of 78
#63 Phi-3.5-mini-instruct: 69.0%
#64 Pixtral-12B: 69.2%
#65 Llama 3.1 8B Instruct: 69.4%
#66 Mistral NeMo Instruct: 68.0%
#67 Qwen2.5-Coder 7B Instruct: 67.6%
#68 Phi 4 Mini: 67.3%
#69 Granite 3.3 8B Instruct: 65.5%
All Benchmark Results for Mistral NeMo Instruct
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw score | Normalized | Source
HellaSwag | reasoning | text | 0.83 | 83.5% | Self-reported
Winogrande | reasoning | text | 0.77 | 76.8% | Self-reported
TriviaQA | general | text | 0.74 | 73.8% | Self-reported
CommonSenseQA | general | text | 0.70 | 70.4% | Self-reported
MMLU | general | text | 0.68 | 68.0% | Self-reported
OpenBookQA | general | text | 0.61 | 60.6% | Self-reported
TruthfulQA | factuality | text | 0.50 | 50.3% | Self-reported
Natural Questions | general | text | 0.31 | 31.2% | Self-reported