Nova Micro
by Amazon
Zero-eval
Top rankings: #3 Translation Set1→en COMET22, #3 Translation en→Set1 COMET22, #3 FinQA, +3 more
About
Nova Micro is a language model developed by Amazon. It achieves strong performance, with an average score of 67.0% across 17 benchmarks, and does particularly well on GSM8k (92.3%), ARC-C (90.2%), and Translation Set1→en COMET22 (88.7%). It supports a 256K-token context window for handling large documents and is available through one API provider. It was released in November 2024.
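As a rough illustration of what a 256K-token context window allows, here is a minimal sketch that checks whether a prompt is likely to fit. The 4-characters-per-token ratio is a common heuristic, not Amazon's tokenizer, and fits_in_context is a hypothetical helper, not a provider API.

```python
# Rough check that a prompt fits Nova Micro's 256K-token context window.
# ASSUMPTION: ~4 characters per token; exact counts need the model's tokenizer.
CONTEXT_WINDOW_TOKENS = 256_000

def fits_in_context(prompt: str, reserved_output_tokens: int = 4_096) -> bool:
    """Heuristic fit test: estimated prompt tokens plus a reserved output budget."""
    estimated_tokens = len(prompt) / 4
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS

print(fits_in_context("Summarize this document: ..."))  # True for short prompts
```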
Pricing Range
Input (per 1M tokens): $0.03
Output (per 1M tokens): $0.14
Providers: 1
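Given the listed prices, per-request cost is a simple linear function of token counts. A minimal sketch (request_cost_usd is an illustrative helper, not a provider API):

```python
# Estimated request cost from the listed Nova Micro prices:
# $0.03 per 1M input tokens, $0.14 per 1M output tokens.
INPUT_USD_PER_M = 0.03
OUTPUT_USD_PER_M = 0.14

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1e6) * INPUT_USD_PER_M + (output_tokens / 1e6) * OUTPUT_USD_PER_M

# A 10K-token prompt with a 1K-token completion:
print(f"${request_cost_usd(10_000, 1_000):.6f}")  # $0.000440
```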
Timeline
Announced: Nov 20, 2024
Released: Nov 20, 2024
Specifications
License & Family
License: Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance (17 benchmarks)
Average Score: 67.0%
Best Score: 92.3%
High Performers (80%+): 6
Performance Metrics
Max Context Window: 256.0K tokens
Avg Throughput: 100.0 tok/s
Avg Latency: 1 ms
Top Categories
reasoning: 90.2%
code: 84.2%
math: 80.8%
general: 60.0%
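The throughput and latency averages above support a back-of-envelope estimate of generation time. The additive latency-plus-streaming model below is a simplification, and est_generation_seconds is a hypothetical helper:

```python
# Back-of-envelope generation time from the listed averages:
# 100.0 tok/s throughput and 1 ms latency.
# ASSUMPTION: total time ~= first-token latency + tokens / throughput.
AVG_THROUGHPUT_TPS = 100.0
AVG_LATENCY_S = 0.001  # 1 ms

def est_generation_seconds(output_tokens: int) -> float:
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TPS

print(f"{est_generation_seconds(500):.2f} s")  # ~5.00 s for a 500-token reply
```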
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
GSM8k (Rank #19 of 46)
#16 Claude 3 Sonnet: 92.3%
#17 Mistral Large 2: 93.0%
#18 Qwen3 235B A22B: 94.4%
#19 Nova Micro: 92.3%
#20 Kimi K2 Base: 92.1%
#21 Qwen2.5 7B Instruct: 91.6%
#22 Llama 3.1 Nemotron 70B Instruct: 91.4%
ARC-C (Rank #10 of 31)
#7 Phi-3.5-MoE-instruct: 91.0%
#8 Mistral Small 3 24B Base: 91.3%
#9 Nova Lite: 92.4%
#10 Nova Micro: 90.2%
#11 Claude 3 Haiku: 89.2%
#12 Jamba 1.5 Mini: 85.7%
#13 Phi-3.5-mini-instruct: 84.6%
Translation Set1→en COMET22 (Rank #3 of 3)
#1 Nova Lite: 88.8%
#2 Nova Pro: 89.0%
#3 Nova Micro: 88.7%
Translation en→Set1 COMET22 (Rank #3 of 3)
#1 Nova Lite: 88.8%
#2 Nova Pro: 89.1%
#3 Nova Micro: 88.5%
IFEval (Rank #17 of 37)
#14 Kimi-k1.5: 87.2%
#15 GPT-4.1: 87.4%
#16 Llama 3.1 70B Instruct: 87.5%
#17 Nova Micro: 87.2%
#18 DeepSeek-V3: 86.1%
#19 Phi 4 Reasoning Plus: 84.9%
#20 Qwen2.5 72B Instruct: 84.1%
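Conceptually, each rank above is just the model's 1-based position when scores on that benchmark are sorted in descending order. A minimal sketch of that derivation (the displayed ordering does not always match the rounded scores shown, so treat this as an approximation):

```python
# Simplified rank derivation: sort models by score, descending, 1-based index.
def rank_of(model: str, scores: dict[str, float]) -> int:
    ordered = sorted(scores, key=scores.get, reverse=True)  # best score first
    return ordered.index(model) + 1

# Subset of the GSM8k column above (illustrative slice, not the full 46 models):
gsm8k_scores = {
    "Qwen3 235B A22B": 94.4,
    "Mistral Large 2": 93.0,
    "Nova Micro": 92.3,
    "Kimi K2 Base": 92.1,
}
print(rank_of("Nova Micro", gsm8k_scores))  # 3 within this subset
```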
All Benchmark Results for Nova Micro
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
GSM8k | math | text | 0.92 | 92.3% | Self-reported
ARC-C | reasoning | text | 0.90 | 90.2% | Self-reported
Translation Set1→en COMET22 | general | text | 0.89 | 88.7% | Self-reported
Translation en→Set1 COMET22 | general | text | 0.89 | 88.5% | Self-reported
IFEval | code | text | 0.87 | 87.2% | Self-reported
HumanEval | code | text | 0.81 | 81.1% | Self-reported
BBH | general | text | 0.80 | 79.5% | Self-reported
DROP | general | text | 0.79 | 79.3% | Self-reported
MMLU | general | text | 0.78 | 77.6% | Self-reported
MATH | math | text | 0.69 | 69.3% | Self-reported
Showing 1 to 10 of 17 benchmarks
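The Top Categories figures earlier on the page are consistent with simple per-category means of these rows. A sketch that recomputes them from the 10 visible rows; because 7 of the 17 benchmarks are not shown here, the 'general' mean will not match the listed 60.0%:

```python
# Recompute per-category averages from the 10 visible rows above.
# math and reasoning reproduce the listed 80.8% / 90.2% exactly; code gives
# 84.15 (listed as 84.2%); general differs since 7 benchmarks are not shown.
from collections import defaultdict

rows = [
    ("GSM8k", "math", 92.3),
    ("ARC-C", "reasoning", 90.2),
    ("Translation Set1->en COMET22", "general", 88.7),
    ("Translation en->Set1 COMET22", "general", 88.5),
    ("IFEval", "code", 87.2),
    ("HumanEval", "code", 81.1),
    ("BBH", "general", 79.5),
    ("DROP", "general", 79.3),
    ("MMLU", "general", 77.6),
    ("MATH", "math", 69.3),
]

by_category = defaultdict(list)
for _, category, score in rows:
    by_category[category].append(score)

for category, scores in by_category.items():
    print(f"{category}: {sum(scores) / len(scores):.2f}")
# math: 80.80, reasoning: 90.20, general: 82.72 (partial), code: 84.15
```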