Jamba 1.5 Large

Zero-eval
#3 Wild Bench

by AI21 Labs

About

Jamba 1.5 Large is a language model developed by AI21 Labs. It achieves strong performance, with an average score of 65.5% across 8 benchmarks, and does particularly well on ARC-C (93.0%), GSM8k (87.0%), and MMLU (81.2%). It supports a 512K-token context window for handling large documents and is available through 2 API providers. Released in 2024, it represents AI21 Labs' latest advancement in AI technology.

Pricing Range
Input (per 1M tokens): $2.00 - $2.00
Output (per 1M tokens): $8.00 - $8.00
Providers: 2
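
Since both ends of the pricing range are identical, estimating the cost of a request is a straightforward per-token calculation. A minimal sketch, using the listed $2.00/$8.00 per-million-token prices; the token counts in the example are hypothetical:

```python
# Estimate request cost from the per-1M-token prices listed above.
INPUT_PRICE_PER_M = 2.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 8.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 400K-token document summarized into a 2K-token answer.
print(f"${estimate_cost(400_000, 2_000):.4f}")  # -> $0.8160
```
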
Timeline
Announced: Aug 22, 2024
Released: Aug 22, 2024
Knowledge Cutoff: Mar 5, 2024
Specifications
License & Family
License: Jamba Open Model License
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

8 benchmarks
Average Score: 65.5%
Best Score: 93.0%
High Performers (80%+): 3

Performance Metrics

Max Context Window: 512.0K tokens
Avg Throughput: 71.0 tok/s
Avg Latency: 0 ms
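
The throughput figure gives a rough handle on response time. A back-of-the-envelope sketch, assuming the 71.0 tok/s average above; the reported 0 ms latency is likely a missing value, so the first-token latency used here is a placeholder assumption:

```python
# Rough response-time estimate from the average throughput listed above.
THROUGHPUT_TOK_S = 71.0        # from the performance metrics
FIRST_TOKEN_LATENCY_S = 0.5    # assumed placeholder; the card reports 0 ms

def estimated_response_time(output_tokens: int) -> float:
    """Seconds until a response of `output_tokens` tokens is fully generated."""
    return FIRST_TOKEN_LATENCY_S + output_tokens / THROUGHPUT_TOK_S

print(f"{estimated_response_time(1_000):.1f} s")  # ~14.6 s for 1,000 output tokens
```
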

Top Categories

reasoning: 93.0%
math: 87.0%
factuality: 58.3%
general: 57.1%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

ARC-C

Rank #6 of 31
#3 Claude 3 Sonnet: 93.2%
#4 Llama 3.1 70B Instruct: 94.8%
#5 Nova Pro: 94.8%
#6 Jamba 1.5 Large: 93.0%
#7 Nova Lite: 92.4%
#8 Mistral Small 3 24B Base: 91.3%
#9 Phi-3.5-MoE-instruct: 91.0%

GSM8k

Rank #32 of 46
#29 Phi 4 Mini: 88.6%
#30 Phi-3.5-MoE-instruct: 88.7%
#31 Qwen2.5-Omni-7B: 88.7%
#32 Jamba 1.5 Large: 87.0%
#33 Phi-3.5-mini-instruct: 86.2%
#34 Gemini 1.5 Flash: 86.2%
#35 Qwen2.5-Coder 7B Instruct: 83.9%

MMLU

Rank #37 of 78
#34 Grok-1.5: 81.3%
#35 GPT-4o mini: 82.0%
#36 Qwen2 72B Instruct: 82.3%
#37 Jamba 1.5 Large: 81.2%
#38 Mistral Small 3.1 24B Base: 81.0%
#39 Mistral Small 3 24B Base: 80.7%
#40 Mistral Small 3.1 24B Instruct: 80.6%

Arena Hard

Rank #13 of 22
#10 Ministral 8B Instruct: 70.9%
#11 Phi 4 Reasoning: 73.3%
#12 Phi 4: 75.4%
#13 Jamba 1.5 Large: 65.4%
#14 Granite 3.3 8B Instruct: 57.6%
#15 Granite 3.3 8B Base: 57.6%
#16 Qwen2.5 7B Instruct: 52.0%

TruthfulQA

Rank #7 of 16
#4 Qwen2.5 14B Instruct: 58.4%
#5 Llama 3.1 Nemotron 70B Instruct: 58.6%
#6 Phi-3.5-mini-instruct: 64.0%
#7 Jamba 1.5 Large: 58.3%
#8 IBM Granite 4.0 Tiny Preview: 58.1%
#9 Qwen2.5 32B Instruct: 57.8%
#10 Command R+: 56.3%
All Benchmark Results for Jamba 1.5 Large
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Score (0-1) | Score (%) | Source
ARC-C | reasoning | text | 0.93 | 93.0% | Self-reported
GSM8k | math | text | 0.87 | 87.0% | Self-reported
MMLU | general | text | 0.81 | 81.2% | Self-reported
Arena Hard | general | text | 0.65 | 65.4% | Self-reported
TruthfulQA | factuality | text | 0.58 | 58.3% | Self-reported
MMLU-Pro | general | text | 0.54 | 53.5% | Self-reported
Wild Bench | general | text | 0.48 | 48.5% | Self-reported
GPQA | general | text | 0.37 | 36.9% | Self-reported
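
The 65.5% overall average quoted at the top of this page can be reproduced from this table as the unweighted mean of the eight normalized scores. A minimal check, with the scores copied from the table above:

```python
# Unweighted mean of the eight normalized benchmark scores listed above.
scores = {
    "ARC-C": 93.0,
    "GSM8k": 87.0,
    "MMLU": 81.2,
    "Arena Hard": 65.4,
    "TruthfulQA": 58.3,
    "MMLU-Pro": 53.5,
    "Wild Bench": 48.5,
    "GPQA": 36.9,
}
mean = sum(scores.values()) / len(scores)
print(f"{mean:.3f}%")  # 65.475%, which the card reports rounded to 65.5%
```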