Gemma 2 9B

Zero-eval
#2 ARC-E
#3 BoolQ
#3 PIQA

by Google

About

Gemma 2 9B is a language model developed by Google. It averages 64.6% across 16 benchmarks, with its highest scores on ARC-E (88.0%), BoolQ (84.2%), and HellaSwag (81.9%). Its strongest category is reasoning, where it averages 79.7%. The Gemma license permits commercial use. The model was announced and released on June 27, 2024.

Timeline
Announced: Jun 27, 2024
Released: Jun 27, 2024
Specifications
Training Tokens: 8.0T
License & Family
License: Gemma
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

16 benchmarks
Average Score: 64.6%
Best Score: 88.0%
High Performers (80%+): 5

Top Categories

reasoning: 79.7%
general: 66.4%
math: 52.6%
code: 48.5%
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

ARC-E

Rank #2 of 6
#1 Gemma 2 27B: 88.6%
#2 Gemma 2 9B: 88.0%
#3 Gemma 3n E4B: 81.6%
#4 Gemma 3n E4B Instructed LiteRT Preview: 81.6%
#5 Gemma 3n E2B Instructed LiteRT (Preview): 75.8%

BoolQ

Rank #3 of 9
#1 Gemma 2 27B: 84.8%
#2 Phi-3.5-MoE-instruct: 84.6%
#3 Gemma 2 9B: 84.2%
#4 Gemma 3n E4B: 81.6%
#5 Gemma 3n E4B Instructed LiteRT Preview: 81.6%
#6 Phi 4 Mini: 81.2%

HellaSwag

Rank #15 of 24
#12 Phi-3.5-MoE-instruct: 83.8%
#13 Mistral NeMo Instruct: 83.5%
#14 Qwen2.5-Coder 32B Instruct: 83.0%
#15 Gemma 2 9B: 81.9%
#16 Granite 3.3 8B Base: 80.1%
#17 Gemma 3n E4B Instructed LiteRT Preview: 78.6%
#18 Gemma 3n E4B: 78.6%

PIQA

Rank #3 of 9
#1 Phi-3.5-MoE-instruct: 88.6%
#2 Gemma 2 27B: 83.2%
#3 Gemma 2 9B: 81.7%
#4 Phi-3.5-mini-instruct: 81.0%
#5 Gemma 3n E4B Instructed LiteRT Preview: 81.0%
#6 Gemma 3n E4B: 81.0%

Winogrande

Rank #9 of 19
#6 Qwen2.5 32B Instruct: 82.0%
#7 Phi-3.5-MoE-instruct: 81.3%
#8 Qwen2.5-Coder 32B Instruct: 80.8%
#9 Gemma 2 9B: 80.6%
#10 Mistral NeMo Instruct: 76.8%
#11 Ministral 8B Instruct: 75.3%
#12 Granite 3.3 8B Base: 74.4%
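
The rank positions in this section can be read as each model's place when the scores for a benchmark are sorted from highest to lowest. The following is a minimal sketch under that assumption, using the five ARC-E entries listed earlier in this section. The function name `rank_by_score` is hypothetical, and breaking ties by listing order is an assumption; the page does not state how it resolves ties.

```python
# ARC-E scores (%) copied from the ARC-E ranking in this section.
arc_e = [
    ("Gemma 2 27B", 88.6),
    ("Gemma 2 9B", 88.0),
    ("Gemma 3n E4B", 81.6),
    ("Gemma 3n E4B Instructed LiteRT Preview", 81.6),
    ("Gemma 3n E2B Instructed LiteRT (Preview)", 75.8),
]


def rank_by_score(entries):
    """Map each model name to its 1-based position by descending score.

    Hypothetical helper: Python's sort is stable, so tied scores keep
    their input order (an assumption about how ties are broken).
    """
    ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
    return {name: position for position, (name, _) in enumerate(ordered, start=1)}


ranks = rank_by_score(arc_e)
for name, position in ranks.items():
    print(f"#{position} {name}")
```

Only five of the six ARC-E entries appear on the page, so the sketch covers the visible part of the ranking.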
All Benchmark Results for Gemma 2 9B
Complete list of benchmark scores with detailed information
Benchmark   Category   Modality  Score  Normalized  Source
ARC-E       reasoning  text      0.88   88.0%       Self-reported
BoolQ       general    text      0.84   84.2%       Self-reported
HellaSwag   reasoning  text      0.82   81.9%       Self-reported
PIQA        general    text      0.82   81.7%       Self-reported
Winogrande  reasoning  text      0.81   80.6%       Self-reported
TriviaQA    general    text      0.77   76.6%       Self-reported
MMLU        general    text      0.71   71.3%       Self-reported
GSM8k       math       text      0.69   68.6%       Self-reported
ARC-C       reasoning  text      0.68   68.4%       Self-reported
BIG-Bench   general    text      0.68   68.2%       Self-reported
Showing 1 to 10 of 16 benchmarks
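
Some of the summary figures under Overall Performance can be checked against the ten results listed above. The sketch below is a rough illustration, not the site's method: it recomputes the best score, the count of results at or above 80%, and the reasoning average. The reasoning average matches the reported 79.7% only if none of the six unlisted benchmarks belong to the reasoning category, which is an assumption. The overall 64.6% average cannot be reproduced from 10 of 16 results.

```python
from statistics import mean

# (benchmark, category, normalized score in %) for the ten listed results.
rows = [
    ("ARC-E", "reasoning", 88.0),
    ("BoolQ", "general", 84.2),
    ("HellaSwag", "reasoning", 81.9),
    ("PIQA", "general", 81.7),
    ("Winogrande", "reasoning", 80.6),
    ("TriviaQA", "general", 76.6),
    ("MMLU", "general", 71.3),
    ("GSM8k", "math", 68.6),
    ("ARC-C", "reasoning", 68.4),
    ("BIG-Bench", "general", 68.2),
]

# Highest normalized score among the listed results.
best_score = max(score for _, _, score in rows)

# Benchmarks scoring 80% or higher ("High Performers").
high_performers = [name for name, _, score in rows if score >= 80.0]

# Average over listed reasoning benchmarks only (assumes the six
# unlisted benchmarks contain no further reasoning tasks).
reasoning_avg = mean(score for _, category, score in rows if category == "reasoning")

print(f"Best score: {best_score:.1f}%")
print(f"High performers (80%+): {len(high_performers)}")
print(f"Reasoning average: {reasoning_avg:.1f}%")
```

The results are listed in descending order of score, so the best score and the 80%+ count do not depend on the six results that are not shown.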