
Gemma 3 12B
Multimodal
Zero-eval
#1 VQAv2 (val)
#2 MMMU (val)
#2 WMT24++
+2 more
by Google
About
Gemma 3 12B is a multimodal language model developed by Google. It achieves strong overall performance, with an average score of 63.8% across 26 benchmarks, and does particularly well on GSM8k (94.4%), IFEval (88.9%), and DocVQA (87.1%). It supports a 262K-token context window for handling large documents and is currently available through one API provider. As a multimodal model, it can process text and images together. It is licensed for commercial use, making it suitable for enterprise applications. Released in March 2025, it is part of Google's Gemma 3 family of open models.
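For reference, a minimal sketch of running the model locally, assuming the Hugging Face transformers integration (v4.50+, which added Gemma 3 support) and access to the gated google/gemma-3-12b-it checkpoint; the image URL below is a placeholder:

```python
# Minimal sketch: multimodal inference with Gemma 3 12B via Hugging Face
# transformers. Assumes transformers >= 4.50 and that you have accepted
# the Gemma license for "google/gemma-3-12b-it" on the Hub.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/gemma-3-12b-it", device_map="auto")

messages = [
    {"role": "user", "content": [
        {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
        {"type": "text", "text": "Describe this chart in one sentence."},
    ]},
]

# The pipeline applies the chat template, runs generation, and returns the
# conversation with the assistant turn appended.
out = pipe(text=messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])
```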
Pricing Range
Input (per 1M tokens): $0.05
Output (per 1M tokens): $0.10
Providers: 1
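At these rates, per-request cost is simple arithmetic; a quick sketch (the token counts are illustrative):

```python
# Cost estimate at the listed rates: $0.05 per 1M input tokens and
# $0.10 per 1M output tokens (one provider, so min and max coincide).
def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * 0.05 + output_tokens / 1e6 * 0.10

# Example: a 200K-token document (well inside the 262K context window)
# plus a 1K-token summary costs about one cent.
print(f"${request_cost(200_000, 1_000):.4f}")  # $0.0101
```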
Timeline
Announced: Mar 12, 2025
Released: Mar 12, 2025
Specifications
Training Tokens: 12.0T
Capabilities
Multimodal
License & Family
License: Gemma
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
26 benchmarks
Average Score: 63.8%
Best Score: 94.4%
High Performers (80%+): 8
Performance Metrics
Max Context Window: 262.1K
Avg Throughput: 33.0 tok/s
Avg Latency: 0ms
Top Categories
factuality: 75.8%
math: 73.9%
code: 70.5%
vision: 70.2%
general: 53.2%
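The summary figures above follow directly from the per-benchmark rows; a sketch using only the ten rows listed at the bottom of this page (the site aggregates all 26 benchmarks, so the category averages computed here come out higher than the dashboard's):

```python
# Recomputing dashboard stats from per-benchmark rows. Sketch: uses only
# the ten rows shown on this page; the site averages all 26 benchmarks.
from statistics import mean

rows = {
    "GSM8k": ("math", 94.4), "IFEval": ("code", 88.9), "DocVQA": ("vision", 87.1),
    "BIG-Bench Hard": ("general", 85.7), "HumanEval": ("code", 85.4),
    "AI2D": ("general", 84.2), "MATH": ("math", 83.8),
    "Natural2Code": ("code", 80.7), "FACTS Grounding": ("factuality", 75.8),
    "ChartQA": ("general", 75.7),
}

scores = [s for _, s in rows.values()]
print(max(scores))                   # best score: 94.4
print(sum(s >= 80 for s in scores))  # high performers (80%+): 8

# Per-category means over these ten rows only, e.g. math = (94.4 + 83.8) / 2.
by_cat: dict[str, list[float]] = {}
for cat, s in rows.values():
    by_cat.setdefault(cat, []).append(s)
print({c: round(mean(v), 1) for c, v in by_cat.items()})
```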
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
GSM8k
Rank #15 of 46
#12 Nova Pro: 94.8%
#13 Qwen2.5 14B Instruct: 94.8%
#14 Nova Lite: 94.5%
#15 Gemma 3 12B: 94.4%
#16 Qwen3 235B A22B: 94.4%
#17 Mistral Large 2: 93.0%
#18 Claude 3 Sonnet: 92.3%
IFEval
Rank #10 of 37
#7 Kimi K2 Instruct: 89.8%
#8 Nova Lite: 89.7%
#9 Llama 3.1 Nemotron Ultra 253B v1: 89.5%
#10 Gemma 3 12B: 88.9%
#11 Qwen3-235B-A22B-Instruct-2507: 88.7%
#12 Llama 3.1 405B Instruct: 88.6%
#13 GPT-4.5: 88.2%
DocVQA
Rank #22 of 26
#19 Llama 3.2 90B Instruct: 90.1%
#20 DeepSeek VL2 Tiny: 88.9%
#21 Llama 3.2 11B Instruct: 88.4%
#22 Gemma 3 12B: 87.1%
#23 Gemma 3 27B: 86.6%
#24 Grok-1.5V: 85.6%
#25 Grok-1.5: 85.6%
BIG-Bench Hard
Rank #6 of 21
#3 Gemini 1.5 Pro: 89.2%
#4 Gemma 3 27B: 87.6%
#5 Claude 3 Opus: 86.8%
#6 Gemma 3 12B: 85.7%
#7 Gemini 1.5 Flash: 85.5%
#8 Claude 3 Sonnet: 82.9%
#9 Phi-3.5-MoE-instruct: 79.1%
HumanEval
Rank #31 of 62
#28 Qwen2 72B Instruct: 86.0%
#29 Grok-2 mini: 85.7%
#30 Nova Lite: 85.4%
#31 Gemma 3 12B: 85.4%
#32 Claude 3 Opus: 84.9%
#33 Qwen2.5 7B Instruct: 84.8%
#34 Mistral Small 3 24B Instruct: 84.8%
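A rank here is just a model's position in the score-sorted list for that benchmark; a sketch using the HumanEval excerpt above (how the site breaks ties between equal scores is an assumption):

```python
# Rank = 1-based position after sorting scores in descending order.
# Ties (e.g. Nova Lite and Gemma 3 12B at 85.4) break by insertion order
# here, which is an assumption about the site's tie-breaking rule.
def rank(model: str, scores: dict[str, float]) -> int:
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(model) + 1

humaneval_excerpt = {
    "Qwen2 72B Instruct": 86.0,
    "Grok-2 mini": 85.7,
    "Nova Lite": 85.4,
    "Gemma 3 12B": 85.4,
    "Claude 3 Opus": 84.9,
}
print(rank("Gemma 3 12B", humaneval_excerpt))  # 4 in this excerpt (#31 of 62 overall)
```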
All Benchmark Results for Gemma 3 12B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
GSM8k | math | text | 0.94 | 94.4% | Self-reported
IFEval | code | text | 0.89 | 88.9% | Self-reported
DocVQA | vision | multimodal | 0.87 | 87.1% | Self-reported
BIG-Bench Hard | general | text | 0.86 | 85.7% | Self-reported
HumanEval | code | text | 0.85 | 85.4% | Self-reported
AI2D | general | text | 0.84 | 84.2% | Self-reported
MATH | math | text | 0.84 | 83.8% | Self-reported
Natural2Code | code | text | 0.81 | 80.7% | Self-reported
FACTS Grounding | factuality | text | 0.76 | 75.8% | Self-reported
ChartQA | general | multimodal | 0.76 | 75.7% | Self-reported
Showing 1 to 10 of 26 benchmarks
Resources