Claude 3 Sonnet

Name: Claude 3 Sonnet
Price: 3 USD per 1M input tokens
Average score: 73.8% (11 benchmarks)
Author: Anthropic

Multimodal

Zero-eval

About

Claude 3 Sonnet is a multimodal language model developed by Anthropic. It averages 73.8% across 11 benchmarks, with its highest scores on ARC-C (93.2%), GSM8k (92.3%), and HellaSwag (89.0%). Reasoning is its strongest category, with an average of 91.1%. The listing reports a 400K-token maximum context window, and the model is available through 3 API providers. As a multimodal model, it accepts image input as well as text. It was released on Feb 29, 2024.

Pricing Range

Input (per 1M tokens): $3.00 - $3.00

Output (per 1M tokens): $15.00 - $15.00

Providers: 3
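
As a rough illustration of how the per-token prices above translate into the cost of a single request, the sketch below multiplies token counts by the listed rates. The function name `estimate_cost_usd` and the example token counts are hypothetical; only the $3.00 and $15.00 per-million rates come from this listing.

```python
# Listed rates for Claude 3 Sonnet, in USD per 1M tokens (from the pricing above).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, not a provider API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example sizes are made up: a 10,000-token prompt with a 1,000-token reply.
    print(f"${estimate_cost_usd(10_000, 1_000):.4f}")
```

Because output tokens cost five times as much as input tokens at these rates, long replies tend to dominate the bill even when prompts are large.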

Timeline

Announced: Feb 29, 2024

Released: Feb 29, 2024

Specifications

Capabilities

Multimodal

License & Family

License

Proprietary

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

11 benchmarks

Average Score

73.8%

Best Score

93.2%

High Performers (80%+)

Performance Metrics

Max Context Window

400.0K

Avg Throughput

87.3 tok/s

Avg Latency

0ms
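
For a back-of-the-envelope reading of the throughput figure, the sketch below converts an output length into an approximate generation time. It assumes throughput stays constant at the listed average of 87.3 tok/s and ignores network overhead and time to first token. The helper name `estimate_generation_seconds` is hypothetical.

```python
AVG_THROUGHPUT_TOK_PER_S = 87.3  # average throughput from the metrics above


def estimate_generation_seconds(
    output_tokens: int,
    throughput_tok_per_s: float = AVG_THROUGHPUT_TOK_PER_S,
) -> float:
    """Approximate seconds to generate `output_tokens` at a constant rate.

    Hypothetical helper: real throughput varies by provider and load.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if throughput_tok_per_s <= 0:
        raise ValueError("throughput_tok_per_s must be positive")
    return output_tokens / throughput_tok_per_s


if __name__ == "__main__":
    # A 1,000-token reply (made-up size) at the listed average rate.
    print(f"{estimate_generation_seconds(1_000):.1f} s")
```

At the listed rate, a 1,000-token reply works out to roughly 11.5 seconds of generation time.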

Top Categories

reasoning

91.1%

code

73.0%

math

73.0%

general

67.6%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

ARC-C

Rank #5 of 31

#2 Llama 3.1 70B Instruct: 94.8%
#3 Nova Pro: 94.8%
#4 Claude 3 Opus: 96.4%
#5 Claude 3 Sonnet: 93.2%
#6 Jamba 1.5 Large: 93.0%
#7 Nova Lite: 92.4%
#8 Mistral Small 3 24B Base: 91.3%

GSM8k

Rank #18 of 46

#15 Mistral Large 2: 93.0%
#16 Qwen3 235B A22B: 94.4%
#17 Gemma 3 12B: 94.4%
#18 Claude 3 Sonnet: 92.3%
#19 Nova Micro: 92.3%
#20 Kimi K2 Base: 92.1%
#21 Qwen2.5 7B Instruct: 91.6%

HellaSwag

Rank #4 of 24

#1 Gemini 1.5 Pro: 93.3%
#2 GPT-4: 95.3%
#3 Claude 3 Opus: 95.4%
#4 Claude 3 Sonnet: 89.0%
#5 Command R+: 88.6%
#6 Qwen2 72B Instruct: 87.6%
#7 Gemini 1.5 Flash: 86.5%

MGSM

Rank #17 of 31

#14 Qwen3 235B A22B: 83.5%
#15 Claude 3.5 Haiku: 85.6%
#16 Llama 3.2 90B Instruct: 86.9%
#17 Claude 3 Sonnet: 83.5%
#18 Gemini 1.5 Flash: 82.6%
#19 Phi 4: 80.6%
#20 Claude 3 Haiku: 75.1%

BIG-Bench Hard

Rank #8 of 21

#5 Gemini 1.5 Flash: 85.5%
#6 Gemma 3 12B: 85.7%
#7 Claude 3 Opus: 86.8%
#8 Claude 3 Sonnet: 82.9%
#9 Phi-3.5-MoE-instruct: 79.1%
#10 Claude 3 Haiku: 73.7%
#11 Gemma 3 4B: 72.2%

All Benchmark Results for Claude 3 Sonnet

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
ARC-C	reasoning	text	0.93	93.2%	Self-reported
GSM8k	math	text	0.92	92.3%	Self-reported
HellaSwag	reasoning	text	0.89	89.0%	Self-reported
MGSM	math	text	0.83	83.5%	Self-reported
BIG-Bench Hard	general	text	0.83	82.9%	Self-reported
MMLU	general	text	0.79	79.0%	Self-reported
DROP	general	text	0.79	78.9%	Self-reported
HumanEval	code	text	0.73	73.0%	Self-reported
MMLU-Pro	general	text	0.57	56.8%	Self-reported
MATH	math	text	0.43	43.1%	Self-reported

Showing 1 to 10 of 11 benchmarks
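
The category figures under Top Categories can be partly reproduced from the ten rows shown above. The sketch below assumes each category score is an unweighted mean of its benchmarks' normalized scores. The listing does not state this, but the assumption matches the reported reasoning (91.1%), math (73.0%), and code (73.0%) values. The names `VISIBLE_RESULTS` and `category_averages` are hypothetical.

```python
from collections import defaultdict

# (benchmark, category, normalized score in %) for the ten rows shown above.
# The eleventh benchmark is not shown on this page, so it is not included.
VISIBLE_RESULTS = [
    ("ARC-C", "reasoning", 93.2),
    ("GSM8k", "math", 92.3),
    ("HellaSwag", "reasoning", 89.0),
    ("MGSM", "math", 83.5),
    ("BIG-Bench Hard", "general", 82.9),
    ("MMLU", "general", 79.0),
    ("DROP", "general", 78.9),
    ("HumanEval", "code", 73.0),
    ("MMLU-Pro", "general", 56.8),
    ("MATH", "math", 43.1),
]


def category_averages(results):
    """Return the unweighted mean score per category, rounded to one decimal."""
    scores = defaultdict(list)
    for _name, category, score in results:
        scores[category].append(score)
    return {cat: round(sum(vals) / len(vals), 1) for cat, vals in scores.items()}


if __name__ == "__main__":
    for category, average in sorted(category_averages(VISIBLE_RESULTS).items()):
        print(f"{category}: {average}%")
```

With only the visible rows, the general category averages 74.4% rather than the listed 67.6%, and the overall mean is likewise higher than 73.8%. The gap presumably comes from the one benchmark that is not shown here.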
