Claude 3 Haiku

Multimodal
Zero-eval

by Anthropic

About

Claude 3 Haiku is a multimodal language model developed by Anthropic. It achieves an average score of 71.5% across the 10 benchmarks tracked here, with its strongest results on ARC-C (89.2%), GSM8k (88.9%), and HellaSwag (85.9%); its best category is reasoning, at an average of 87.5%. The model supports a 200K-token context window for handling large documents and is available through 3 API providers. As a multimodal model, it accepts both text and image inputs. Released in March 2024, it is the smallest and fastest model in the Claude 3 family.
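For reference, a minimal sketch of a multimodal (text + image) request to this model through Anthropic's Python SDK; the image path and prompt are placeholders, and `claude-3-haiku-20240307` is Anthropic's documented model ID.

```python
# Minimal sketch: a text + image request to Claude 3 Haiku through
# Anthropic's Python SDK. Requires ANTHROPIC_API_KEY in the
# environment; the image file and prompt are placeholders.
import base64

import anthropic

client = anthropic.Anthropic()

# Base64-encode a local image (hypothetical file).
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": image_data}},
            {"type": "text", "text": "Summarize this chart in two sentences."},
        ],
    }],
)
print(message.content[0].text)
```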

Pricing Range
Input (per 1M): $0.25 – $0.25
Output (per 1M): $1.25 – $1.25
Providers: 3
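Because pricing is flat per million tokens, per-request cost is simple arithmetic; a small sketch assuming the listed rates and illustrative token counts:

```python
# Sketch: per-request cost at the listed Claude 3 Haiku rates
# (USD per 1M tokens). Token counts below are illustrative.
INPUT_PER_M = 0.25
OUTPUT_PER_M = 1.25

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A 10K-token prompt with a 1K-token reply:
print(f"${request_cost(10_000, 1_000):.5f}")  # $0.00375
```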
Timeline
Announced: Mar 13, 2024
Released: Mar 13, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance

10 benchmarks
Average Score: 71.5%
Best Score: 89.2% (ARC-C)
High Performers (80%+): 3

Performance Metrics

Max Context Window: 200K tokens
Avg Throughput: 82.0 tok/s
Avg Latency: not reported
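As a rough guide to what the average throughput means in practice, a back-of-the-envelope sketch (it ignores time-to-first-token, which is not reported here):

```python
# Sketch: rough generation-time estimate at the listed average
# throughput. Ignores time-to-first-token, which is not reported.
AVG_THROUGHPUT_TOK_S = 82.0

def est_generation_seconds(output_tokens: int) -> float:
    return output_tokens / AVG_THROUGHPUT_TOK_S

print(f"{est_generation_seconds(1_000):.1f}s")  # 12.2s for 1K output tokens
```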

Top Categories

reasoning: 87.5%
code: 75.9%
math: 67.6%
general: 65.2%
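These category figures are plain arithmetic means of the per-benchmark scores in the results table at the bottom of this page; a short sketch reproduces them (matching the figures above up to rounding):

```python
# Sketch: reproducing the category and overall averages from the
# self-reported scores in the results table at the bottom of the page.
from statistics import mean

scores = {
    "ARC-C": (89.2, "reasoning"),  "GSM8k": (88.9, "math"),
    "HellaSwag": (85.9, "reasoning"), "DROP": (78.4, "general"),
    "HumanEval": (75.9, "code"),   "MMLU": (75.2, "general"),
    "MGSM": (75.1, "math"),        "BIG-Bench Hard": (73.7, "general"),
    "MATH": (38.9, "math"),        "GPQA": (33.3, "general"),
}

by_category: dict[str, list[float]] = {}
for score, category in scores.values():
    by_category.setdefault(category, []).append(score)

for category, vals in sorted(by_category.items()):
    print(f"{category}: {mean(vals):.2f}%")
# code: 75.90%, general: 65.15%, math: 67.63%, reasoning: 87.55%
print(f"overall: {mean(s for s, _ in scores.values()):.2f}%")  # 71.45%
```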
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark

ARC-C

Rank #11 of 31
#8 Nova Micro: 90.2%
#9 Phi-3.5-MoE-instruct: 91.0%
#10 Mistral Small 3 24B Base: 91.3%
#11 Claude 3 Haiku: 89.2%
#12 Jamba 1.5 Mini: 85.7%
#13 Phi-3.5-mini-instruct: 84.6%
#14 Phi 4 Mini: 83.7%

GSM8k

Rank #28 of 46
#25 Gemma 3 4B: 89.2%
#26 Grok-1.5: 90.0%
#27 Gemini 1.5 Pro: 90.8%
#28 Claude 3 Haiku: 88.9%
#29 Qwen2.5-Omni-7B: 88.7%
#30 Phi-3.5-MoE-instruct: 88.7%
#31 Phi 4 Mini: 88.6%

HellaSwag

Rank #9 of 24
#6 Gemma 2 27B: 86.4%
#7 Gemini 1.5 Flash: 86.5%
#8 Qwen2 72B Instruct: 87.6%
#9 Claude 3 Haiku: 85.9%
#10 Llama 3.1 Nemotron 70B Instruct: 85.6%
#11 Qwen2.5 32B Instruct: 85.2%
#12 Phi-3.5-MoE-instruct: 83.8%

DROP

Rank #17 of 28
#14 Claude 3 Sonnet: 78.9%
#15 Nova Micro: 79.3%
#16 Llama 3.1 70B Instruct: 79.6%
#17 Claude 3 Haiku: 78.4%
#18 Phi 4: 75.5%
#19 Gemini 1.5 Pro: 74.9%
#20 GPT-3.5 Turbo: 70.2%

HumanEval

Rank #44 of 62
#41 Qwen2.5-Omni-7B: 78.7%
#42 Qwen2 7B Instruct: 79.9%
#43 Llama 3.1 70B Instruct: 80.5%
#44 Claude 3 Haiku: 75.9%
#45 Gemma 3n E4B Instructed: 75.0%
#46 Gemma 3n E4B Instructed LiteRT Preview: 75.0%
#47 Gemini 1.5 Flash: 74.3%
All Benchmark Results for Claude 3 Haiku
Complete list of benchmark scores with detailed information
Benchmark        Category   Modality  Score  Normalized  Source
ARC-C            reasoning  text      0.89   89.2%       Self-reported
GSM8k            math       text      0.89   88.9%       Self-reported
HellaSwag        reasoning  text      0.86   85.9%       Self-reported
DROP             general    text      0.78   78.4%       Self-reported
HumanEval        code       text      0.76   75.9%       Self-reported
MMLU             general    text      0.75   75.2%       Self-reported
MGSM             math       text      0.75   75.1%       Self-reported
BIG-Bench Hard   general    text      0.74   73.7%       Self-reported
MATH             math       text      0.39   38.9%       Self-reported
GPQA             general    text      0.33   33.3%       Self-reported