Llama 3.1 70B Instruct

Author: Meta

Zero-eval

#1 GSM-8K (CoT)

#1 MBPP ++ base version

#1 MATH (CoT)

About

Llama 3.1 70B Instruct is a language model developed by Meta. It averages 74.7% across the 18 benchmarks listed here, with its highest scores on GSM-8K (CoT) (95.1%), ARC-C (94.8%), and API-Bank (90.0%). The listing reports a maximum context window of 256K tokens and availability through 9 API providers. The model was announced and released on July 23, 2024.

Pricing Range

Input (per 1M tokens): $0.20 - $5.00

Output (per 1M tokens): $0.20 - $10.00

Providers: 9
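
To make the per-1M-token price range concrete, here is a minimal sketch of a cost estimate. The function name and the workload size (2M input tokens, 500K output tokens) are hypothetical, chosen only for illustration. The low end also assumes the cheapest input price and the cheapest output price come from the same provider, which the listing does not confirm.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m, output_price_per_m):
    """Return the USD cost of a workload given prices per 1M tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload, not taken from the listing.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

# Bounds of the listed pricing range.
low = estimate_cost_usd(**workload, input_price_per_m=0.20, output_price_per_m=0.20)
high = estimate_cost_usd(**workload, input_price_per_m=5.00, output_price_per_m=10.00)

print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```

For this workload the estimate spans roughly $0.50 to $15.00, so provider choice can change the bill by more than an order of magnitude.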

Timeline

Announced: Jul 23, 2024

Released: Jul 23, 2024

Specifications

Training Tokens: 15.0T

License & Family

License

Llama 3.1 Community License

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

18 benchmarks

Average Score

74.7%

Best Score

95.1%

High Performers (80%+)

Performance Metrics

Max Context Window

256.0K

Avg Throughput

213.4 tok/s

Avg Latency

0ms
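
As a rough illustration of what the throughput figure implies, the sketch below estimates wall-clock time for a response as latency plus output tokens divided by throughput. The function name and the 1,000-token response length are hypothetical. The listed 0 ms latency is used as the default only because it is what the listing shows; a measured value should be substituted where one is available.

```python
def estimate_response_seconds(output_tokens,
                              throughput_tok_per_s=213.4,
                              latency_s=0.0):
    """Estimate response time: initial latency plus token streaming time."""
    if throughput_tok_per_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / throughput_tok_per_s


# Hypothetical 1,000-token response at the listed average throughput.
seconds = estimate_response_seconds(1_000)
print(f"{seconds:.2f} s")
```

At the listed average, a 1,000-token response takes a little under five seconds. Individual providers may be faster or slower than the average.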

Top Categories

reasoning

94.8%

math

83.3%

code

76.3%

general

68.7%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

GSM-8K (CoT)

Rank #1 of 2

#1 Llama 3.1 70B Instruct: 95.1%
#2 Llama 3.1 8B Instruct: 84.5%

ARC-C

Rank #4 of 31

#1 Nova Pro: 94.8%
#2 Claude 3 Opus: 96.4%
#3 Llama 3.1 405B Instruct: 96.9%
#4 Llama 3.1 70B Instruct: 94.8%
#5 Claude 3 Sonnet: 93.2%
#6 Jamba 1.5 Large: 93.0%
#7 Nova Lite: 92.4%

API-Bank

Rank #2 of 3

#1 Llama 3.1 405B Instruct: 92.0%
#2 Llama 3.1 70B Instruct: 90.0%
#3 Llama 3.1 8B Instruct: 82.6%

IFEval

Rank #14 of 37

#11 GPT-4.5: 88.2%
#12 Llama 3.1 405B Instruct: 88.6%
#13 Qwen3-235B-A22B-Instruct-2507: 88.7%
#14 Llama 3.1 70B Instruct: 87.5%
#15 GPT-4.1: 87.4%
#16 Kimi-k1.5: 87.2%
#17 Nova Micro: 87.2%

Multilingual MGSM (CoT)

Rank #2 of 3

#1 Llama 3.1 405B Instruct: 91.6%
#2 Llama 3.1 70B Instruct: 86.9%
#3 Llama 3.1 8B Instruct: 68.9%

All Benchmark Results for Llama 3.1 70B Instruct

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score (0-1)	Score (%)	Source
GSM-8K (CoT)	math	text	0.95	95.1%	Self-reported
ARC-C	reasoning	text	0.95	94.8%	Self-reported
API-Bank	general	text	0.90	90.0%	Self-reported
IFEval	code	text	0.88	87.5%	Self-reported
Multilingual MGSM (CoT)	math	text	0.87	86.9%	Self-reported
MBPP ++ base version	code	text	0.86	86.0%	Self-reported
MMLU (CoT)	general	text	0.86	86.0%	Self-reported
BFCL	general	text	0.85	84.8%	Self-reported
MMLU	general	text	0.84	83.6%	Self-reported
HumanEval	code	text	0.81	80.5%	Self-reported

Showing 1 to 10 of 18 benchmarks
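
The category figures reported earlier can be approximated by grouping scores by category and averaging them. The sketch below does this for the ten rows visible in the table. It assumes a simple unweighted mean, which the listing does not state explicitly, and the helper name `category_means` is hypothetical.

```python
from collections import defaultdict

# (benchmark, category, score in %) for the ten rows shown in the table.
rows = [
    ("GSM-8K (CoT)", "math", 95.1),
    ("ARC-C", "reasoning", 94.8),
    ("API-Bank", "general", 90.0),
    ("IFEval", "code", 87.5),
    ("Multilingual MGSM (CoT)", "math", 86.9),
    ("MBPP ++ base version", "code", 86.0),
    ("MMLU (CoT)", "general", 86.0),
    ("BFCL", "general", 84.8),
    ("MMLU", "general", 83.6),
    ("HumanEval", "code", 80.5),
]


def category_means(rows):
    """Group scores by category and return the unweighted mean of each."""
    buckets = defaultdict(list)
    for _name, category, score in rows:
        buckets[category].append(score)
    return {cat: round(sum(vals) / len(vals), 1) for cat, vals in buckets.items()}


means = category_means(rows)
overall = round(sum(score for _, _, score in rows) / len(rows), 1)

for category, mean in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {mean}%")
print(f"visible-row average: {overall}%")
```

Over the visible rows this gives 94.8% for reasoning, 91.0% for math, 86.1% for general, and 84.7% for code. Only the reasoning figure matches the listed category score. The others come out higher than the listed values (math 83.3%, code 76.3%, general 68.7%), as does the visible-row average compared with the listed 74.7%, because the eight benchmarks not shown here are excluded.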
