Phi-3.5-mini-instruct

Name: Phi-3.5-mini-instruct
Price: 0.10 USD per 1M tokens
Average score: 58.7% (31 benchmarks)
Author: Microsoft

Zero-eval

#1 Qasper, #1 SQuALITY, #1 QMSum (+11 more)

About

Phi-3.5-mini-instruct is a language model developed by Microsoft and released on Aug 23, 2024. It averages 58.7% across 31 benchmarks, with its highest scores on GSM8k (86.2%), ARC-C (84.6%), and RULER (84.1%). The listing gives a maximum context window of 256K tokens, and the model is available through 1 API provider. It is released under the MIT license, which permits commercial use.

Pricing Range

Input (per 1M tokens): $0.10 - $0.10

Output (per 1M tokens): $0.10 - $0.10

Providers: 1
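
As a rough illustration of what the listed rates mean per request, the sketch below computes a cost from token counts. Only the $0.10 per 1M input and output rates come from the listing; the helper name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Listed rates from the pricing range above (USD per 1M tokens).
INPUT_USD_PER_M = 0.10
OUTPUT_USD_PER_M = 0.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, not a provider API)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example token counts are made up for illustration.
cost = estimate_cost_usd(input_tokens=200_000, output_tokens=1_000)
print(f"${cost:.4f}")
```

Because input and output are priced identically here, the cost depends only on the total number of tokens; a provider with different rates would need the two terms kept separate, as above.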

Timeline

Announced: Aug 23, 2024

Released: Aug 23, 2024

Specifications

Training Tokens: 3.4T

License & Family

License: MIT

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

31 benchmarks

Average Score

58.7%

Best Score

86.2%

High Performers (80%+)

Performance Metrics

Max Context Window

256.0K

Avg Throughput

23.0 tok/s

Avg Latency

1ms
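
To put the throughput figure in perspective, here is a minimal sketch that estimates how long a response of a given length might take. It assumes the listed averages (23.0 tok/s and 1 ms) can be treated as constants and ignores prompt processing time; the helper name and the 500-token example are hypothetical.

```python
AVG_THROUGHPUT_TOK_PER_S = 23.0  # listed average throughput
AVG_LATENCY_S = 0.001            # listed average latency (1 ms)


def estimate_generation_seconds(output_tokens: int) -> float:
    """Rough time estimate: fixed latency plus tokens divided by throughput."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


# A 500-token answer is an arbitrary example length.
seconds = estimate_generation_seconds(500)
print(f"{seconds:.1f} s")
```

At this throughput the generation term dominates, so the estimate scales almost linearly with the number of output tokens.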

Top Categories

reasoning

74.2%

code

66.2%

factuality

64.0%

math

60.9%

general

55.4%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

GSM8k

Rank #33 of 46

#30 Jamba 1.5 Large: 87.0%
#31 Phi 4 Mini: 88.6%
#32 Phi-3.5-MoE-instruct: 88.7%
#33 Phi-3.5-mini-instruct: 86.2%
#34 Gemini 1.5 Flash: 86.2%
#35 Qwen2.5-Coder 7B Instruct: 83.9%
#36 Qwen2 7B Instruct: 82.3%
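
The GSM8k neighbor list above does not follow the scores in strictly descending order (the entry at #30 has 87.0% while the entry at #32 has 88.7%). The sketch below shows the order the scores alone would imply. It assumes rank is determined purely by score, which the page does not confirm; `FIRST_RANK` is simply the rank of the first entry shown.

```python
# Scores copied from the GSM8k neighbor list above, in listed order.
listed = [
    ("Jamba 1.5 Large", 87.0),
    ("Phi 4 Mini", 88.6),
    ("Phi-3.5-MoE-instruct", 88.7),
    ("Phi-3.5-mini-instruct", 86.2),
    ("Gemini 1.5 Flash", 86.2),
    ("Qwen2.5-Coder 7B Instruct", 83.9),
    ("Qwen2 7B Instruct", 82.3),
]

FIRST_RANK = 30  # rank of the first entry shown in the window

# sorted() is stable, so tied scores keep their listed order.
by_score = sorted(listed, key=lambda item: item[1], reverse=True)

for offset, (name, score) in enumerate(by_score):
    print(f"#{FIRST_RANK + offset} {name}: {score}%")
```

Under that assumption, Phi-3.5-mini-instruct still lands at #33; only the order of the three entries above it changes.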

ARC-C

Rank #13 of 31

#10 Jamba 1.5 Mini: 85.7%
#11 Claude 3 Haiku: 89.2%
#12 Nova Micro: 90.2%
#13 Phi-3.5-mini-instruct: 84.6%
#14 Phi 4 Mini: 83.7%
#15 Llama 3.1 8B Instruct: 83.4%
#16 Llama 3.2 3B Instruct: 78.6%

RULER

Rank #2 of 2

#1 Phi-3.5-MoE-instruct: 87.1%
#2 Phi-3.5-mini-instruct: 84.1%

PIQA

Rank #4 of 9

#1 Gemma 2 9B: 81.7%
#2 Gemma 2 27B: 83.2%
#3 Phi-3.5-MoE-instruct: 88.6%
#4 Phi-3.5-mini-instruct: 81.0%
#5 Gemma 3n E4B Instructed LiteRT Preview: 81.0%
#6 Gemma 3n E4B: 81.0%
#7 Gemma 3n E2B Instructed LiteRT (Preview): 78.9%

OpenBookQA

Rank #3 of 4

#1 Phi 4 Mini: 79.2%
#2 Phi-3.5-MoE-instruct: 89.6%
#3 Phi-3.5-mini-instruct: 79.2%
#4 Mistral NeMo Instruct: 60.6%

All Benchmark Results for Phi-3.5-mini-instruct

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
GSM8k	math	text	0.86	86.2%	Self-reported
ARC-C	reasoning	text	0.85	84.6%	Self-reported
RULER	general	text	0.84	84.1%	Self-reported
PIQA	general	text	0.81	81.0%	Self-reported
OpenBookQA	general	text	0.79	79.2%	Self-reported
BoolQ	general	text	0.78	78.0%	Self-reported
RepoQA	general	text	0.77	77.0%	Self-reported
Social IQa	general	text	0.75	74.7%	Self-reported
MEGA XStoryCloze	general	text	0.73	73.5%	Self-reported
MBPP	code	text	69.60	69.6%	Self-reported

Showing 1 to 10 of 31 benchmarks
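
The raw score column in the table mixes scales: most rows use a 0-1 fraction, while MBPP is given as 69.60. A minimal sketch of one way to bring both onto a 0-100 scale follows. The rule "values at or below 1 are fractions" is a heuristic assumed here, not something the page documents, and `to_percent` is a hypothetical helper.

```python
def to_percent(raw_score: float) -> float:
    """Map a raw score to 0-100, assuming values <= 1 are fractions (heuristic)."""
    return raw_score * 100 if raw_score <= 1 else raw_score


# A few (benchmark, raw score) pairs copied from the table above.
rows = [("GSM8k", 0.86), ("ARC-C", 0.85), ("MBPP", 69.60)]

for name, raw in rows:
    print(f"{name}: {to_percent(raw):.1f}%")
```

Because the raw column is rounded to two decimals, this will not reproduce the listed normalized values exactly (for example, 0.86 maps to 86.0% while the table lists 86.2%).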
