| Model | Released | Description | Reported scores |
| --- | --- | --- | --- |
| Phi-4-reasoning-plus | Apr 30, 2025 | State-of-the-art open-weight reasoning model fine-tuned from Phi-4 using supervised fine-tuning and reinforcement learning, focused on math, science, and coding. The "plus" variant gains accuracy from additional RL training but may have higher latency. | 53.1% |
| Phi-4-mini-reasoning | Apr 30, 2025 | Designed for multi-step, logic-intensive mathematical problem solving in memory- and compute-constrained, latency-bound environments. Use cases include formal proof generation, symbolic computation, advanced word problems, and a wide range of mathematical reasoning scenarios. Excels at maintaining context across steps, applying structured logic, and delivering accurate, reliable solutions in domains that require deep analytical thinking. | – |
| Phi-4-reasoning | Apr 30, 2025 | State-of-the-art open-weight reasoning model fine-tuned from Phi-4 using supervised fine-tuning on a dataset of chain-of-thought traces plus reinforcement learning, focused on math, science, and coding. | 53.8% |
| Phi-4-mini-instruct | Feb 1, 2025 | Lightweight (3.8B-parameter) open model built on synthetic data and filtered web data, focused on high-quality reasoning. Supports a 128K-token context length and is enhanced for instruction adherence and safety via supervised fine-tuning and direct preference optimization. | – |
| Phi-4-multimodal-instruct | Feb 1, 2025 | Lightweight (5.57B-parameter) open multimodal foundation model that leverages research and datasets from Phi-3.5 and Phi-4.0. Processes text, image, and audio inputs to generate text outputs, with a 128K-token context length. Enhanced via SFT, DPO, and RLHF for instruction following and safety. | – |
| Phi-4 | Dec 12, 2024 | State-of-the-art open model built to excel at advanced reasoning, coding, and knowledge tasks. Trained on a blend of synthetic data, filtered web data, and academic texts, with supervised fine-tuning for precision, alignment, and safety. | 82.6% |
| Phi-3.5-MoE-instruct | Aug 23, 2024 | Mixture-of-experts model with ~42B total parameters (6.6B active) and a 128K context window. Excels at reasoning, math, coding, and multilingual tasks, outperforming larger dense models on many benchmarks. Underwent thorough safety post-training (SFT + DPO); MIT-licensed. Ideal where efficiency and high performance are both required, particularly in multilingual or reasoning-intensive tasks. | 70.7% / 80.8% |
| Phi-3.5-mini-instruct | Aug 23, 2024 | 3.8B-parameter model supporting up to 128K context tokens, with improved multilingual capabilities across more than 20 languages. Additional training and safety post-training enhance instruction following, reasoning, math, and code generation. Ideal for memory- or latency-constrained environments; MIT-licensed. | 62.8% / 69.6% |
| Phi-3.5-vision-instruct | Aug 23, 2024 | 4.2B-parameter open multimodal model with up to 128K context tokens. Emphasizes multi-frame image understanding and reasoning, boosting performance on single-image benchmarks while enabling multi-image comparison, summarization, and even video analysis. Safety post-trained for improved instruction following, alignment, and robust handling of visual and text inputs; MIT-licensed. | – |
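The instruct- and reasoning-tuned models above consume chat-formatted prompts. As a minimal sketch only: the Phi-4 family's published chat template uses `<|im_start|>`, `<|im_sep|>`, and `<|im_end|>` delimiters, and the hypothetical helper below (`format_phi4_chat` is not an official API) shows how a message list is rendered into such a prompt. In practice you would call `tokenizer.apply_chat_template()` from Hugging Face `transformers` rather than formatting by hand.

```python
# Illustrative sketch of a Phi-4-style chat prompt formatter.
# The <|im_start|>/<|im_sep|>/<|im_end|> delimiters follow the chat template
# published with Phi-4; prefer tokenizer.apply_chat_template() in real use.

def format_phi4_chat(messages):
    """Render a list of {'role': ..., 'content': ...} dicts into one prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}<|im_sep|>{m['content']}<|im_end|>")
    # Trailing open assistant turn cues the model to generate its reply.
    parts.append("<|im_start|>assistant<|im_sep|>")
    return "".join(parts)

prompt = format_phi4_chat([
    {"role": "system", "content": "You are a concise math tutor."},
    {"role": "user", "content": "What is 12 * 17?"},
])
```

The generation loop then appends the model's output after the final `<|im_sep|>` and stops at `<|im_end|>`; the same pattern extends to multi-turn histories by interleaving `user` and `assistant` messages.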