Qwen2.5 VL 7B Instruct

Average score: 64.5% (32 benchmarks)
Author: Alibaba

Multimodal

Zero-eval

#1 MobileMiniWob++_SR

#1 MLVU

#1 MMT-Bench


About

Qwen2.5 VL 7B Instruct is a multimodal language model developed by Alibaba. It averages 64.5% across 32 benchmarks, with its highest scores on DocVQA (95.7%), MobileMiniWob++_SR (91.4%), and Android Control Low_EM (91.4%). As a multimodal model, it accepts text and image inputs, among other formats. It is released under the Apache 2.0 license, which permits commercial use. It was announced and released on Jan 26, 2025.

Timeline

Announced: Jan 26, 2025

Released: Jan 26, 2025

Specifications

Capabilities

Multimodal

License & Family

License

Apache 2.0

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

32 benchmarks

Average Score

64.5%

Best Score

95.7%

High Performers (80%+)

Top Categories

general

67.7%

roleplay

63.6%

vision

61.5%

math

46.6%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

DocVQA

Rank #2 of 26

#1 Qwen2.5 VL 72B Instruct

96.4%

#2 Qwen2.5 VL 7B Instruct

95.7%

#3 Qwen2.5-Omni-7B

95.2%

#4 Claude 3.5 Sonnet

95.2%

#5 Mistral Small 3.2 24B Instruct

94.9%

MobileMiniWob++_SR

Rank #1 of 2

#1 Qwen2.5 VL 7B Instruct

91.4%

#2 Qwen2.5 VL 72B Instruct

68.0%

Android Control Low_EM

Rank #3 of 3

#1 Qwen2.5 VL 32B Instruct

93.3%

#2 Qwen2.5 VL 72B Instruct

93.7%

#3 Qwen2.5 VL 7B Instruct

91.4%

ChartQA

Rank #9 of 24

#6 Mistral Small 3.2 24B Instruct

87.4%

#7 Pixtral Large

88.1%

#8 Qwen2-VL-72B-Instruct

88.3%

#9 Qwen2.5 VL 7B Instruct

87.3%

#10 Nova Lite

86.8%

#11 DeepSeek VL2

86.0%

#12 GPT-4o

85.7%

OCRBench

Rank #3 of 7

#1 Qwen2-VL-72B-Instruct

87.7%

#2 Qwen2.5 VL 72B Instruct

88.5%

#3 Qwen2.5 VL 7B Instruct

86.4%

#4 Phi-4-multimodal-instruct

84.4%

#5 DeepSeek VL2 Small

83.4%

#6 DeepSeek VL2

81.1%
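
A rank such as "#2 of 26" is a model's position when the scores for one benchmark are sorted from highest to lowest. The page does not state how ties are broken or how the order is computed, so the following is only a minimal sketch under that assumption. It uses the five DocVQA entries listed above (the other 21 are not shown here), and `rank_of` is a hypothetical helper name.

```python
# Top-five DocVQA scores as listed on this page (percent).
docvqa_scores = {
    "Qwen2.5 VL 72B Instruct": 96.4,
    "Qwen2.5 VL 7B Instruct": 95.7,
    "Qwen2.5-Omni-7B": 95.2,
    "Claude 3.5 Sonnet": 95.2,
    "Mistral Small 3.2 24B Instruct": 94.9,
}


def rank_of(model, scores):
    """Return the 1-based position of `model` with scores sorted high to low.

    Assumption: ties keep their listed order (Python's sort is stable).
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(model) + 1


print(rank_of("Qwen2.5 VL 7B Instruct", docvqa_scores))
```

With only the listed entries as input, the result is a position within this excerpt, not necessarily within the full set of ranked models.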

All Benchmark Results for Qwen2.5 VL 7B Instruct

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
DocVQA	vision	multimodal	0.96	95.7%	Self-reported
MobileMiniWob++_SR	general	text	0.91	91.4%	Self-reported
Android Control Low_EM	general	text	0.91	91.4%	Self-reported
ChartQA	general	multimodal	0.87	87.3%	Self-reported
OCRBench	general	text	0.86	86.4%	Self-reported
TextVQA	vision	multimodal	0.85	84.9%	Self-reported
ScreenSpot	general	text	0.85	84.7%	Self-reported
MMBench	general	text	0.84	84.3%	Self-reported
InfoVQA	vision	multimodal	0.83	82.6%	Self-reported
AITZ_EM	general	text	0.82	81.9%	Self-reported

Showing 1 to 10 of 32 benchmarks
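
The ten rows listed above can be summarized with a few lines of Python. This is a rough sketch, not the site's own aggregation method (which is not documented here). It averages only the ten listed rows, which appear to be the highest-scoring ones, so the result will be well above the 64.5% average reported across all 32 benchmarks. The variable names are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, normalized score in percent), copied from the ten listed rows.
rows = [
    ("DocVQA", "vision", 95.7),
    ("MobileMiniWob++_SR", "general", 91.4),
    ("Android Control Low_EM", "general", 91.4),
    ("ChartQA", "general", 87.3),
    ("OCRBench", "general", 86.4),
    ("TextVQA", "vision", 84.9),
    ("ScreenSpot", "general", 84.7),
    ("MMBench", "general", 84.3),
    ("InfoVQA", "vision", 82.6),
    ("AITZ_EM", "general", 81.9),
]

# Mean over the listed rows only (not over all 32 benchmarks).
overall = mean(score for _, _, score in rows)

# Group scores by category, then average each group.
by_category = defaultdict(list)
for _, category, score in rows:
    by_category[category].append(score)
category_means = {cat: mean(scores) for cat, scores in by_category.items()}

print(f"mean of listed rows: {overall:.2f}%")
for cat, value in sorted(category_means.items()):
    print(f"{cat}: {value:.2f}% over {len(by_category[cat])} benchmarks")
```

Extending `rows` with the remaining 22 benchmarks would be needed to compare against the page's overall and per-category figures.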
