Qwen2.5 VL 72B Instruct

Name: Qwen2.5 VL 72B Instruct
Average score: 66.9% (30 benchmarks)
Author: Alibaba

Multimodal

Zero-eval

#1 DocVQA

#1 Android Control Low_EM

#1 OCRBench

About

Qwen2.5 VL 72B Instruct is a multimodal language model developed by Alibaba. It averages 66.9% across 30 benchmarks, with its highest scores on DocVQA (96.4%), Android Control Low_EM (93.7%), and ChartQA (89.5%). As a multimodal model, it accepts text, images, and other input formats. It was released on January 26, 2025.

Timeline

Announced: Jan 26, 2025
Released: Jan 26, 2025

Specifications

Capabilities: Multimodal

License & Family

License: tongyi-qianwen

Benchmark Performance Overview

Performance metrics and category breakdown

Overall Performance

30 benchmarks

Average Score: 66.9%
Best Score: 96.4%

High Performers (80%+)

Top Categories

general: 69.6%
vision: 58.6%
math: 56.5%

Benchmark Performance

Top benchmark scores with normalized values (0-100%)

Ranking Across Benchmarks

Position relative to other models on each benchmark

DocVQA

Rank #1 of 26

#1 Qwen2.5 VL 72B Instruct: 96.4%
#2 Qwen2.5 VL 7B Instruct: 95.7%
#3 Qwen2.5-Omni-7B: 95.2%
#4 Claude 3.5 Sonnet: 95.2%

Android Control Low_EM

Rank #1 of 3

#1 Qwen2.5 VL 72B Instruct: 93.7%
#2 Qwen2.5 VL 32B Instruct: 93.3%
#3 Qwen2.5 VL 7B Instruct: 91.4%

ChartQA

Rank #3 of 24

#1 Llama 4 Maverick: 90.0%
#2 Claude 3.5 Sonnet: 90.8%
#3 Qwen2.5 VL 72B Instruct: 89.5%
#4 Nova Pro: 89.2%
#5 Llama 4 Scout: 88.8%
#6 Qwen2-VL-72B-Instruct: 88.3%

OCRBench

Rank #1 of 7

#1 Qwen2.5 VL 72B Instruct: 88.5%
#2 Qwen2-VL-72B-Instruct: 87.7%
#3 Qwen2.5 VL 7B Instruct: 86.4%
#4 Phi-4-multimodal-instruct: 84.4%

AI2D

Rank #7 of 17

#4 Llama 3.2 11B Instruct: 91.1%
#5 Llama 3.2 90B Instruct: 92.3%
#6 Mistral Small 3.2 24B Instruct: 92.9%
#7 Qwen2.5 VL 72B Instruct: 88.4%
#8 Grok-1.5V: 88.3%
#9 Gemma 3 27B: 84.5%
#10 Gemma 3 12B: 84.2%
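
The ranking lists above pair a displayed rank with a score, so one quick sanity check is whether scores fall (or stay equal) as the rank number rises. The following is a minimal sketch of that check in Python. The names `RANKINGS`, `out_of_order`, and `check_all` are hypothetical helpers introduced here for illustration; the data is copied from the lists above, and the check assumes a higher score should correspond to a better rank.

```python
from typing import Dict, List, Tuple

# (displayed rank, model name, score in percent), copied from the lists above.
Entry = Tuple[int, str, float]

RANKINGS: Dict[str, List[Entry]] = {
    "DocVQA": [
        (1, "Qwen2.5 VL 72B Instruct", 96.4),
        (2, "Qwen2.5 VL 7B Instruct", 95.7),
        (3, "Qwen2.5-Omni-7B", 95.2),
        (4, "Claude 3.5 Sonnet", 95.2),
    ],
    "Android Control Low_EM": [
        (1, "Qwen2.5 VL 72B Instruct", 93.7),
        (2, "Qwen2.5 VL 32B Instruct", 93.3),
        (3, "Qwen2.5 VL 7B Instruct", 91.4),
    ],
    "ChartQA": [
        (1, "Llama 4 Maverick", 90.0),
        (2, "Claude 3.5 Sonnet", 90.8),
        (3, "Qwen2.5 VL 72B Instruct", 89.5),
        (4, "Nova Pro", 89.2),
        (5, "Llama 4 Scout", 88.8),
        (6, "Qwen2-VL-72B-Instruct", 88.3),
    ],
    "OCRBench": [
        (1, "Qwen2.5 VL 72B Instruct", 88.5),
        (2, "Qwen2-VL-72B-Instruct", 87.7),
        (3, "Qwen2.5 VL 7B Instruct", 86.4),
        (4, "Phi-4-multimodal-instruct", 84.4),
    ],
    "AI2D": [
        (4, "Llama 3.2 11B Instruct", 91.1),
        (5, "Llama 3.2 90B Instruct", 92.3),
        (6, "Mistral Small 3.2 24B Instruct", 92.9),
        (7, "Qwen2.5 VL 72B Instruct", 88.4),
        (8, "Grok-1.5V", 88.3),
        (9, "Gemma 3 27B", 84.5),
        (10, "Gemma 3 12B", 84.2),
    ],
}


def out_of_order(entries: List[Entry]) -> List[Tuple[Entry, Entry]]:
    """Return adjacent pairs where the better-ranked entry has the lower score."""
    ordered = sorted(entries, key=lambda e: e[0])
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if a[2] < b[2]]


def check_all(rankings: Dict[str, List[Entry]]) -> Dict[str, List[Tuple[Entry, Entry]]]:
    """Run the ordering check on every benchmark's displayed list."""
    return {name: out_of_order(entries) for name, entries in rankings.items()}


if __name__ == "__main__":
    for benchmark, pairs in check_all(RANKINGS).items():
        for better, worse in pairs:
            print(f"{benchmark}: #{better[0]} {better[1]} ({better[2]}%) "
                  f"is listed above #{worse[0]} {worse[1]} ({worse[2]}%)")
```

On the lists as displayed, this check flags one pair in ChartQA (#1 at 90.0% listed above #2 at 90.8%) and two adjacent pairs in AI2D (#4 to #5 and #5 to #6). The page does not explain these, so the displayed ranks and scores are best read together rather than either one alone.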

All Benchmark Results for Qwen2.5 VL 72B Instruct

Complete list of benchmark scores with detailed information


Benchmark	Category	Modality	Score	Normalized	Source
DocVQA	vision	multimodal	0.96	96.4%	Self-reported
Android Control Low_EM	general	text	0.94	93.7%	Self-reported
ChartQA	general	multimodal	0.90	89.5%	Self-reported
OCRBench	general	text	0.89	88.5%	Self-reported
AI2D	general	text	0.88	88.4%	Self-reported
MMBench	general	text	0.88	88.0%	Self-reported
ScreenSpot	general	text	0.87	87.1%	Self-reported
AITZ_EM	general	text	0.83	83.2%	Self-reported
CC-OCR	general	text	0.80	79.8%	Self-reported
EgoSchema	general	text	0.76	76.2%	Self-reported

Showing 1 to 10 of 30 benchmarks
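
Only ten of the 30 benchmark results are listed, so the 66.9% overall average cannot be reproduced from this page alone. The sketch below shows what the visible rows do imply. It assumes the site's average is an unweighted mean of the normalized scores (the page does not say how it is computed), and `SHOWN` and `summarize` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# (benchmark, category, normalized score in percent) for the ten rows shown.
SHOWN: List[Tuple[str, str, float]] = [
    ("DocVQA", "vision", 96.4),
    ("Android Control Low_EM", "general", 93.7),
    ("ChartQA", "general", 89.5),
    ("OCRBench", "general", 88.5),
    ("AI2D", "general", 88.4),
    ("MMBench", "general", 88.0),
    ("ScreenSpot", "general", 87.1),
    ("AITZ_EM", "general", 83.2),
    ("CC-OCR", "general", 79.8),
    ("EgoSchema", "general", 76.2),
]

OVERALL_AVERAGE = 66.9   # reported average across all benchmarks
TOTAL_BENCHMARKS = 30    # reported benchmark count


def summarize(rows: List[Tuple[str, str, float]],
              overall_avg: float,
              total: int) -> Dict[str, float]:
    """Summarize the visible rows and infer the mean of the hidden ones.

    Assumes the overall average is an unweighted mean over all benchmarks.
    """
    scores = [score for _, _, score in rows]
    shown_sum = sum(scores)
    remaining = total - len(rows)
    return {
        "shown_mean": shown_sum / len(rows),
        "shown_at_or_above_80": sum(1 for s in scores if s >= 80.0),
        "remaining_count": remaining,
        # Total of all scores minus the visible ones, spread over the rest.
        "implied_remaining_mean": (overall_avg * total - shown_sum) / remaining,
    }


if __name__ == "__main__":
    for key, value in summarize(SHOWN, OVERALL_AVERAGE, TOTAL_BENCHMARKS).items():
        print(f"{key}: {value:.2f}")
```

Under that assumption, the ten visible rows average about 87.1%, eight of them are at or above 80%, and the 20 benchmarks not shown would need to average roughly 56.8% to produce the reported 66.9%. Because the reported average is rounded to one decimal place, the implied figure is approximate.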
