
Qwen2.5-Omni-7B
Multimodal
Zero-eval
#1 VocalSound
#1 GiantSteps Tempo
#1 MMBench-V1.1
+25 more
by Alibaba
About
Qwen2.5-Omni-7B is a multimodal language model developed by Alibaba. The model posts competitive results across 45 benchmarks, with standout scores on DocVQA (95.2%), VocalSound (93.9%), and GSM8k (88.7%), and its strongest category average in code tasks (76.0%). As a multimodal model, it can process and understand text, images, and other input formats. Its Apache 2.0 license permits commercial use, making it suitable for enterprise applications. Released in March 2025, it represents Alibaba's latest advancement in AI technology.
Timeline
Announced Mar 27, 2025
Released Mar 27, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Benchmark Performance Overview
Performance metrics and category breakdown
Overall Performance
45 benchmarks
Average Score
59.2%
Best Score
95.2%
High Performers (80%+)
8
Top Categories
code
76.0%
vision
69.6%
math
63.3%
general
58.7%
roleplay
17.8%
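The headline numbers in this overview (best score, high-performer count, per-category averages) can be reproduced from a list of normalized scores. A minimal sketch using the ten benchmark scores listed on this page (the full set covers 45 benchmarks, so the site-wide 59.2% average differs from the average over these ten):

```python
# Normalized scores (0-100%) for the ten benchmarks shown on this page.
scores = {
    "DocVQA": 95.2, "VocalSound": 93.9, "GSM8k": 88.7,
    "GiantSteps Tempo": 88.0, "ChartQA": 85.3, "TextVQA": 84.4,
    "AI2D": 83.2, "MMBench-V1.1": 81.8, "HumanEval": 78.7,
    "CRPErelation": 76.5,
}

best = max(scores.values())                    # best score across the list
average = sum(scores.values()) / len(scores)   # mean over these ten only
high_performers = [n for n, s in scores.items() if s >= 80.0]  # 80%+ club

print(f"best={best}, average={average:.2f}, high performers={len(high_performers)}")
```

Over these ten benchmarks the 80%+ count happens to match the page's figure of 8, while the average (85.57%) is far above the 45-benchmark average, since only the top-scoring benchmarks are shown here.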
Benchmark Performance
Top benchmark scores with normalized values (0-100%)
Ranking Across Benchmarks
Position relative to other models on each benchmark
DocVQA
Rank #3 of 26
#1 Qwen2.5 VL 72B Instruct
96.4%
#2 Qwen2.5 VL 7B Instruct
95.7%
#3 Qwen2.5-Omni-7B
95.2%
#4 Claude 3.5 Sonnet
95.2%
#5 Mistral Small 3.2 24B Instruct
94.9%
#6 Qwen2.5 VL 32B Instruct
94.8%
VocalSound
Rank #1 of 1
#1 Qwen2.5-Omni-7B
93.9%
GSM8k
Rank #29 of 46
#26 Grok-1.5
90.0%
#27 Gemma 3 4B
89.2%
#28 Claude 3 Haiku
88.9%
#29 Qwen2.5-Omni-7B
88.7%
#30 Phi-3.5-MoE-instruct
88.7%
#31 Phi 4 Mini
88.6%
#32 Jamba 1.5 Large
87.0%
GiantSteps Tempo
Rank #1 of 1
#1 Qwen2.5-Omni-7B
88.0%
ChartQA
Rank #14 of 24
#11 DeepSeek VL2
86.0%
#12 GPT-4o
85.7%
#13 Llama 3.2 90B Instruct
85.5%
#14 Qwen2.5-Omni-7B
85.3%
#15 DeepSeek VL2 Small
84.5%
#16 Llama 3.2 11B Instruct
83.4%
#17 Pixtral-12B
81.8%
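The rank positions in these leaderboards follow directly from sorting scores in descending order; ties (such as the two 95.2% DocVQA entries) keep their listing order. A sketch using the DocVQA numbers from the table above:

```python
# (model, normalized score) pairs from the DocVQA leaderboard above.
docvqa = [
    ("Qwen2.5 VL 72B Instruct", 96.4),
    ("Qwen2.5 VL 7B Instruct", 95.7),
    ("Qwen2.5-Omni-7B", 95.2),
    ("Claude 3.5 Sonnet", 95.2),
    ("Mistral Small 3.2 24B Instruct", 94.9),
    ("Qwen2.5 VL 32B Instruct", 94.8),
]

# Python's sort is stable, so models tied on score keep their input order.
ranked = sorted(docvqa, key=lambda pair: pair[1], reverse=True)
rank = 1 + next(i for i, (model, _) in enumerate(ranked)
                if model == "Qwen2.5-Omni-7B")
print(rank)  # → 3
```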
All Benchmark Results for Qwen2.5-Omni-7B
Complete list of benchmark scores with detailed information
Benchmark | Category | Modality | Raw Score | Normalized | Source
DocVQA | vision | multimodal | 0.95 | 95.2% | Self-reported
VocalSound | general | text | 0.94 | 93.9% | Self-reported
GSM8k | math | text | 0.89 | 88.7% | Self-reported
GiantSteps Tempo | general | text | 0.88 | 88.0% | Self-reported
ChartQA | general | multimodal | 0.85 | 85.3% | Self-reported
TextVQA | vision | multimodal | 0.84 | 84.4% | Self-reported
AI2D | general | text | 0.83 | 83.2% | Self-reported
MMBench-V1.1 | general | text | 0.82 | 81.8% | Self-reported
HumanEval | code | text | 0.79 | 78.7% | Self-reported
CRPErelation | general | text | 0.77 | 76.5% | Self-reported
Showing 1 to 10 of 45 benchmarks
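Because each row is pipe-delimited, the list can be turned into structured records with a few lines of Python (a sketch; the field names are assumptions based on the column contents):

```python
# Sketch: parse one pipe-delimited benchmark row into a record.
row = "DocVQA | vision | multimodal | 0.95 | 95.2% | Self-reported"

fields = [f.strip() for f in row.split("|")]
record = {
    "benchmark": fields[0],
    "category": fields[1],
    "modality": fields[2],
    "raw_score": float(fields[3]),              # 0-1 scale
    "normalized": float(fields[4].rstrip("%")),  # 0-100 scale
    "source": fields[5],
}
print(record)
```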