
GLM-4.5-Air

Zero-eval rankings: #1 TAU-bench Airline, #2 MATH-500, #2 BFCL-v3, plus 3 more

by Zhipu AI

About

GLM-4.5-Air is a language model developed by Zhipu AI. It achieves strong performance, averaging 60.8% across 14 benchmarks, and excels particularly in MATH-500 (98.1%), AIME 2024 (89.4%), and MMLU-Pro (81.4%). Its MIT license permits commercial use, making it suitable for enterprise applications. Released in 2025, it represents Zhipu AI's latest advancement in AI technology.

Timeline
Announced: Jul 28, 2025
Released: Jul 28, 2025
Specifications

License & Family
License: MIT
Benchmark Performance Overview
Performance metrics and category breakdown

Overall Performance (14 benchmarks)

Average Score: 60.8%
Best Score: 98.1%
High Performers (80%+): 3

Top Categories

math: 98.1%
general: 76.0%
agents: 53.3%
code: 46.0%
reasoning: 37.7%

All Benchmark Results for GLM-4.5-Air
Complete list of benchmark scores with detailed information
Benchmark            Category   Modality  Score   Source
MATH-500             math       text      98.1%   Self-reported
AIME 2024            general    text      89.4%   Self-reported
MMLU-Pro             general    text      81.4%   Self-reported
TAU-bench Retail     agents     text      77.9%   Self-reported
BFCL-v3              general    text      76.4%   Self-reported
GPQA                 general    text      75.0%   Self-reported
LiveCodeBench        code       text      70.7%   Self-reported
AA-Index             reasoning  text      64.8%   Self-reported
TAU-bench Airline    agents     text      60.8%   Self-reported
SWE-Bench Verified   general    text      57.6%   Self-reported
Showing 10 of 14 benchmarks.
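The summary statistics above can be reproduced from the listed scores. A minimal sketch, assuming the ten scores shown here; note that only 10 of the 14 benchmarks appear on this page, so the mean of the visible scores will not match the reported 60.8% overall average, which covers all 14:

```python
# Summary statistics over the ten benchmark scores listed on this page.
scores = {
    "MATH-500": 98.1,
    "AIME 2024": 89.4,
    "MMLU-Pro": 81.4,
    "TAU-bench Retail": 77.9,
    "BFCL-v3": 76.4,
    "GPQA": 75.0,
    "LiveCodeBench": 70.7,
    "AA-Index": 64.8,
    "TAU-bench Airline": 60.8,
    "SWE-Bench Verified": 57.6,
}

best = max(scores.values())
high_performers = [name for name, s in scores.items() if s >= 80.0]
visible_mean = sum(scores.values()) / len(scores)

print(f"Best score: {best}%")                        # 98.1%
print(f"High performers (80%+): {len(high_performers)}")  # 3
print(f"Mean of visible scores: {visible_mean:.1f}%")
```

The best score (98.1%) and the count of 80%+ benchmarks (3) match the overview; the mean of the ten visible scores is higher than 60.8% because the four unlisted benchmarks pull the full-set average down.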