GLM 4.7 FlashX

GLM 4.7 FlashX is the ultra-fast inference variant in Z.ai's GLM-4.7 generation, released January 19, 2026. Designed for the lowest latency workloads, it provides the fastest response times in the GLM-4.7 family while retaining core coding and reasoning capabilities.

ReasoningTool UseImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.7-flashx',
  prompt: 'Why is the sky blue?'
})

Overview About Providers Throughput Latency Uptime Status Similar FAQ

More models by Z.ai

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

zai/glm-5.2-fast

2.2s

184tps

$3.00/M

$10.25/M

Read:$0.5/M

Write:—

—

06/23/2026

zai/glm-5.2

0.8s

200tps

$1.40/M

$4.40/M

Read:$0.26/M

Write:—

—

06/16/2026

zai/glm-5.1

205K

0.5s

174tps

$1.30/M

$4.30/M

Read:$0.26/M

Write:—

—

04/07/2026

zai/glm-5-turbo

203K

4.3s

45tps

$1.20/M

$4.00/M

Read:$0.24/M

Write:—

—

03/15/2026

zai/glm-5

203K

0.5s

81tps

$0.80/M

$2.56/M

Read:$0.16/M

Write:—

—

02/12/2026

zai/glm-4.7

205K

0.1s

525tps

$2.25/M

$2.75/M

Read:$2.25/M

Write:—

—

12/22/2025

Agent Stack

Core Platform

Tools

Learn

Build

Explore

GLM 4.7 FlashX

More models by Z.ai