Qwen 3.7 Max

Qwen 3.7 Max is Alibaba's flagship agent-tuned model in the Qwen 3.7 line, with a context window of 991K tokens and an emphasis on long-horizon tool use, multi-file coding, and office workflow automation.

Implicit Caching, Reasoning, Tool Use

index.ts

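import { streamText } from 'ai'

// Start a streaming completion from Qwen 3.7 Max.
const result = streamText({
  model: 'alibaba/qwen3.7-max',
  prompt: 'Why is the sky blue?'
})

// Print text chunks as they arrive.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}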


Playground

Try out Qwen 3.7 Max by Alibaba. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
Alibaba	991K	1.9s	57tps	$1.25/M	$3.75/M	$0.25/M	$1.56/M	05/21/2026
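The listed rates turn into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name and the token-count fields are hypothetical, it assumes the rates are USD per million tokens, and it assumes cached tokens are billed at the cache rates instead of the regular input rate, which this page does not confirm.

```typescript
// Rates copied from the Alibaba provider row, in USD per million tokens.
const QWEN_37_MAX_RATES = {
  input: 1.25,
  output: 3.75,
  cacheRead: 0.25,
  cacheWrite: 1.56,
};

// Hypothetical shape for one request's token counts (not an AI Gateway type).
interface TokenUsage {
  inputTokens: number; // prompt tokens billed at the regular input rate
  outputTokens: number; // generated tokens
  cacheReadTokens?: number; // assumption: billed at the cache read rate
  cacheWriteTokens?: number; // assumption: billed at the cache write rate
}

function estimateCostUsd(usage: TokenUsage, rates = QWEN_37_MAX_RATES): number {
  const perToken = (ratePerMillion: number) => ratePerMillion / 1_000_000;
  return (
    usage.inputTokens * perToken(rates.input) +
    usage.outputTokens * perToken(rates.output) +
    (usage.cacheReadTokens ?? 0) * perToken(rates.cacheRead) +
    (usage.cacheWriteTokens ?? 0) * perToken(rates.cacheWrite)
  );
}

// A 200K-token prompt with a 4K-token answer, no caching.
const uncached = estimateCostUsd({ inputTokens: 200_000, outputTokens: 4_000 });

// The same prompt where 180K tokens are served from cache.
const cached = estimateCostUsd({
  inputTokens: 20_000,
  outputTokens: 4_000,
  cacheReadTokens: 180_000,
});

console.log(`uncached: $${uncached.toFixed(4)}, cached: $${cached.toFixed(4)}`);
```

Under these assumptions, most of the difference between the two estimates comes from the gap between the input rate and the cache read rate, so workloads that resend a large, stable prefix are where caching is likely to matter most.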

More models by Alibaba

Model	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
alibaba/qwen3.7-plus		1.3s	303tps	$0.32/M	$1.28/M	$0.08/M	$0.5/M	06/02/2026
alibaba/qwen-3.6-max-preview	240K	1.3s	84tps	$1.30/M	$7.80/M	$0.26/M	$1.63/M	04/20/2026
alibaba/qwen3.6-plus		0.2s	118tps	$0.50/M	$3.00/M	$0.1/M	$0.63/M	04/02/2026
alibaba/qwen3.5-flash		0.9s	144tps	$0.10/M	$0.40/M	$0.0/M	$0.13/M	02/24/2026
alibaba/qwen3-embedding-8b	33K			$0.05/M				06/05/2025
alibaba/qwen-3-14b	41K	0.3s	63tps	$0.12/M	$0.24/M			04/28/2025

About Qwen 3.7 Max

Qwen 3.7 Max is the Max-tier release in the Qwen 3.7 generation, succeeding Qwen3.6-Max-Preview in Alibaba's closed-weight API line. The model is served through Alibaba with a context window of 991K tokens, which suits full-repository ingestion, long agent traces, and multi-document analysis without splitting the input into segments.

Alibaba describes Qwen 3.7 Max as an agent foundation. The model targets coding agents that plan and act across many turns, office and productivity tasks that route work through multi-agent orchestration, and long-horizon autonomous execution where the model must maintain coherent reasoning across hundreds of sequential tool calls. Reported improvements over Qwen3.6-Max-Preview are concentrated in frontend prototyping and complex multi-file engineering work.

Like other Max-tier entries, Qwen 3.7 Max supports tool calling and structured outputs, with extended-thinking mode available for high-difficulty reasoning, scientific computation, and expert-level queries. The thinking budget can be tuned per request to balance depth of reasoning against latency and token spend. Qwen 3.7 Max is text-only; for vision input, the sibling Qwen3.7-Plus is the multimodal entry in the 3.7 lineup.

You can integrate Qwen 3.7 Max through AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python.
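As a rough illustration of the Chat Completions route, the sketch below builds a request in the widely used OpenAI-style Chat Completions shape without sending it. The helper name, the `baseUrl` parameter, the `/chat/completions` path, and the Bearer authorization header are assumptions for illustration; only the model slug comes from this page, so check the AI Gateway documentation for the actual endpoint and supported fields.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Hypothetical helper: returns the URL and fetch options, but performs no network call.
function buildChatRequest(baseUrl: string, apiKey: string, messages: ChatMessage[]) {
  return {
    // Strip trailing slashes so the path joins cleanly.
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`, // assumption: Bearer-token auth
      },
      body: JSON.stringify({ model: 'alibaba/qwen3.7-max', messages }),
    },
  };
}

// Placeholder URL and key; substitute the values from your own gateway setup.
const chatRequest = buildChatRequest('https://gateway.example/v1/', 'YOUR_API_KEY', [
  { role: 'user', content: 'Why is the sky blue?' },
]);

console.log(chatRequest.url);
// To send it: const response = await fetch(chatRequest.url, chatRequest.init)
```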

What To Consider When Choosing a Provider

Configuration: Agent workflows that chain hundreds of tool calls produce high output-token volume. Use the AI Gateway cost dashboard to monitor per-session spend and tune the thinking budget before running production traffic at scale.
Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
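One way to keep per-session spend bounded during a long tool-calling loop is to accumulate an estimated cost after each step and stop once a cap is reached. This is a minimal sketch, not an AI Gateway feature: `createSessionBudget`, its methods, and the per-step usage fields are hypothetical, and the default rates are the input and output prices listed on this page, assumed to be USD per million tokens.

```typescript
// Hypothetical per-step token counts, as reported by whatever client you use.
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

function createSessionBudget(capUsd: number, inputRate = 1.25, outputRate = 3.75) {
  let spentUsd = 0;
  return {
    // Adds one step's estimated cost; returns true while the session is under the cap.
    record(step: StepUsage): boolean {
      spentUsd += (step.inputTokens * inputRate + step.outputTokens * outputRate) / 1_000_000;
      return spentUsd < capUsd;
    },
    spent: () => spentUsd,
  };
}

// Simulated agent loop: every step resends a context that grows with each tool result.
const budget = createSessionBudget(0.5);
let completedSteps = 0;
let contextTokens = 20_000;
while (budget.record({ inputTokens: contextTokens, outputTokens: 1_000 })) {
  completedSteps += 1;
  contextTokens += 5_000;
}

console.log(`cap reached after ${completedSteps + 1} steps, about $${budget.spent().toFixed(2)}`);
```

Because each step resends the accumulated context, input tokens can dominate the total even when individual responses are short, which is why the simulation grows `contextTokens` on every iteration.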

When to Use Qwen 3.7 Max

Best For

Long-Horizon Coding Agents: Sustained tool-calling sessions across many turns with planning, retries, and dead-end recovery
Multi-File Software Engineering: Refactoring, diff editing, and frontend prototyping across a repository
Office Workflow Automation: Routing productivity tasks through multi-agent orchestration
Expert Reasoning Tasks: Scientific computation, mathematics, and structured analysis with extended-thinking mode
Repository Ingestion: Long-context workloads using the window of 991K tokens for full codebases and tool traces
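Before sending a whole repository, it can help to check roughly whether it fits in the window. The sketch below is a pre-flight estimate under loud assumptions: it reads "991K" as 991,000 tokens, uses a crude four-characters-per-token heuristic rather than the model's real tokenizer, and the function names, field names, and default output reserve are all hypothetical.

```typescript
const CONTEXT_WINDOW_TOKENS = 991_000; // assumption: "991K" read as 991,000 tokens
const CHARS_PER_TOKEN = 4; // crude heuristic, not the Qwen tokenizer

interface SourceFile {
  path: string;
  content: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Walks files in order, including each one that still fits and skipping those that do not.
function selectFilesThatFit(files: SourceFile[], reservedForOutput = 32_000) {
  const budget = CONTEXT_WINDOW_TOKENS - reservedForOutput;
  const included: string[] = [];
  const skipped: string[] = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.content);
    if (used + cost <= budget) {
      included.push(file.path);
      used += cost;
    } else {
      skipped.push(file.path);
    }
  }
  return { included, skipped, estimatedTokens: used };
}

// Synthetic inputs sized by character count to exercise both branches.
const plan = selectFilesThatFit([
  { path: 'src/index.ts', content: 'x'.repeat(400_000) },
  { path: 'data/fixtures.json', content: 'x'.repeat(4_000_000) },
  { path: 'README.md', content: 'x'.repeat(8_000) },
]);

console.log(plan);
```

A real pipeline would replace the heuristic with the provider's reported token counts, since character-based estimates can be far off for code and non-English text.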

Consider Alternatives When

Vision Or Multimodal Input: Qwen3.7-Plus is the multimodal entry in the 3.7 line when image inputs are needed
Latency-Sensitive Pipelines: A Plus or Flash-tier model serves users better when extended-thinking traces add unnecessary overhead
Strict Token Budgets: A smaller model is a closer fit when per-session spend on a flagship Max model isn't justified
Built-In Autonomous Search: Qwen3-Max-Thinking is a stronger fit when integrated search and code interpreter tools are required
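The guidance above can be condensed into a small routing helper. This is only a sketch of one possible policy: the `Requirements` fields and `pickQwenModel` are hypothetical, and mapping latency-sensitive work to `alibaba/qwen3.5-flash` is an assumption drawn from the model list on this page, not a recommendation from Alibaba.

```typescript
// Hypothetical description of what a workload needs.
interface Requirements {
  needsImageInput?: boolean;
  latencySensitive?: boolean;
}

function pickQwenModel(req: Requirements): string {
  // Qwen 3.7 Max is text-only, so image input goes to the multimodal Plus model.
  if (req.needsImageInput) return 'alibaba/qwen3.7-plus';
  // Assumption: a Flash-tier model is preferred when latency matters most.
  if (req.latencySensitive) return 'alibaba/qwen3.5-flash';
  // Default to the flagship for long-horizon agent and coding work.
  return 'alibaba/qwen3.7-max';
}

console.log(pickQwenModel({ needsImageInput: true }));
console.log(pickQwenModel({}));
```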

Conclusion

Qwen 3.7 Max extends the Qwen Max tier with an agent-first design that targets long-horizon tool use, multi-file coding, and office workflow automation. Routing through AI Gateway gives you a single integration surface, provider failover, and consolidated billing while you build against the latest generation in the Max line.
