DeepSeek V4 Flash

DeepSeek V4 Flash is DeepSeek's April 23, 2026 efficiency-tier model in the V4 series. It pairs a hybrid attention architecture with a context window of 1.0M tokens and supports reasoning, tool use, and implicit caching.

ReasoningTool UseImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?'
})

Overview About Providers Throughput Latency Uptime Status Similar FAQ

More models by DeepSeek

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

deepseek/deepseek-v4-pro

0.5s

192tps

$1.74/M$0.43/M

$3.48/M$0.87/M

Read:$0.0/M

Write:—

—

04/23/2026

deepseek/deepseek-v3.2

164K

0.4s

82tps

$0.28/M

$0.42/M

Read:$0.03/M

Write:—

—

12/01/2025

deepseek/deepseek-v3.2-thinking

164K

1.0s

56tps

$0.26/M

$0.38/M

Read:$0.13/M

Write:—

—

12/01/2025

deepseek/deepseek-v3.1-terminus

131K

1.6s

29tps

$0.27/M

$1.00/M

Read:$0.14/M

Write:—

—

09/22/2025

deepseek/deepseek-v3.1

164K

0.9s

42tps

$0.21/M

$0.79/M

Read:$0.13/M

Write:—

—

08/21/2025

deepseek/deepseek-r1

160K

0.3s

116tps

$1.35/M

$5.40/M

Read:$0.35/M

Write:—

—

01/20/2025

Agent Stack

Core Platform

Tools

Learn

Build

Explore

DeepSeek V4 Flash

More models by DeepSeek