DeepSeek V4 Flash

DeepSeek V4 Flash is the efficiency-tier model in DeepSeek's V4 series, released April 23, 2026. It pairs a hybrid attention architecture with a 1M-token context window and supports reasoning, tool use, and implicit caching.

Reasoning, Tool Use, Implicit Caching
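index.ts
import { streamText } from 'ai'

// Start a streaming generation against DeepSeek V4 Flash.
const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}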

More models by DeepSeek

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search | Per Query | Capabilities | Providers | ZDR | No Training | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 1M | 0.5s | 192 tps | $1.74/M $0.43/M | $3.48/M $0.87/M | $0.0/M | | | | +1 | azure, baseten, deepinfra, +4 | | | 04/23/2026 |
| | 164K | 0.4s | 82 tps | $0.28/M | $0.42/M | $0.03/M | | | | | bedrock, deepinfra, deepseek, +1 | | | 12/01/2025 |
| | 164K | 1.0s | 56 tps | $0.26/M | $0.38/M | $0.13/M | | | | +1 | bedrock, deepinfra, fireworks, +1 | | | 12/01/2025 |
| | 131K | 1.6s | 29 tps | $0.27/M | $1.00/M | $0.14/M | | | | +1 | novita | | | 09/22/2025 |
| | 164K | 0.9s | 42 tps | $0.21/M | $0.79/M | $0.13/M | | | | +1 | deepinfra, novita, sambanova, +1 | | | 08/21/2025 |
| | 160K | 0.3s | 116 tps | $1.35/M | $5.40/M | $0.35/M | | | | +1 | bedrock, deepinfra | | | 01/20/2025 |
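
The prices listed above are quoted per million tokens, so a per-request cost can be estimated with simple arithmetic. The sketch below is illustrative only: `estimateCostUSD`, `Pricing`, and `Usage` are hypothetical names, not part of the AI SDK. It assumes that the first price in each pair is the input rate and the second the output rate, and that cached input tokens are billed at the cache-read rate. Actual billing rules may differ by provider.

```typescript
// Prices in USD per one million tokens.
interface Pricing {
  inputPerM: number
  outputPerM: number
  cacheReadPerM?: number // set only when a cache-read price is listed
}

interface Usage {
  inputTokens: number        // total prompt tokens, cached ones included
  outputTokens: number
  cachedInputTokens?: number // portion of inputTokens served from cache
}

// Hypothetical helper, not an AI SDK function.
function estimateCostUSD(pricing: Pricing, usage: Usage): number {
  // Never count more cached tokens than there are input tokens.
  const cached = Math.min(usage.cachedInputTokens ?? 0, usage.inputTokens)
  const uncached = usage.inputTokens - cached
  // Assumption: cached tokens bill at the cache-read rate when one is
  // listed, otherwise at the normal input rate.
  const cacheRate = pricing.cacheReadPerM ?? pricing.inputPerM
  const total =
    uncached * pricing.inputPerM +
    cached * cacheRate +
    usage.outputTokens * pricing.outputPerM
  return total / 1_000_000
}

// Example values taken from the row listing $0.28/M, $0.42/M, Read: $0.03/M.
const row: Pricing = { inputPerM: 0.28, outputPerM: 0.42, cacheReadPerM: 0.03 }
const cost = estimateCostUSD(row, {
  inputTokens: 1_000_000,
  outputTokens: 500_000,
  cachedInputTokens: 400_000
})
console.log(cost.toFixed(2)) // prints "0.39"
```

Keeping the cache-read rate optional lets the same helper cover rows where no cache price is shown; in that case every input token is charged at the input rate.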