MiMo V2.5 Pro
MiMo V2.5 Pro is the Pro tier of Xiaomi's MiMo v2.5 family, a Mixture-of-Experts (MoE) reasoning model built for agentic workflows, software engineering, and long-horizon tasks. It supports a context window of 1.1M tokens and up to 1.0M output tokens.
import { streamText } from 'ai'

// Stream a response from MiMo V2.5 Pro through AI Gateway.
const result = streamText({
  model: 'xiaomi/mimo-v2.5-pro',
  prompt: 'Why is the sky blue?',
})

Playground
Try out MiMo V2.5 Pro by Xiaomi. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About MiMo V2.5 Pro
MiMo V2.5 Pro is the Pro variant in Xiaomi's MiMo v2.5 family, released April 22, 2026 under the MIT license. Compared to the standard tier, Pro activates a larger share of a larger parameter pool per token, which raises reasoning depth at higher per-token cost.
Like the rest of the line, MiMo V2.5 Pro uses a Mixture-of-Experts (MoE) stack with hybrid attention. Sliding-window and full attention combine in a fixed ratio, which cuts KV-cache storage versus dense attention at the same sequence length. Three multi-token prediction (MTP) blocks raise output tokens per inference step. The full window of 1.1M tokens fits long agent traces, repos, or document sets.
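The KV-cache saving from hybrid attention can be illustrated with a little arithmetic. The sketch below is only an illustration: the layer count, the one-in-six full-attention ratio, and the 128-token window are hypothetical placeholders, not published MiMo V2.5 Pro values, and it assumes every layer stores the same number of bytes per cached position.

```typescript
// Hypothetical configuration: none of these numbers come from Xiaomi's model card.
interface HybridAttentionConfig {
  totalLayers: number;        // attention layers in the stack
  fullAttentionEvery: number; // one full-attention layer per this many layers
  windowTokens: number;       // sliding-window size in tokens
}

// Token positions whose keys/values must stay cached, summed over all layers.
function cachedPositions(cfg: HybridAttentionConfig, seqLen: number): number {
  const fullLayers = Math.floor(cfg.totalLayers / cfg.fullAttentionEvery);
  const slidingLayers = cfg.totalLayers - fullLayers;
  // Full-attention layers keep every position; sliding-window layers keep at most one window.
  return fullLayers * seqLen + slidingLayers * Math.min(cfg.windowTokens, seqLen);
}

// Hybrid cache size as a fraction of a stack that uses full attention in every layer.
function hybridCacheFraction(cfg: HybridAttentionConfig, seqLen: number): number {
  return cachedPositions(cfg, seqLen) / (cfg.totalLayers * seqLen);
}

const example: HybridAttentionConfig = { totalLayers: 48, fullAttentionEvery: 6, windowTokens: 128 };
const fraction = hybridCacheFraction(example, 1_000_000);
console.log(fraction.toFixed(4)); // prints "0.1668"
```

Under these made-up settings, a 1M-token sequence needs roughly one sixth of the cache that an all-full-attention stack would. The saving disappears for sequences shorter than the window, where both layer types cache every position.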
MiMo V2.5 Pro supports reasoning, tool calling, file input, vision, and implicit prompt caching. Call it through the xiaomi or deepinfra provider via AI Gateway. For lower-cost everyday work, see mimo-v2.5.
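A provider preference between the two slugs named above might be expressed as in the sketch below. It builds a plain request object and makes no network call. The `providerOptions.gateway.order` shape is an assumption about how AI Gateway reads provider preference, and `buildRequest` is a hypothetical helper, so check the gateway docs before relying on either.

```typescript
// Provider slugs named on this page.
type ProviderSlug = 'xiaomi' | 'deepinfra';

interface GatewayRequest {
  model: string;
  prompt: string;
  // Assumed option shape: providers listed in the order they should be tried.
  providerOptions: { gateway: { order: ProviderSlug[] } };
}

// Hypothetical helper: build request options with a provider preference order.
function buildRequest(prompt: string, order: ProviderSlug[]): GatewayRequest {
  if (order.length === 0) {
    throw new Error('order must list at least one provider');
  }
  // Drop duplicate slugs while keeping the first occurrence of each.
  const unique = order.filter((slug, index) => order.indexOf(slug) === index);
  return {
    model: 'xiaomi/mimo-v2.5-pro',
    prompt,
    providerOptions: { gateway: { order: unique } },
  };
}

const request = buildRequest('Why is the sky blue?', ['deepinfra', 'xiaomi']);
// With the AI SDK installed, this object could be passed on as streamText(request).
console.log(JSON.stringify(request.providerOptions));
```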
What To Consider When Choosing a Provider
- Configuration: MiMo V2.5 Pro sits at the Pro end of MiMo v2.5. Per-token cost is higher than mimo-v2.5, but accuracy on hard math, agentic, and engineering tasks is the reason to pick it. Use AI Gateway's routing and fallback to send easy work to mimo-v2.5 and reserve MiMo V2.5 Pro for the requests where reasoning depth pays off.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
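The split described under Configuration, easy work on mimo-v2.5 and hard work on the Pro tier, could be done client-side with a small lookup like the one below. This is a sketch, not a gateway feature: the task categories are hypothetical, and the `xiaomi/mimo-v2.5` slug is assumed by analogy with `xiaomi/mimo-v2.5-pro`.

```typescript
// Hypothetical task categories for a mixed workload.
type TaskKind =
  | 'chat'
  | 'extraction'
  | 'single-file-edit'
  | 'multi-file-refactor'
  | 'math-proof'
  | 'long-horizon-agent';

const PRO_MODEL = 'xiaomi/mimo-v2.5-pro';
const STANDARD_MODEL = 'xiaomi/mimo-v2.5'; // assumed slug, not confirmed on this page

// Everyday work goes to the standard tier; reasoning-heavy work goes to Pro.
const MODEL_BY_TASK: Record<TaskKind, string> = {
  'chat': STANDARD_MODEL,
  'extraction': STANDARD_MODEL,
  'single-file-edit': STANDARD_MODEL,
  'multi-file-refactor': PRO_MODEL,
  'math-proof': PRO_MODEL,
  'long-horizon-agent': PRO_MODEL,
};

function pickModel(kind: TaskKind): string {
  return MODEL_BY_TASK[kind];
}

console.log(pickModel('math-proof')); // prints "xiaomi/mimo-v2.5-pro"
console.log(pickModel('extraction')); // prints "xiaomi/mimo-v2.5"
```

A static table keeps the policy easy to audit. A real deployment would likely tune the categories against its own cost and quality measurements.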
When to Use MiMo V2.5 Pro
Best For
- Long-Horizon Agents: Trajectories that span thousands of tool calls in a single run
- Complex Software Engineering: Issue resolution, repo-level edits, and multi-file refactors
- Math and Proofs: Long logical chains where intermediate reasoning steps matter
- Long-Context Reasoning: Documents and codebases approaching 1.1M tokens
- Pro-Tier MoE: Higher active-parameter compute for harder reasoning workloads
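For the long-context case, it can help to check how much output budget remains once a large prompt is loaded. The sketch below uses the rounded 1.1M and 1.0M figures from this page and assumes that prompt and output tokens count against the same context window, which this page does not state explicitly. `maxOutputBudget` is a hypothetical helper.

```typescript
const CONTEXT_WINDOW_TOKENS = 1_100_000; // rounded figure from this page
const MAX_OUTPUT_TOKENS = 1_000_000;     // rounded figure from this page

// Largest output budget a request can ask for, given its prompt size in tokens.
// Assumption: prompt and output share one context window.
function maxOutputBudget(promptTokens: number): number {
  if (promptTokens < 0) {
    throw new RangeError('promptTokens must be non-negative');
  }
  const remaining = CONTEXT_WINDOW_TOKENS - promptTokens;
  // Clamp to the output cap, and to zero when the prompt alone overflows the window.
  return Math.max(0, Math.min(remaining, MAX_OUTPUT_TOKENS));
}

console.log(maxOutputBudget(900_000)); // prints 200000
console.log(maxOutputBudget(50_000));  // prints 1000000
```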
Consider Alternatives When
- Throughput-Sensitive Workloads: mimo-v2.5 runs cheaper per token for everyday agent or code work
- Short Prompt-and-Reply Calls: A smaller model is enough when you don't need deep reasoning
- Speed-First Tasks: mimo-v2-flash from the previous generation is throughput-tuned
- Simple Extraction Jobs: A lightweight model handles classification at lower cost
Conclusion
MiMo V2.5 Pro is the Pro pick in Xiaomi's MiMo v2.5 lineup. Use it for long-horizon agents, complex software engineering, and math-heavy reasoning. Pair it with mimo-v2.5 through AI Gateway routing so you can balance cost and quality across a mixed workload.