MiMo M2.5

MiMo M2.5 is the mid-tier model in Xiaomi's MiMo v2.5 family, a Mixture-of-Experts (MoE) stack with reasoning, tool use, and multimodal input. It supports a context window of 1.1M tokens and up to 131.1K output tokens.

Reasoning, Tool Use, Implicit Caching, File Input, Vision (Image)

index.ts

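import { streamText } from 'ai'

// Start a streaming completion routed through AI Gateway
const result = streamText({
  model: 'xiaomi/mimo-v2.5',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}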

Playground

Try out MiMo M2.5 by Xiaomi. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Context	Latency	Throughput	Input	Output	Cache	Web Search	Release Date
Xiaomi	1.1M	2.4s	112tps	$0.14/M	$0.28/M	Read: $0.0/M, Write: —	—	04/22/2026
DeepInfra	262K	0.6s	17tps	$0.40/M	$2.00/M	Read: $0.08/M, Write: —	—	04/22/2026
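
The per-million-token prices above turn into a per-request cost as tokens / 1,000,000 times the rate, summed over input and output. The sketch below is a minimal illustration of that arithmetic. The prices are copied from the provider table; the helper name `estimateCostUsd` and the example request size are assumptions made for illustration, and cache-read discounts are ignored.

```typescript
// Prices in USD per million tokens, copied from the provider table.
interface ProviderPricing {
  inputPerM: number
  outputPerM: number
}

const pricing: Record<string, ProviderPricing> = {
  xiaomi: { inputPerM: 0.14, outputPerM: 0.28 },
  deepinfra: { inputPerM: 0.4, outputPerM: 2.0 },
}

// Hypothetical helper: cost of one request, ignoring cache discounts.
function estimateCostUsd(provider: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[provider]
  if (!p) throw new Error(`Unknown provider: ${provider}`)
  return (inputTokens / 1_000_000) * p.inputPerM + (outputTokens / 1_000_000) * p.outputPerM
}

// Example request size (made up): 200K input tokens, 4K output tokens.
for (const provider of Object.keys(pricing)) {
  console.log(provider, estimateCostUsd(provider, 200_000, 4_000).toFixed(4))
}
```

Output tokens dominate the gap between the two providers, so the difference grows with longer completions.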

More models by Xiaomi

Model	Context	Latency	Throughput	Input	Output	Cache	Web Search	Release Date
xiaomi/mimo-v2.5-pro	1.1M	0.4s	57tps	$0.43/M	$0.87/M	Read: $0.0/M, Write: —	—	04/22/2026
xiaomi/mimo-v2-pro		1.9s	59tps	$1.00/M	$3.00/M	Read: $0.2/M, Write: —		03/18/2026
xiaomi/mimo-v2-flash	262K	1.6s	107tps	$0.10/M	$0.30/M	Read: $0.01/M, Write: —	—	12/16/2025

About MiMo M2.5

MiMo M2.5 is a MoE language model from Xiaomi, released April 22, 2026 under the MIT license. Each forward pass activates only a subset of the total parameters, which keeps per-token compute lower than in a dense model of the same total size.
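
To make "activates a subset of parameters" concrete, here is a minimal sketch of top-k expert routing, a common MoE gating scheme. This page does not describe MiMo M2.5's actual router, so the expert count, the value of k, and the router scores below are made-up values, and `routeTopK` is a hypothetical helper.

```typescript
// Illustrative top-k expert routing. Expert count, k, and router scores
// are made-up values, not MiMo M2.5's actual configuration.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores)
  const exps = scores.map((s) => Math.exp(s - max))
  const sum = exps.reduce((a, b) => a + b, 0)
  return exps.map((e) => e / sum)
}

// Pick the k highest-scoring experts and renormalize their weights to sum to 1.
function routeTopK(routerScores: number[], k: number): { expert: number; weight: number }[] {
  const probs = softmax(routerScores)
  const best = probs
    .map((p, expert) => ({ expert, p }))
    .sort((a, b) => b.p - a.p)
    .slice(0, k)
  const total = best.reduce((acc, e) => acc + e.p, 0)
  return best.map((e) => ({ expert: e.expert, weight: e.p / total }))
}

// One token routed over 8 hypothetical experts with 2 active.
const routed = routeTopK([0.1, 2.3, -1.0, 0.7, 1.9, -0.4, 0.0, 0.5], 2)
console.log(routed)
```

Only the selected experts run for that token, so compute scales with k rather than with the total number of experts.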

The architecture uses hybrid attention, interleaving sliding-window and full attention to cut KV-cache storage at long sequence lengths. A multi-token prediction (MTP) block raises output tokens per step during inference. The full window of 1.1M tokens lets MiMo M2.5 reason over large documents, repos, or long agent trajectories.
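
The KV-cache saving from interleaving can be estimated with simple arithmetic: a full-attention layer caches keys and values for every token, while a sliding-window layer caches at most the window size. The sketch below assumes an entirely hypothetical configuration (layer count, interleave ratio, window size, head shape, and precision are not taken from this page) and only shows the shape of the calculation.

```typescript
// All numbers here are illustrative assumptions, not MiMo M2.5's real config.
interface AttnConfig {
  layers: number        // total transformer layers
  fullEvery: number     // every Nth layer uses full attention
  window: number        // sliding-window size in tokens
  kvHeads: number       // number of key/value heads
  headDim: number       // dimension per head
  bytesPerValue: number // e.g. 2 for 16-bit values
}

function kvCacheBytes(cfg: AttnConfig, seqLen: number): number {
  // Factor 2 covers one key and one value vector per token per layer.
  const perTokenPerLayer = 2 * cfg.kvHeads * cfg.headDim * cfg.bytesPerValue
  let total = 0
  for (let layer = 0; layer < cfg.layers; layer++) {
    const isFull = (layer + 1) % cfg.fullEvery === 0
    const cachedTokens = isFull ? seqLen : Math.min(seqLen, cfg.window)
    total += cachedTokens * perTokenPerLayer
  }
  return total
}

const hybrid: AttnConfig = { layers: 48, fullEvery: 6, window: 4096, kvHeads: 8, headDim: 128, bytesPerValue: 2 }
const allFull: AttnConfig = { ...hybrid, fullEvery: 1 }

const seqLen = 1_000_000
const ratio = kvCacheBytes(hybrid, seqLen) / kvCacheBytes(allFull, seqLen)
console.log(`hybrid cache is ${(ratio * 100).toFixed(1)}% of all-full cache`)
```

At long sequence lengths the sliding-window layers contribute almost nothing, so the cache size is governed by how many layers keep full attention.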

MiMo M2.5 supports reasoning, tool calling, file input, vision, and implicit prompt caching. Call it through the xiaomi or deepinfra provider via AI Gateway. For the higher-capability tier, see mimo-v2.5-pro.

What To Consider When Choosing a Provider

Configuration: MiMo M2.5 balances cost, capability, and context length. The MoE design keeps active compute small, but routing and serving a 300B-class MoE still requires capable infrastructure on the provider side. Use AI Gateway's cost tracking and model fallback to mix MiMo M2.5 with mimo-v2.5-pro on harder workloads.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
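
Once you have picked a provider, the preference can be passed along with the request. The sketch below builds that options object. It assumes AI Gateway reads a preferred provider order from `providerOptions.gateway.order`; confirm the field name against the current docs. `gatewayOrder` is a hypothetical helper, and the slugs are the two providers named on this page.

```typescript
// Provider slugs named on this page.
type ProviderSlug = 'xiaomi' | 'deepinfra'

// Hypothetical helper: builds the providerOptions object for a request.
// Assumes AI Gateway reads the preferred order from
// providerOptions.gateway.order; confirm the field name in the docs.
function gatewayOrder(preferred: ProviderSlug[]): { gateway: { order: ProviderSlug[] } } {
  // Drop duplicates while keeping the first occurrence of each slug.
  const unique = preferred.filter((slug, i) => preferred.indexOf(slug) === i)
  if (unique.length === 0) throw new Error('List at least one provider')
  return { gateway: { order: unique } }
}

const providerOptions = gatewayOrder(['xiaomi', 'deepinfra'])
console.log(JSON.stringify(providerOptions))
```

The resulting object would be passed alongside the model and prompt, for example `streamText({ model: 'xiaomi/mimo-v2.5', prompt, providerOptions })`.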

When to Use MiMo M2.5

Best For

Agentic Workflows: Tool-using agents that string together many calls in one session
Software Engineering: Code generation, refactors, and repo-scale analysis with a window of 1.1M tokens
Multimodal Input: Reasoning over mixed text, images, and uploaded files
Long Context: Documents, codebases, or chat histories that approach 1.1M tokens
Cost-Aware Reasoning: Lower active-parameter compute than dense models at similar scores
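
For long-context work, note that the two providers list different context windows (1.1M and 262K). A minimal sketch of a pre-flight check follows. The limits are the rounded figures shown on this page, `providersThatFit` is a hypothetical helper, and treating reserved output tokens as part of the window is an assumption.

```typescript
// Rounded context limits from the provider table; exact limits may differ.
const contextWindow: Record<string, number> = {
  xiaomi: 1_100_000,
  deepinfra: 262_000,
}

// Hypothetical helper: which providers can hold the prompt plus the
// output budget, assuming output tokens count against the window.
function providersThatFit(promptTokens: number, reservedOutputTokens: number): string[] {
  const needed = promptTokens + reservedOutputTokens
  return Object.keys(contextWindow).filter((slug) => contextWindow[slug] >= needed)
}

console.log(providersThatFit(150_000, 8_000))
console.log(providersThatFit(600_000, 8_000))
```

A request that only one provider can hold has no same-model fallback, which is worth knowing before relying on multi-provider routing.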

Consider Alternatives When

Maximum Reasoning Depth: mimo-v2.5-pro activates more parameters per step on the hardest math and engineering tasks
Speed-First Throughput: mimo-v2-flash is the throughput-tuned option in the previous generation
Simple Classification: A smaller, cheaper model handles short extraction at lower cost
Strict Text Pipelines: A text-only model is fine when your inputs are never images or files
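
The trade-offs above can be sketched as a small selection function. The model slugs come from this page; the `Workload` flags and the order in which they are checked are illustrative assumptions, not an AI Gateway feature.

```typescript
// Hypothetical workload description used only for this sketch.
interface Workload {
  needsMaxReasoning: boolean // hardest math or engineering tasks
  throughputFirst: boolean   // speed matters more than depth
}

// Maps a workload to one of the Xiaomi model slugs listed on this page.
function pickModel(w: Workload): string {
  if (w.needsMaxReasoning) return 'xiaomi/mimo-v2.5-pro'
  if (w.throughputFirst) return 'xiaomi/mimo-v2-flash'
  return 'xiaomi/mimo-v2.5'
}

console.log(pickModel({ needsMaxReasoning: false, throughputFirst: false }))
```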

Conclusion

MiMo M2.5 is the standard tier of Xiaomi's MiMo v2.5 family. Use it for agentic workflows, multimodal input, code, and long-context analysis. Pair it with mimo-v2.5-pro through AI Gateway routing so harder jobs land on the higher tier.
