MiniMax-M2.5 Overview
Description:
MiniMax-M2.5 is a text generation model trained to perform complex agentic tasks, including software engineering, tool use, search, and office-work style workflows. Extensively trained with reinforcement learning in hundreds of thousands of complex real-world environments, M2.5 achieves state-of-the-art results in coding, agentic tool use and search, office work, and a range of other economically valuable tasks, scoring 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp (with context management). Trained to reason efficiently and decompose tasks optimally, M2.5 completes complicated agentic tasks quickly, finishing the SWE-Bench Verified evaluation 37% faster than M2.1 and matching the speed of Claude Opus 4.6.
MiniMax-M2.5 was developed by MiniMaxAI.
This model is ready for commercial/non-commercial use.
Third-Party Community Consideration:
This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the Non-NVIDIA MiniMax-M2.5 Model Card for additional details.
License and Terms of Use:
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service, and use of this model is governed by the NVIDIA Open Model License. ADDITIONAL INFORMATION: Modified MIT License (MiniMax M2.5).
Deployment Geography:
Global
Use Case:
Enterprises and developers building AI agents, chatbots, and tool-using applications across coding, office work, and information-retrieval tasks. The model is suited for NLP workloads that require advanced reasoning, long-context handling, and agentic tool use, including:
- Coding and software engineering assistance (e.g., SWE-bench style tasks)
- Search and tool calling workflows
- Office productivity tasks (e.g., document/spreadsheet oriented workflows)
- General conversational assistant use
Release Date:
HuggingFace 02/12/2026 via MiniMaxAI/MiniMax-M2.5
Build.NVIDIA.com 02/26/2026 via link
NGC 02/26/2026 via MiniMax-M2.5 on NGC
Reference(s):
Model Architecture:
Architecture Type: Transformer
Network Architecture: Mixture of Experts (MoE) with Lightning Attention, 8 experts per token (MiniMaxM2ForCausalLM)
Total Parameters: Undisclosed
Active Parameters: Undisclosed
Vocabulary Size: Undisclosed
Base Model: MiniMax M2-series (e.g., MiniMax-M2.1)
Input:
Input Types: Text
Input Formats: String
Input Parameters: One Dimensional (1D)
Other Input Properties: Context length up to 204,800 tokens.
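Long agentic sessions can exceed the 204,800-token context window, so request builders typically trim older turns before sending. A minimal sketch of such trimming, assuming a rough 4-characters-per-token heuristic (the model's actual tokenizer would be authoritative):

```python
# Sketch: keep only the most recent messages that fit the context window.
# The 4-chars-per-token estimate is a crude heuristic, not the model tokenizer.
CONTEXT_LIMIT = 204_800


def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)


def trim_messages(messages: list[dict], budget: int = CONTEXT_LIMIT) -> list[dict]:
    """Walk the history from newest to oldest, keeping messages until the
    estimated token budget is exhausted, then restore chronological order."""
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

In production, the same loop would use the model's real tokenizer counts, and would usually pin the system prompt rather than letting it age out.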
Output:
Output Types: Text
Output Format: String
Output Parameters: One Dimensional (1D)
Other Output Properties: Autoregressive text generation (may include tool-calling structured outputs depending on serving stack).
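When served behind an OpenAI-compatible endpoint (as vLLM and SGLang expose), tool-calling outputs are typically elicited by attaching JSON-schema tool definitions to the chat request. A hedged sketch of such a request body; the "get_weather" tool and its parameters are illustrative assumptions, not an API shipped with the model:

```python
import json


def build_tool_call_request(model: str, user_prompt: str) -> dict:
    """Build an OpenAI-compatible chat request with one example tool attached.
    The get_weather tool is a made-up illustration."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


payload = build_tool_call_request("MiniMaxAI/MiniMax-M2.5", "Weather in Paris?")
print(json.dumps(payload, indent=2))
```

With "tool_choice": "auto", the serving stack returns either plain text or a structured tool call that the client is expected to execute and feed back.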
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Software Integration:
Runtime Engines:
- SGLang: via NVIDIA NIM
Supported Hardware:
- NVIDIA Blackwell: B200
- NVIDIA Hopper: H100, H200, H20, H20-3e
Operating Systems: Linux
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
Model Version(s)
MiniMaxAI/MiniMax-M2.5
Training, Testing, and Evaluation Datasets:
Training Dataset
Data Modality: Text
Text Training Data Size: Undisclosed
Data Collection Method by dataset: Undisclosed
Labeling Method by dataset: Undisclosed
Properties (Quantity, Dataset Descriptions, Sensor(s)): The model is described as trained across 10+ programming languages and 200,000+ real-world environments, with extensive reinforcement learning over complex environments.
Testing Dataset
Data Collection Method by dataset: Undisclosed
Labeling Method by dataset: Undisclosed
Properties (Quantity, Dataset Descriptions, Sensor(s)): Undisclosed
Evaluation Dataset
Benchmark Score: SWE-Bench Verified (80.2%), Multi-SWE-Bench (51.3%), BrowseComp (76.3%)
Data Collection Method by dataset: Automated
Labeling Method by dataset: Automated
Properties (Quantity, Dataset Descriptions, Sensor(s)): Evaluated on a mix of coding, tool-use, web-browsing, and multi-step reasoning benchmarks such as SWE-Bench, Terminal Bench 2, VIBE-Pro, BrowseComp, Wide Search, RISE, GDPval-MM, MEWC, and Finance Modeling, as well as standard academic benchmarks (AIME25, GPQA-D, HLE w/o tools, SciCode, IFBench, AA-LCR).
| Benchmark | MiniMax-M2.5 |
|---|---|
| AIME25 | 86.3 |
| GPQA-D | 85.2 |
| HLE w/o tools | 19.4 |
| SciCode | 44.4 |
| IFBench | 70.0 |
| AA-LCR | 69.5 |

| Benchmark | Description |
|---|---|
| SWE-bench Verified | Coding agent benchmark |
| SWE-bench Multilingual | Multilingual coding benchmark |
| SWE-bench-pro | Professional coding benchmark |
| Multi-SWE-bench | Combined coding benchmark |
| Terminal Bench 2 | Terminal tool-use benchmark |
| VIBE-Pro | Visual-interactive benchmark |
| BrowseComp | Web-browsing benchmark |
| Wide Search | Search benchmark |
| RISE | Multi-step information-retrieval benchmark |
| GDPval-MM | Multi-modal evaluation benchmark |
| MEWC | Excel-world-championship benchmark |
| Finance Modeling | Financial modeling benchmark |
Inference
Acceleration Engine: SGLang
Test Hardware:
- NVIDIA B200
- NVIDIA H100
- NVIDIA H200
- NVIDIA H20
- NVIDIA H20-3e
Additional Details
The model can be integrated via multiple runtimes: Transformers (loading from Hugging Face with trust_remote_code=True), vLLM, SGLang, KTransformers, and other supported engines.
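As one concrete sketch, a model served with vLLM or SGLang can be queried through the OpenAI-compatible HTTP API those engines expose. The base URL and port below are assumptions about a typical local deployment, not values from this card:

```python
import json
import urllib.request

# Hypothetical local endpoint for a vLLM/SGLang OpenAI-compatible server;
# adjust host and port to match the actual deployment.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def chat_request(prompt: str, model: str = "MiniMaxAI/MiniMax-M2.5") -> urllib.request.Request:
    """Build (but do not send) a chat-completion HTTP request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = chat_request("Write a haiku about GPUs.")
# Sending requires a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works against any of the listed runtimes that expose the OpenAI-compatible chat endpoint; only the base URL changes.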
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
