Model Overview

Description:

Mistral Large 2 is the new generation of Mistral's flagship model, offering significant improvements in code generation, mathematics, and reasoning. It features advanced multilingual support and enhanced function-calling capabilities. Mistral Large 2 is designed for single-node inference with a 128k context window and 123 billion parameters, allowing it to handle long-context applications efficiently.

This model is ready for research and non-commercial use.

Third-Party Community Consideration:

Mistral Large 2 is developed by Mistral AI and is available under the Mistral Research License for research and non-commercial use. For commercial use requiring self-deployment, a Mistral Commercial License must be acquired by contacting Mistral AI.

Terms of Use

By using this software or model, you agree to the terms and conditions of the license, acceptable use policy, and Mistral's privacy policy. Mistral Large 2 is available under the Mistral Research License.

References(s):

Mistral Large 2 blogpost

Model Architecture:

Architecture Type: Transformer

Network Architecture: Mistral Large 2

Model Version: 24.07

Input:

Input Type: Text

Input Format: String

Input Parameters: Max Tokens, Temperature, Top P

Max Input Tokens: 128,000

Output:

Output Type: Text

Output Format: Text

Max Output Tokens: 128,000

Software Integration:

Supported Hardware Platform(s): NVIDIA Ampere, NVIDIA Hopper

Supported Operating System(s): Linux

Inference:

Engine: TRT-LLM

Test Hardware: H100

Benchmarks:

Mistral Large 2 sets a new standard in performance and cost efficiency. It achieves an accuracy of 84.0% on MMLU and demonstrates competitive performance on code generation benchmarks, performing on par with leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B. It excels in mathematical reasoning, achieving high scores on GSM8K and MATH benchmarks.

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.