Create a chat completion

Given a list of messages comprising a conversation, the model returns a response. This endpoint is compatible with the OpenAI Chat Completions API. See https://platform.openai.com/docs/api-reference/chat/create
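A request to this endpoint can be sketched in Python as follows. The base URL and API key below are placeholders, not values taken from this page; only the body fields mirror the parameters documented further down.

```python
import json

# Placeholder endpoint and key -- substitute your deployment's actual values.
BASE_URL = "https://your-endpoint.example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

# Request body mirroring the documented parameters and their defaults.
payload = {
    "model": "mistralai/mixtral-8x22b-instruct-v0.1",
    "messages": [
        {"role": "user", "content": "Write a haiku about GPUs."},
    ],
    "max_tokens": 1024,
    "temperature": 0.5,
    "top_p": 1,
    "stream": False,
}

body = json.dumps(payload)
# POST `body` to BASE_URL with headers:
#   Authorization: Bearer <API_KEY>
#   Content-Type: application/json
```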


Model Overview

Description:

Mixtral 8x22B is Mistral AI's latest open model. It sets a new standard for performance and efficiency within the AI community. It is a sparse Mixture-of-Experts (SMoE) model that uses only 39B active parameters out of 141B, offering unparalleled cost efficiency for its size.

Mixtral 8x22B comes with the following strengths:

  • It is fluent in English, French, Italian, German, and Spanish
  • It has strong mathematics and coding capabilities
  • It is natively capable of function calling; along with the constrained output mode implemented on la Plateforme, this enables application development and tech stack modernisation at scale
  • Its 64K-token context window allows precise information recall from large documents

Third-Party Community Consideration:

This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the Mixtral 8x22B Model Card.

Terms of use

By using this software or model, you are agreeing to the terms and conditions of the license, acceptable use policy, and Mistral's privacy policy. Mixtral 8x22B is released under the Apache 2.0 license.

Reference(s):

Mixtral 8x22B Instruct Model Card on Hugging Face

Cheaper, Better, Faster, Stronger | Mistral AI

Model Architecture:

Architecture Type: Transformer

Network Architecture: Sparse Mixture of GPT-based experts

Model Version: 0.1

Input:

Input Format: Text

Input Parameters: Temperature, Top P, Max Output Tokens

Output:

Output Format: Text

Output Parameters: None

Software Integration:

Supported Hardware Platform(s): Hopper, Ampere, Turing, Ada

Supported Operating System(s): Linux

Inference:

Engine: Triton

Test Hardware: Other

Body Params

model
string
Defaults to mistralai/mixtral-8x22b-instruct-v0.1

The ID of the model to use for this request.
max_tokens
integer
≥ 1
Defaults to 1024

The maximum number of tokens to generate in a single call. The model is not aware of this value; generation simply stops once the specified number of tokens is reached.

stream
boolean
Defaults to false

If set, partial message deltas are sent as data-only server-sent events (SSE) as tokens become available (each JSON payload is prefixed by data: ), with the stream terminated by a data: [DONE] message.
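When streaming is enabled, the client consumes those data: lines incrementally. A minimal parser sketch, assuming the chunks follow the OpenAI-style delta schema (choices[0].delta.content):

```python
import json

def parse_sse_chunks(lines):
    """Collect content deltas from data-only SSE lines until data: [DONE]."""
    pieces = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # server signals end of stream
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content", "")
        pieces.append(delta)
    return "".join(pieces)

# Example stream of data-only SSE lines.
stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(parse_sse_chunks(stream))  # Hello
```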

temperature
number
0 to 1
Defaults to 0.5

The sampling temperature to use for text generation. Higher temperature values make the output less deterministic. Modifying both temperature and top_p in the same call is not recommended.
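Temperature works by rescaling the model's logits before sampling. The sketch below illustrates the general technique with a plain softmax (an illustration only, not this service's internals):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cool = softmax_with_temperature(logits, 0.2)  # near-deterministic: top token dominates
warm = softmax_with_temperature(logits, 1.5)  # closer to uniform: more variety
```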

top_p
number
0 to 1
Defaults to 1

The top-p (nucleus) sampling mass used for text generation. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches top_p; for example, with top_p = 0.2, only the most likely tokens summing to 0.2 cumulative probability are sampled. Modifying both temperature and top_p in the same call is not recommended.
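The nucleus-filtering step can be sketched as follows (an illustration of the general top-p technique, not this service's internals):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p,
    then renormalize over that set; all other tokens get probability 0."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
# With top_p = 0.2, the single most likely token already exceeds the mass,
# so only index 0 survives.
kept_probs = top_p_filter(probs, 0.2)
```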

stop
string or array of strings

A string or a list of strings at which the API will stop generating further tokens. The returned text will not contain the stop sequence.
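The server applies this truncation itself; the semantics can be sketched locally like so (the helper name is illustrative):

```python
def truncate_at_stop(text, stop):
    """Cut generated text at the earliest stop sequence; the stop itself is dropped."""
    if isinstance(stop, str):
        stop = [stop]
    cut = len(text)
    for s in stop:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(truncate_at_stop("Answer: 42\nUser:", ["\nUser:", "###"]))  # Answer: 42
```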

frequency_penalty
number
-2 to 2
Defaults to 0

Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.

presence_penalty
number
-2 to 2
Defaults to 0

Positive values penalize new tokens that have already appeared in the text so far, increasing the model's likelihood of talking about new topics.
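Both penalties adjust token logits before sampling. A sketch of the standard OpenAI-style adjustment, where each token's repeat count scales frequency_penalty and mere presence triggers presence_penalty:

```python
from collections import Counter

def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """logit[t] -= count(t) * frequency_penalty + presence_penalty (if t was seen)."""
    counts = Counter(generated_tokens)
    adjusted = list(logits)
    for tok, n in counts.items():
        adjusted[tok] -= n * frequency_penalty + presence_penalty
    return adjusted

logits = [2.0, 2.0, 2.0]
# Token 0 appeared twice, token 1 once, token 2 never:
# token 0 is penalized most, token 2 is untouched.
out = apply_penalties(logits, [0, 0, 1],
                      frequency_penalty=0.5, presence_penalty=0.2)
```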

seed
integer
Defaults to 0

Generation is stochastic by default: changing the seed alone produces a different response with similar characteristics. Fixing the seed (with all other hyperparameters also fixed) makes results reproducible.

messages
array of objects
required

A list of messages comprising the conversation so far.

Responses

application/json