Model Overview
Description:
Mistral-NeMo-Minitron-8B-Instruct is a model for generating responses for various text-generation tasks, including roleplaying, retrieval-augmented generation, and function calling. It is a fine-tuned version of nvidia/Mistral-NeMo-Minitron-8B-Base, which was pruned and distilled from Mistral-NeMo 12B using our LLM compression technique. The model was trained using a multi-stage SFT and preference-based alignment technique with NeMo Aligner. For details on the alignment technique, please refer to the Nemotron-4 340B Technical Report. The model supports a context length of 8,192 tokens.
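As a quick orientation, the following is a minimal sketch of loading the model with Hugging Face Transformers; the repository id and bfloat16 dtype are assumptions that should be verified against the model page.

```python
# Minimal sketch: loading the model with Hugging Face Transformers.
# The repository id and bfloat16 dtype are assumptions; verify both
# against the model page before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Mistral-NeMo-Minitron-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```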
License/Terms of Use:
Model Architecture:
Architecture Type: Transformer
Network Architecture: Decoder-only
Input:
Input Type(s): Text (Prompt)
Input Format(s): String
Input Parameters: One Dimensional (1D)
Other Properties Related to Input: The model has a maximum of 8,192 input tokens.
Output:
Output Type(s): Text (Response)
Output Format: String
Output Parameters: 1D
Other Properties Related to Output: Input and output share the model's 8,192-token context window; the maximum output length can be configured independently of the input length (see the sketch below).
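To make the token budget concrete, the sketch below (reusing the `tokenizer` and `model` from the loading snippet above) checks the prompt against the 8,192-token context window and sets the output length independently via `max_new_tokens`.

```python
# Sketch: input and output share the 8,192-token context window,
# but the output length is set independently via max_new_tokens.
# Reuses `tokenizer` and `model` from the loading snippet above.
MAX_CONTEXT = 8192

prompt = "Explain retrieval-augmented generation in two sentences."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
assert input_ids.shape[1] <= MAX_CONTEXT, "prompt exceeds the context window"

# Cap the output so input + output stay within the shared context.
max_new = min(512, MAX_CONTEXT - input_ids.shape[1])
output_ids = model.generate(input_ids, max_new_tokens=max_new)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```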
Prompt Format:
We recommend using the following prompt template, which was used to fine-tune the model. The model may not perform optimally without it.
```
<extra_id_0>System
{system prompt}
<extra_id_1>User
{prompt}
<extra_id_1>Assistant\n
```
- Note that a newline character `\n` should be added at the end of the prompt.
- We recommend using `<extra_id_1>` as a stop token.
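Putting the template and the stop token together, here is a minimal sketch that reuses the `tokenizer` and `model` from the loading snippet above; `build_prompt` is an illustrative helper, not part of the model's API.

```python
# Sketch: applying the recommended prompt template and stopping on
# <extra_id_1>. `build_prompt` is an illustrative helper, not part of
# the model's API; `tokenizer` and `model` come from the snippet above.
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    return (
        f"<extra_id_0>System\n{system_prompt}\n"
        f"<extra_id_1>User\n{user_prompt}\n"
        "<extra_id_1>Assistant\n"  # note the trailing newline
    )

text = build_prompt("You are a helpful assistant.", "Write a haiku about GPUs.")
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
completion = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:])

# Truncate at the recommended stop token if the model emits it.
print(completion.split("<extra_id_1>")[0].strip())
```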
Evaluation Results
| Category | Benchmark | # Shots | Mistral-NeMo-Minitron-8B-Instruct |
|---|---|---|---|
| General | MMLU | 5 | 70.4 |
| General | MT Bench (GPT4-Turbo) | 0 | 7.86 |
| Math | GSM8K | 0 | 87.1 |
| Reasoning | GPQA | 0 | 31.5 |
| Code | HumanEval | 0 | 71.3 |
| Code | MBPP | 0 | 72.5 |
| Instruction Following | IFEval | 0 | 84.4 |
| Tool Use | BFCL v2 Live | 0 | 67.6 |
Software Integration (Cloud):
Runtime Engine: NeMo Framework 24.09
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Blackwell
- NVIDIA Hopper
- NVIDIA Lovelace
Supported Operating System(s):
- Linux
Model Version(s):
Mistral-NeMo-Minitron 8B Instruct
Training & Evaluation:
Training Dataset:
Data Collection Method by dataset:
- Hybrid: Automated, Human
Labeling Method by dataset:
- Hybrid: Automated, Human
Evaluation Dataset:
Data Collection Method by dataset:
- Hybrid: Automated, Human
Labeling Method by dataset:
- Human
Inference:
Engine: TRT-LLM
Test Hardware:
- A100
- A10G
- H100
- L40S
Supported Hardware Platform(s): L40S, A10G, A100, H100
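Since the inference engine is TRT-LLM, the sketch below uses TensorRT-LLM's high-level `LLM` API, available in recent releases; the exact argument names and output fields should be checked against the installed tensorrt_llm version.

```python
# Sketch: running the model with TensorRT-LLM's high-level LLM API.
# Argument names and output fields should be checked against the
# installed tensorrt_llm release; the stop string mirrors the
# recommended <extra_id_1> stop token.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="nvidia/Mistral-NeMo-Minitron-8B-Instruct")
params = SamplingParams(max_tokens=256, stop=["<extra_id_1>"])

prompt = (
    "<extra_id_0>System\nYou are a helpful assistant.\n"
    "<extra_id_1>User\nWhat is model distillation?\n"
    "<extra_id_1>Assistant\n"
)
for out in llm.generate([prompt], params):
    print(out.outputs[0].text)
```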
Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.
Please report security vulnerabilities or NVIDIA AI Concerns here.