nvidia / nemotron-4-mini-hindi-4b-instruct

Model Overview

Description:

Nemotron-4-Mini-Hindi-4B-Instruct is a chat model that generates responses for chat applications and retrieval-augmented generation in Hindi. It is an aligned version of Nemotron-4-Mini-Hindi-4B-Base. It is a small language model (SLM) optimized through distillation, pruning, and quantization for speed and on-device deployment. VRAM usage has been minimized to approximately 2 GB, providing significantly faster time to first token compared to LLMs.

This model is ready for commercial use.

License/Terms of Use:

NVIDIA Open Model License

References

Please refer to the User Guide for instructions on using the model and for suggested prompt guidelines.

Model Architecture:

Architecture Type: Transformer

Network Architecture: Decoder-only

Limitations

The model was trained on data that contains toxic language and societal biases originally crawled from the internet. Therefore, the model may amplify those biases and return toxic responses, especially when prompted with toxic prompts. The model may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, producing socially unacceptable or undesirable text even if the prompt itself does not include anything explicitly offensive. The model may answer with "I" statements, exhibiting some anthropomorphizing. This issue could be exacerbated without the use of the recommended prompt template.

Input:

Input Type(s): Text (Prompt)

Input Format(s): String

Input Parameters: One Dimensional (1D)

Other Properties Related to Input: The model has a maximum of 4096 input tokens.
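The 4096-token input limit can be checked before a request is sent. The following is a minimal sketch, assuming the prompt has already been converted to a list of token IDs by the model's tokenizer (not shown here); the function name `tokens_over_limit` is hypothetical and not part of any NVIDIA API.

```python
from typing import Sequence

# Maximum number of input tokens stated in this model card.
MAX_INPUT_TOKENS = 4096


def tokens_over_limit(token_ids: Sequence[int],
                      max_input_tokens: int = MAX_INPUT_TOKENS) -> int:
    """Return how many tokens exceed the input limit (0 if the prompt fits).

    `token_ids` is assumed to come from the model's own tokenizer.
    """
    return max(0, len(token_ids) - max_input_tokens)
```

A caller could shorten the user prompt or the retrieved context whenever the returned value is greater than zero.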

Output:

Output Type(s): Text (Response)

Output Format: String

Output Parameters: 1D

Other Properties Related to Output: The model has a maximum of 4096 input tokens. The maximum output length for both versions can be set separately from the input.

Prompt Format:

We recommend using the following prompt template, which was used to fine-tune the model. The model may not perform optimally without it.

Single Turn

<extra_id_0>System
{system prompt}

<extra_id_1>User
{prompt}
<extra_id_1>Assistant\n
  • Note that a newline character \n should be added at the end of the prompt.
  • We recommend using <extra_id_1> as a stop token.
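The single-turn template and the stop-token recommendation above can be sketched in Python as follows. This is an illustrative helper, not an official client: the function names `build_prompt` and `trim_at_stop_token` are hypothetical, and the generation call itself is left out because it depends on the runtime in use.

```python
STOP_TOKEN = "<extra_id_1>"  # recommended stop token from this model card


def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Fill in the single-turn template, ending with the required newline."""
    return (
        f"<extra_id_0>System\n{system_prompt}\n\n"
        f"<extra_id_1>User\n{user_prompt}\n"
        f"<extra_id_1>Assistant\n"
    )


def trim_at_stop_token(generated: str, stop: str = STOP_TOKEN) -> str:
    """Cut generated text at the first stop token, if the runtime did not."""
    return generated.split(stop, 1)[0]


if __name__ == "__main__":
    # Example values are placeholders chosen for illustration.
    print(build_prompt("You are a helpful assistant.", "भारत की राजधानी क्या है?"))
```

Most inference runtimes accept a list of stop strings, in which case `trim_at_stop_token` is only a fallback for raw output that still contains the marker.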

Evaluation Results

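| Category | Hindi Benchmark | # Shots | Nemotron-4-Mini-Hindi-4B-Instruct |
|----------|-----------------|---------|-----------------------------------|
| General | MMLU | 0 | 50.5 |
| General | ARC-C | 0 | 65.53 |
| General | ARC-E | 0 | 79.97 |
| General | Hella Swag | 0 | 39.9 |
| General | BoolQ | 0 | 67.86 |
| Chat | IndicQuest (GPT4-Turbo) | 0 | 4.15 |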

Software Integration: (On-Device)

Runtime(s): AI Inference Manager (NVAIM) Version 1.0.0

Toolkit: NVAIM

See this document for details on how to integrate the model into NVAIM.

Supported Hardware Platform(s): GPU supporting DirectX 11/12 and Vulkan 1.2 or higher

Supported Operating System(s):

  • Windows

Software Integration: (Cloud)

Toolkit: NVIDIA NIM

See this document for details on how to integrate the model into NVIDIA NIM.

Supported Operating System(s):

  • Linux

Model Version(s)

Nemotron-4-Mini-Hindi-4B-Instruct

Training & Evaluation Datasets:

Training Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Hybrid: Automated, Human

Properties (Quantity, Dataset Descriptions, Sensor(s)):

Trained on general Supervised Fine-Tuning (SFT) data, followed by DPO on general and translated corpora.

Evaluation Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Human

Properties (Quantity, Dataset Descriptions, Sensor(s)):

12 benchmark datasets, including IndicXtreme and translated English benchmarks such as MMLU and HellaSwag.

Inference:

Engine: TRT-LLM

Test Hardware:

  • A100
  • A10G
  • H100
  • L40S

Supported Hardware Platform(s): A10G, A100, L40S, H100

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.