deepseek-ai / deepseek-r1-distill-qwen-14b

Model Overview

Background

DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning tasks. Through RL alone, DeepSeek-R1-Zero naturally developed numerous powerful and interesting reasoning behaviors. However, it encountered challenges such as endless repetition, poor readability, and language mixing. DeepSeek-R1 addresses these issues and further enhances reasoning performance by incorporating cold-start data before RL, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Description

DeepSeek-R1-Distill-Qwen-14B is a distilled version of the DeepSeek-R1 series, built upon the Qwen2.5-14B architecture. This model is designed to deliver efficient performance for reasoning, math, and code tasks while maintaining high accuracy. By distilling knowledge from the larger DeepSeek-R1 model, it provides state-of-the-art performance with reduced computational requirements.

This model is ready for commercial use. For more details, visit the DeepSeek website.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to the DeepSeek-R1-Distill-Qwen-14B Model Card.

License/Terms of Use

Governing NVIDIA Download Terms: The NIM container is governed by the NVIDIA Software License Agreement and Product-Specific Terms for AI Products, and the use of this model is governed by the NVIDIA Community Model License. ADDITIONAL INFORMATION: MIT License and Apache 2.0 License.

Model Developer
DeepSeek AI

Model Architecture

Architecture Type: Dense decoder-only Transformer, distilled from the DeepSeek-R1 Mixture of Experts (MoE) model
Network Architecture: Qwen
Version: 2.5

Input

Input Type: Text
Input Format: String
Input Parameters: 1D
Other Properties Related to Input:
DeepSeek recommends adhering to the following configurations when using the DeepSeek-R1 series models, including for benchmarking, to achieve the expected performance:

  1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetitions or incoherent outputs.
  2. Avoid adding a system prompt; all instructions should be contained within the user prompt.
  3. For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
  4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.
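Recommendations 1–3 above can be sketched as a request payload for an OpenAI-compatible chat endpoint, such as the one a NIM container exposes. This is a minimal sketch: the model name shown is an illustrative assumption, and the payload is constructed but not sent.

```python
# Sketch: build a chat-completions payload that follows DeepSeek's
# recommended settings. The model name is an illustrative assumption.

MATH_DIRECTIVE = (
    "Please reason step by step, and put your final answer within \\boxed{}."
)

def build_request(question: str, is_math: bool = False) -> dict:
    """Build a request body following the recommendations above."""
    if is_math:
        # Recommendation 3: append the step-by-step / \boxed{} directive.
        question = f"{question}\n{MATH_DIRECTIVE}"
    return {
        "model": "deepseek-ai/deepseek-r1-distill-qwen-14b",  # assumed name
        # Recommendation 2: no system message; all instructions go in
        # the user prompt.
        "messages": [{"role": "user", "content": question}],
        # Recommendation 1: temperature in 0.5-0.7; 0.6 is recommended.
        "temperature": 0.6,
    }

payload = build_request("What is 7 * 8?", is_math=True)
```

Per recommendation 4, a benchmark run would typically send several such requests per question and average the results rather than rely on a single sample.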

Additionally, the DeepSeek-R1 series models tend to bypass their thinking pattern (i.e., output "<think>\n\n</think>") when responding to certain queries, which can adversely affect performance. To ensure the model engages in thorough reasoning, DeepSeek recommends forcing the model to begin its response with "<think>\n" in every output.
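One way to apply this recommendation, assuming a raw text-completions interface where the prompt template is built client-side, is to prefill "<think>\n" at the end of the prompt so the model continues from inside the thinking block. The role markers below are illustrative assumptions, not the model's exact chat template.

```python
# Sketch: force the response to begin with "<think>\n" by prefilling it
# at the end of the prompt (text-completions style). The role markers
# are illustrative assumptions, not the exact chat template.

THINK_PREFIX = "<think>\n"

def build_prompt(user_message: str) -> str:
    """Return a prompt whose completion starts inside a <think> block."""
    return f"<|User|>{user_message}<|Assistant|>{THINK_PREFIX}"

prompt = build_prompt("How many primes are there below 20?")
```

With the chat-completions interface, the same effect is usually achieved server-side by the chat template, so this manual prefill is only needed when constructing prompts directly.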

Output

Output Type: Text
Output Format: String
Output Parameters: 1D

Software Integration

Runtime Engine: TensorRT-LLM
Supported Hardware Microarchitecture Compatibility: NVIDIA Hopper, NVIDIA Ada Lovelace
Preferred/Supported Operating System(s): Linux

Training Dataset

Data Collection Method by dataset: Automated
Labelling Method by dataset: Automated
Properties: 800k samples curated with DeepSeek-R1

Testing Dataset

Data Collection Method by dataset: Automated. Reasoning data generated by DeepSeek-R1.
Labelling Method by dataset: Automated

Evaluation Dataset

Please see the Evaluation section of the DeepSeek-R1-Distill-Qwen-14B Model Card for more information.
Data Collection Method by dataset: Hybrid: Human, Automated

Labeling Method by dataset: Hybrid: Human, Automated

Inference

Engine: TensorRT-LLM
Test Hardware: H20, L20

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.

Model Limitations

The base model was trained on data that contains toxic language and societal biases originally crawled from the internet. Therefore, the model may amplify those biases and return toxic responses, especially when prompted with toxic prompts. The model may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, and may produce socially unacceptable or undesirable outputs, even if the prompt itself does not contain anything explicitly offensive.

You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.