Stockmark-2-100B-Instruct
Description
Stockmark-2-100B-Instruct is a 100-billion-parameter large language model built from scratch, with a particular focus on Japanese. It was pre-trained on approximately 2.0 trillion tokens of data, consisting of 60% English, 30% Japanese, and 10% code. Following pre-training, the model underwent post-training (SFT and DPO) on synthetic Japanese data to enhance its ability to follow instructions. Compared with the previous version, this release improves instruction-following ability and adds long-context support (32k tokens).
This model is ready for commercial and non-commercial use.
Third-Party Community Consideration:
This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the non-NVIDIA Stockmark-2-100B-Instruct Model Card.
License and Terms of Use:
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Open Model License Agreement. Additional Information: MIT License.
Deployment Geography:
Global
Use Case:
Japanese and English language processing, instruction following, long-context understanding, and research and commercial applications.
Release Date:
build.nvidia.com: 09/24/2025 via link
Hugging Face: 09/24/2025 via link
Reference(s):
Model Architecture:
Architecture Type: Causal Language Model
Network Architecture: Transformer-based with Grouped Query Attention (GQA)
Total Parameters: 96B
Active Parameters: 96B
Vocabulary Size: 100,352
Input:
Input Types: Text
Input Parameters: [One-Dimensional (1D)]
Other Input Properties: Supports Japanese and English languages
Input Context Length (ISL): 32,768 tokens
Output:
Output Type: Text
Output Parameters: [One-Dimensional (1D)]
Other Output Properties: Instruction-following responses in Japanese and English
Output Context Length (OSL): Up to 32,768 tokens (shared with input)
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Software Integration:
Runtime Engines: PyTorch, transformers, vLLM
Supported Hardware:
- NVIDIA Ada Lovelace
- NVIDIA Ampere
- NVIDIA Blackwell
- NVIDIA Hopper
Operating Systems: Linux, Windows, macOS
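The runtime engines listed above include Hugging Face transformers; a minimal single-turn inference sketch follows. The repository id and the presence of a chat template in the tokenizer are assumptions — verify both against the Hugging Face model card before use.

```python
# Minimal sketch of single-turn inference with Hugging Face transformers.
# MODEL_ID is an assumed repository id, not confirmed by this model card.
MODEL_ID = "stockmark/Stockmark-2-100B-Instruct"


def build_messages(user_prompt: str) -> list:
    """Single-turn chat in the format expected by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports are deferred so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Running a 100B-parameter model this way requires multiple GPUs; `device_map="auto"` shards the weights across available devices.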
Model Version(s)
Stockmark-2-100B-Instruct
Training, Testing, and Evaluation Datasets:
Training Dataset
Training Data Collection: Synthetic
Training Labeling: Synthetic
Data Modality: Text
Text Training Data Size: 1 Billion to 10 Trillion Tokens
Training Properties: Post-training with SFT and DPO methods
Testing Dataset
Testing Data Collection: Undisclosed
Testing Labeling: Undisclosed
Testing Properties: Undisclosed
Evaluation Dataset
Evaluation Benchmark Score: Japanese MT-bench Average: 7.87
Evaluation Data Collection: Japanese MT-bench evaluation
Evaluation Labeling: Automated scoring system
Evaluation Properties: Multi-domain evaluation including coding, extraction, humanities, math, reasoning, roleplay, and STEM
Inference
Acceleration Engine: vLLM, TensorRT-LLM
Test Hardware: 4x NVIDIA H100
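A setup matching the test hardware above (vLLM with 4-way tensor parallelism and the 32k context) could be launched with something like the following; the repository id and flag values are assumptions, not tested settings.

```shell
# Hedged sketch: serving with vLLM on a 4-GPU node (repo id assumed).
vllm serve stockmark/Stockmark-2-100B-Instruct \
    --tensor-parallel-size 4 \
    --max-model-len 32768
```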
Additional Information
This project was supported by GENIAC. The model uses Grouped Query Attention (GQA) with 72 query heads and 8 key-value heads. Training libraries include NVIDIA/Megatron-LM for pre-training and huggingface/trl for post-training.
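The GQA figures above imply a fixed query-to-KV-head grouping; a quick arithmetic check, using only the head counts stated in the text:

```python
# Back-of-envelope check of the stated GQA configuration: 72 query heads
# share 8 key-value heads, so each KV head serves a group of query heads,
# and the KV cache shrinks proportionally versus full multi-head attention
# (which would keep one KV head per query head).
N_QUERY_HEADS = 72
N_KV_HEADS = 8

group_size = N_QUERY_HEADS // N_KV_HEADS          # query heads per KV head
kv_cache_reduction = N_QUERY_HEADS / N_KV_HEADS   # vs. MHA with 72 KV heads

print(group_size)          # 9
print(kv_cache_reduction)  # 9.0
```

In other words, the KV cache at inference time is roughly 9x smaller than an equivalent multi-head-attention configuration, which is what makes the 32k context practical on the listed hardware.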
Limitations
This model should be used responsibly and in accordance with applicable laws and regulations. Users should be aware of potential biases in the training data and outputs, particularly when processing Japanese and English content.
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.