Stockmark-2-100B-Instruct
Description
Stockmark-2-100B-Instruct is a 100-billion-parameter large language model built from scratch, with a particular focus on Japanese. It was pre-trained on approximately 2.0 trillion tokens of data, consisting of 60% English, 30% Japanese, and 10% code. Following pre-training, the model underwent post-training (SFT and DPO) on synthetic Japanese data to enhance its ability to follow instructions. Compared with the previous version, this release improves instruction-following ability and adds long-context support (32k tokens).
This model is ready for commercial and non-commercial use.
Third-Party Community Consideration:
This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the non-NVIDIA Stockmark-2-100B-Instruct Model Card.
License and Terms of Use:
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Open Model License Agreement. Additional Information: MIT License.
Deployment Geography:
Global
Use Case:
Japanese and English language processing, instruction following, long-context understanding, and research and commercial applications.
Release Date:
build.nvidia.com: 09/24/2025 via link
Hugging Face: 09/24/2025 via link
Reference(s):
Model Architecture:
Architecture Type: Causal Language Model
Network Architecture: Transformer-based with Grouped Query Attention (GQA)
Total Parameters: 96B
Active Parameters: 96B
Vocabulary Size: 100,352
Input:
Input Types: Text
Input Parameters: [One-Dimensional (1D)]
Other Input Properties: Supports Japanese and English languages
Input Context Length (ISL): 32,768 tokens
Output:
Output Type: Text
Output Parameters: [One-Dimensional (1D)]
Other Output Properties: Instruction-following responses in Japanese and English
Output Context Length (OSL): Up to 32,768 tokens (shared with input)
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Software Integration:
Runtime Engines: PyTorch, transformers, vLLM
Supported Hardware:
- NVIDIA Ada Lovelace
- NVIDIA Ampere
- NVIDIA Blackwell
- NVIDIA Hopper
Operating Systems: Linux, Windows, macOS
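The runtime engines listed above include Hugging Face transformers; a minimal single-turn inference sketch follows. The repository id and the presence of a chat template in the tokenizer are assumptions — verify both against the Hugging Face model card before use.

```python
# Minimal sketch of single-turn inference with Hugging Face transformers.
# MODEL_ID is an assumed repository id, not confirmed by this model card.
MODEL_ID = "stockmark/Stockmark-2-100B-Instruct"


def build_messages(user_prompt: str) -> list:
    """Single-turn chat in the format expected by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports are deferred so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Running a 100B-parameter model this way requires multiple GPUs; `device_map="auto"` shards the weights across available devices.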
Model Version(s)
Stockmark-2-100B-Instruct
Training, Testing, and Evaluation Datasets:
Training Dataset
Training Data Collection: Synthetic
Training Labeling: Synthetic
Data Modality: Text
Text Training Data Size: 1 Billion to 10 Trillion Tokens
Training Properties: Post-training with SFT and DPO methods
Testing Dataset
Testing Data Collection: Undisclosed
Testing Labeling: Undisclosed
Testing Properties: Undisclosed
Evaluation Dataset
Evaluation Benchmark Score: Japanese MT-bench Average: 7.87
Evaluation Data Collection: Japanese MT-bench evaluation
Evaluation Labeling: Automated scoring system
Evaluation Properties: Multi-domain evaluation including coding, extraction, humanities, math, reasoning, roleplay, and STEM
Inference
Acceleration Engine: vLLM, TensorRT-LLM
Test Hardware: 4x NVIDIA H100
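A setup matching the test hardware above (vLLM with 4-way tensor parallelism and the 32k context) could be launched with something like the following; the repository id and flag values are assumptions, not tested settings.

```shell
# Hedged sketch: serving with vLLM on a 4-GPU node (repo id assumed).
vllm serve stockmark/Stockmark-2-100B-Instruct \
    --tensor-parallel-size 4 \
    --max-model-len 32768
```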
Additional Information
This project was supported by GENIAC. The model uses Grouped Query Attention (GQA) with 72 query heads and 8 key-value heads. Training libraries include NVIDIA/Megatron-LM for pre-training and huggingface/trl for post-training.
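The GQA figures above imply a fixed query-to-KV-head grouping; a quick arithmetic check, using only the head counts stated in the text:

```python
# Back-of-envelope check of the stated GQA configuration: 72 query heads
# share 8 key-value heads, so each KV head serves a group of query heads,
# and the KV cache shrinks proportionally versus full multi-head attention
# (which would keep one KV head per query head).
N_QUERY_HEADS = 72
N_KV_HEADS = 8

group_size = N_QUERY_HEADS // N_KV_HEADS          # query heads per KV head
kv_cache_reduction = N_QUERY_HEADS / N_KV_HEADS   # vs. MHA with 72 KV heads

print(group_size)          # 9
print(kv_cache_reduction)  # 9.0
```

In other words, the KV cache at inference time is roughly 9x smaller than an equivalent multi-head-attention configuration, which is what makes the 32k context practical on the listed hardware.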
Limitations
This model should be used responsibly and in accordance with applicable laws and regulations. Users should be aware of potential biases in the training data and outputs, particularly when processing Japanese and English content.
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.