StarCoder2 Model Overview
Description:
StarCoder2-15B is a state-of-the-art, 15-billion-parameter code language model trained on 600+ programming languages from The Stack v2 dataset. It uses Grouped Query Attention and sliding window attention to improve efficiency on long coding contexts, supports a context window of 16,384 tokens, and was trained with the Fill-in-the-Middle objective on 4+ trillion tokens. Trained with NVIDIA's NeMo™ Framework on the NVIDIA Eos Supercomputer, it represents a significant advance in code generation and understanding.
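Because the model was trained with a Fill-in-the-Middle objective, it can complete code between a given prefix and suffix. Below is a minimal sketch using the Hugging Face transformers library; the `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel tokens follow the StarCoder family convention and should be verified against the released tokenizer.

```python
# Minimal Fill-in-the-Middle sketch with Hugging Face transformers.
# Assumes the StarCoder-family FIM sentinel tokens (<fim_prefix>,
# <fim_suffix>, <fim_middle>); verify against the model's tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

# The model fills in the span between prefix and suffix.
prefix = "def fibonacci(n):\n    "
suffix = "\n    return a"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```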
Terms of Use
GOVERNING TERMS: Your use of this model is governed by the BigCode OpenRAIL-M v1 License Agreement.
Reference(s):
StarCoder2-15B on Hugging Face: https://huggingface.co/bigcode/starcoder2-15b
Model Architecture:
Architecture Type: Transformer decoder
Network Architecture: Grouped Query Attention (GQA) with sliding window attention
Model Version: 2.0
Input:
Input Format: Text
Input Parameters: Temperature, Top P, Max Output Tokens (see the sampling sketch after the Output section)
Output:
Output Format: Text
Output Parameters: None
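The listed input parameters map onto standard sampling arguments. A short illustration, continuing the example above; the values shown here are examples only, not recommended defaults:

```python
# Illustrative mapping of the card's input parameters to standard
# Hugging Face sampling arguments; values are placeholders.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,     # Temperature
    top_p=0.95,          # Top P
    max_new_tokens=256,  # Max Output Tokens
)
```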
Software Integration:
Supported Hardware Platform(s): NVIDIA H100 GPUs
Supported Operating System(s): Linux
Inference:
Engine: Triton Inference Server
Test Hardware: NVIDIA DGX H100 systems
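When the model is served through Triton Inference Server, requests can be issued with the tritonclient package. The sketch below is a generic HTTP client call; the model name ("starcoder2") and tensor names ("text_input", "text_output") are hypothetical and depend on the deployment's model configuration.

```python
# Hypothetical Triton HTTP client call. The model name and tensor
# names are placeholders determined by the deployment's config.pbtxt,
# not values confirmed by this card.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Triton represents strings as BYTES tensors via numpy object arrays.
prompt = np.array([["def quicksort(arr):"]], dtype=object)
inp = httpclient.InferInput("text_input", prompt.shape, "BYTES")
inp.set_data_from_numpy(prompt)

result = client.infer(model_name="starcoder2", inputs=[inp])
print(result.as_numpy("text_output"))
```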