qwen / qwen3-coder-480b-a35b-instruct

Qwen3-Coder-480B-A35B-Instruct

Model Overview

Description:

Qwen3-Coder-480B-A35B-Instruct is a state-of-the-art large language model designed specifically for code generation and agentic coding tasks. It is a mixture-of-experts (MoE) model with 480B total parameters and 35B activated parameters, with native support for a 262,144-token context length that can be extended to 1M tokens using YaRN.

The model achieves strong results among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks, with performance comparable to Claude Sonnet. It supports function calling and tool choice, making it well suited to complex coding workflows and agentic applications.

This model is ready for commercial use.
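
As a quick orientation, the following is a minimal sketch of calling the model for plain code generation through an OpenAI-compatible endpoint; the base URL and model name follow build.nvidia.com conventions and should be treated as assumptions to verify against your deployment.

```python
# Minimal sketch: plain code generation through an OpenAI-compatible endpoint.
# The base_url and model name are assumptions based on build.nvidia.com conventions;
# adjust them (and supply your own API key) for your deployment.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed NVIDIA API endpoint
    api_key="$NVIDIA_API_KEY",                       # replace with your key
)

response = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",     # model id as listed on this page
    messages=[
        {"role": "user", "content": "Write a Python function that parses an ISO-8601 date string."}
    ],
    temperature=0.2,
    max_tokens=1024,
)

print(response.choices[0].message.content)
```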

License/Terms of Use

GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.

Deployment Geography

Global

Use Cases

  • Code Generation: Generate high-quality code from natural language descriptions
  • Agentic Coding: Execute complex coding workflows with function calling
  • Repository Understanding: Process large codebases with long-context capabilities
  • Tool Integration: Interface with development tools and APIs
  • Code Review and Analysis: Analyze and improve existing code
  • Documentation Generation: Create code documentation and comments
  • Browser Automation: Agentic browser-use scenarios
  • Function Calling: Structured tool execution and API integration (see the request sketch after this list)
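
To illustrate the function-calling and tool-choice items above, here is a hedged sketch of a request that declares a single hypothetical tool and lets the model decide whether to call it; the endpoint, model id, and tool schema are illustrative assumptions, not part of the official card.

```python
# Sketch of a function-calling request with automatic tool choice.
# Endpoint, model id, and the example tool are assumptions for illustration.
from openai import OpenAI

client = OpenAI(base_url="https://integrate.api.nvidia.com/v1", api_key="$NVIDIA_API_KEY")

tools = [
    {
        "type": "function",
        "function": {
            "name": "run_tests",  # hypothetical tool name
            "description": "Run the project's unit tests and return the results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Test file or directory to run."}
                },
                "required": ["path"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",
    messages=[{"role": "user", "content": "The tests in tests/test_parser.py are failing; investigate."}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call run_tests
)

message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        print(call.function.name, call.function.arguments)
else:
    print(message.content)
```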

Release Information

Release Date: 08/22/2025

Build.NVIDIA.com: Available via link

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. It has been developed by Qwen (Alibaba Cloud) and built to a third party's requirements for this application and use case; see the Qwen3-Coder-480B-A35B-Instruct model card for details.

Model Architecture

Architecture Type: Mixture-of-Experts (MoE) with Sparse Activation

Network Architecture: Qwen3MoeForCausalLM (Transformer-based decoder-only)

Parameter Count: 480B total parameters with 35B activated parameters

Expert Configuration: 160 experts with 8 activated per forward pass

Attention Mechanism: Grouped Query Attention (GQA) with 96 query heads and 8 KV heads

Number of Layers: 62

Hidden Size: 6144

Head Dimension: 128

Intermediate Size: 8192

MoE Intermediate Size: 2560

Context Length: 262,144 tokens (native), extendable to 1M with YaRN

Vocabulary Size: 151,936
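
As a back-of-the-envelope illustration of what these figures imply (not an official specification), the sketch below derives the activated-parameter fraction and an approximate per-token KV-cache footprint from the GQA configuration above, assuming an FP8 (1 byte per element) KV cache.

```python
# Back-of-the-envelope arithmetic from the architecture table above.
# All derived figures are illustrative estimates, not official specifications.

TOTAL_PARAMS = 480e9        # 480B total parameters
ACTIVE_PARAMS = 35e9        # 35B activated per token (8 of 160 experts)
NUM_LAYERS = 62
NUM_KV_HEADS = 8            # GQA: 96 query heads share 8 KV heads
HEAD_DIM = 128
BYTES_PER_ELEM = 1          # assumed FP8 KV cache (1 byte per element)
NATIVE_CONTEXT = 262_144

# Fraction of parameters that participate in each forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"activated fraction: {active_fraction:.1%}")                 # ~7.3%

# KV cache per token: K and V, per layer, per KV head, per head dimension.
kv_bytes_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_ELEM
print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")   # ~124 KiB

# KV cache for a single sequence at the full native context length.
kv_bytes_full_context = kv_bytes_per_token * NATIVE_CONTEXT
print(f"KV cache at {NATIVE_CONTEXT:,} tokens: {kv_bytes_full_context / 1e9:.1f} GB")  # ~33 GB
```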

Input

Input Type(s): Text, Code, Function calls

Input Format(s): Natural language prompts, code snippets, structured function calls

Input Parameters:

  • Max input length: 262,144 tokens (native), up to 1M with YaRN
  • Support for function calling format
  • Tool choice enabled
  • Trust remote code execution
  • Custom tool call parser (qwen3_coder); see the serving sketch after this list
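
Tying these input-handling options together, the following is a minimal serving sketch assuming a vLLM deployment; the checkpoint name, YaRN scaling values, and parallelism settings are assumptions to verify against the vLLM and Qwen documentation for your installed versions.

```python
# Sketch: launching an OpenAI-compatible vLLM server with the input-handling
# options listed above. Flag names, the YaRN scaling values, and the checkpoint
# name are assumptions to check against your vLLM version before use.
import json
import subprocess

rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # assumed: 262,144 * 4 ≈ 1M tokens
    "original_max_position_embeddings": 262144,
}

cmd = [
    "vllm", "serve", "Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",  # assumed checkpoint name
    "--trust-remote-code",
    "--enable-auto-tool-choice",
    "--tool-call-parser", "qwen3_coder",
    "--rope-scaling", json.dumps(rope_scaling),  # only needed for >262K contexts
    "--max-model-len", "1000000",
    "--tensor-parallel-size", "8",               # assumed multi-GPU Hopper node
]

subprocess.run(cmd, check=True)
```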

Output

Output Type(s): Text, Code, Function responses

Output Format(s): Natural language responses, code generation, structured function outputs

Output Parameters: One-Dimensional (1D)

  • Max output length: configurable based on remaining context
  • Function call responses in structured format

Other Properties Related to Output:

  • Non-thinking mode (no <think></think> blocks)
  • Auto tool choice responses
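
As a follow-on to the function-calling sketch in the Use Cases section, the fragment below shows one way such structured function outputs might be consumed and fed back to the model; the run_tool helper and the message bookkeeping are illustrative assumptions, not part of this card.

```python
# Sketch: consuming a structured function-call response and returning the tool
# result to the model. `client`, `model`, `messages`, and `tools` follow the
# earlier request sketch; run_tool(name, args) is a hypothetical local helper.
import json

def continue_after_tool_call(client, model, messages, tools, response, run_tool):
    message = response.choices[0].message
    if not message.tool_calls:          # plain text answer; no <think></think> blocks
        return message.content

    # Record the assistant turn that requested the tool call(s).
    messages.append({
        "role": "assistant",
        "content": message.content or "",
        "tool_calls": [call.model_dump() for call in message.tool_calls],
    })

    # Execute each requested tool and append its result as a "tool" message.
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        result = run_tool(call.function.name, args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })

    # Ask the model for the final answer given the tool output.
    final = client.chat.completions.create(model=model, messages=messages, tools=tools)
    return final.choices[0].message.content
```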

Software Integration

Runtime Engine: vLLM, Transformers (4.51.0+)

Supported Hardware Platform(s): NVIDIA Hopper

Supported Operating System(s): Linux

Data Type: FP8

Data Modality: Text

Model Version: v1.0
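
For the Transformers runtime listed above, a minimal loading sketch is shown below; the Hugging Face repository id and the dtype/device settings are assumptions, and a checkpoint of this size requires a multi-GPU node even in FP8.

```python
# Minimal sketch of loading the model with Transformers (4.51.0+), as listed
# under Runtime Engine. The repository id and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-480B-A35B-Instruct"   # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # pick up the checkpoint's native dtype
    device_map="auto",       # shard across available GPUs
)

messages = [{"role": "user", "content": "Write a quicksort implementation in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```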

Training, Testing, and Evaluation Datasets

Training Dataset

  • Data Collection Method by dataset: The model was trained on a diverse dataset including code repositories, documentation, and natural language text related to programming
  • Labeling Method by dataset: Supervised fine-tuning with instruction-following data
  • Properties: Multi-language code support, instruction-following capabilities, function calling training

Testing Dataset

  • Data Collection Method by dataset: Standard benchmarks for code generation and agentic tasks
  • Labeling Method by dataset: Automated evaluation metrics
  • Properties: HumanEval, MBPP, Agentic coding benchmarks

Evaluation Dataset

  • Data Collection Method by dataset: Public benchmarks and custom evaluation sets
  • Labeling Method by dataset: Automated metrics and human evaluation
  • Properties: Code generation quality, function calling accuracy, agentic task performance

Benchmark Results

Among open models, the model reports strong results on:

  • Agentic Coding tasks
  • Agentic Browser-Use scenarios
  • Foundational coding benchmarks

Across these coding tasks, reported results are comparable to Claude Sonnet.

Inference

Acceleration Engine: vLLM

Test Hardware: NVIDIA Hopper

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.
