Qwen3-Coder-480B-A35B-Instruct
Model Overview
Description:
Qwen3-Coder-480B-A35B-Instruct is a state-of-the-art large language model designed for code generation and agentic coding tasks. It is a mixture-of-experts (MoE) model with 480B total parameters and 35B activated parameters, with a native context length of 262,144 tokens, extendable up to 1M tokens using YaRN.
Among open models, it delivers leading results on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks, comparable to Claude Sonnet. It supports function calling and tool choice, making it well suited to complex coding workflows and agentic applications.
This model is ready for commercial use.
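As a quick orientation, here is a minimal sketch of calling the model through an OpenAI-compatible endpoint. The base URL and model identifier shown are assumptions; check build.nvidia.com for the exact values for your deployment.

```python
# Minimal sketch: code generation via an OpenAI-compatible endpoint.
# base_url and model id are assumptions; substitute the values published
# for your deployment (e.g. on build.nvidia.com).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

response = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",  # assumed model id
    messages=[{"role": "user", "content": "Write a Python function that merges two sorted lists."}],
    temperature=0.7,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```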
License/Terms of Use
GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.
Deployment Geography
Deployment Geography: Global
Use Cases
- Code Generation: Generate high-quality code from natural language descriptions
- Agentic Coding: Execute complex coding workflows with function calling (a minimal loop is sketched after this list)
- Repository Understanding: Process large codebases with long-context capabilities
- Tool Integration: Interface with development tools and APIs
- Code Review and Analysis: Analyze and improve existing code
- Documentation Generation: Create code documentation and comments
- Browser Automation: Agentic browser-use scenarios
- Function Calling: Structured tool execution and API integration
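To make the agentic items above concrete, the following sketch shows the basic request/execute/respond loop used for agentic coding against an OpenAI-compatible endpoint. The local endpoint, model id, and the read_file tool are illustrative assumptions, not part of the model's published interface.

```python
# Sketch of a minimal agentic-coding loop: the model requests a tool call,
# the client executes it, and the result is returned as a tool message.
# Endpoint, model id, and the read_file tool are assumptions for illustration.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "Qwen/Qwen3-Coder-480B-A35B-Instruct"

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",  # hypothetical tool
        "description": "Return the contents of a file in the repository.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

messages = [{"role": "user", "content": "Summarize what main.py does."}]
response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
msg = response.choices[0].message

while msg.tool_calls:  # keep executing tools until the model answers in text
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        with open(args["path"]) as f:  # execute the hypothetical tool locally
            result = f.read()
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    msg = response.choices[0].message

print(msg.content)
```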
Release Information
Release Date: 08/22/2025
Build.NVIDIA.com: Available via link
Third-Party Community Consideration
This model is not owned or developed by NVIDIA. This model has been developed by Qwen (Alibaba Cloud). It has been developed and built to a third party's requirements for this application and use case; see link to Qwen3-Coder-480B-A35B-Instruct.
References
- Qwen3-Coder: A Large Language Model for Code Generation
- Qwen3-Coder GitHub Repository
- Qwen Documentation
- Hugging Face Model Page
- Qwen3 Technical Report (arXiv:2505.09388)
Model Architecture
Architecture Type: Mixture-of-Experts (MoE) with sparse activation
Network Architecture: Qwen3MoeForCausalLM (Transformer-based decoder-only)
Parameter Count: 480B total parameters with 35B activated parameters
Expert Configuration: 160 experts with 8 activated per token
Attention Mechanism: Grouped Query Attention (GQA) with 96 query heads and 8 KV heads
Number of Layers: 62
Hidden Size: 6144
Head Dimension: 128
Intermediate Size: 8192
MoE Intermediate Size: 2560
Context Length: 262,144 tokens (native), extendable to 1M with YaRN (see the configuration sketch after this list)
Vocabulary Size: 151,936
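The 1M-token extension noted above uses YaRN rope scaling. Below is a hedged configuration sketch following the rope_scaling pattern documented on the upstream Qwen model card (factor 4.0 × 262,144 native tokens ≈ 1M); confirm the exact values against the upstream documentation before use.

```python
# Sketch: extending the native 262,144-token context toward 1M with YaRN.
# The rope_scaling values follow the pattern on the upstream Qwen model
# card; treat them as assumptions and verify before deployment.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # Hugging Face id

config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,  # 4.0 * 262,144 native tokens ~= 1M-token window
    "original_max_position_embeddings": 262144,
}
model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, torch_dtype="auto", device_map="auto"
)
```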
Input
Input Type(s): Text, Code, Function calls
Input Format(s): Natural language prompts, code snippets, structured function calls
Input Parameters:
- Max input length: 262,144 tokens (native), up to 1M with YaRN
- Support for function calling format
- Tool choice enabled
- trust_remote_code enabled at load time
- Custom tool call parser (qwen3_coder)
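When self-hosting, the tool-related parameters above map to server-side flags. The sketch below assumes a vLLM deployment; the flag names follow vLLM's tool-calling documentation (which includes a qwen3_coder parser), and the local endpoint, model id, and search_repo tool are illustrative assumptions.

```python
# Sketch: wiring the input parameters above to a self-hosted vLLM server.
# Launch command (verify flags against your vLLM version):
#   vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct \
#       --enable-auto-tool-choice --tool-call-parser qwen3_coder \
#       --trust-remote-code
# A single request then returns structured tool calls:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "search_repo",  # hypothetical tool for illustration
        "description": "Search the codebase for a symbol.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct",
    messages=[{"role": "user", "content": "Where is UserSession defined?"}],
    tools=tools,
    tool_choice="auto",  # "tool choice enabled" from the list above
)
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```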
Output
Output Type(s): Text, Code, Function responses
Output Format(s): Natural language responses, code generation, structured function outputs
Output Parameters: One-Dimensional (1D)
- Max output length: Configurable based on remaining context
- Function call responses in structured format
Other Properties Related to Output:
- Non-thinking mode (no <think></think> blocks)
- Auto tool choice responses
Software Integration
Runtime Engine: vLLM, Transformers (4.51.0+)
Supported Hardware Platform(s): NVIDIA Hopper
Supported Operating System(s): Linux
Data Type: FP8
Data Modality: Text
Model Version: v1.0
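For reference, here is a hedged sketch of offline batch inference with the vLLM runtime listed above on Hopper GPUs. The FP8 checkpoint id and tensor_parallel_size are assumptions; size parallelism to your GPU count and use the checkpoint you actually deploy.

```python
# Sketch: offline batch inference with vLLM on NVIDIA Hopper GPUs.
# The FP8 checkpoint id and tensor_parallel_size are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",  # assumed FP8 checkpoint id
    tensor_parallel_size=8,                           # size to your GPU count
)
params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=1024)

outputs = llm.chat(
    [{"role": "user", "content": "Implement a thread-safe LRU cache in Python."}],
    params,
)
print(outputs[0].outputs[0].text)
```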
Training, Testing, and Evaluation Datasets
Training Dataset
- Data Collection Method by dataset: The model was trained on a diverse dataset including code repositories, documentation, and natural language text related to programming
- Labeling Method by dataset: Supervised fine-tuning with instruction-following data
- Properties: Multi-language code support, instruction-following capabilities, function calling training
Testing Dataset
- Data Collection Method by dataset: Standard benchmarks for code generation and agentic tasks
- Labeling Method by dataset: Automated evaluation metrics
- Properties: HumanEval, MBPP, Agentic coding benchmarks
Evaluation Dataset
- Data Collection Method by dataset: Public benchmarks and custom evaluation sets
- Labeling Method by dataset: Automated metrics and human evaluation
- Properties: Code generation quality, function calling accuracy, agentic task performance
Benchmark Results
The model achieves leading performance among open models on:
- Agentic Coding tasks
- Agentic Browser-Use scenarios
- Foundational coding benchmarks
Across these tasks, its results are comparable to Claude Sonnet.
Inference
Acceleration Engine: vLLM
Test Hardware: NVIDIA Hopper
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns here.