nvidia / corrdiff

Model Overview

Description

Corrector Diffusion (CorrDiff) US GEFS-HRRR model down-scales several surface and atmospheric variables from 25-km resolution
forecast data from the Global Ensemble Forecast System (GEFS) and predicts 3-km
resolution High-Resolution Rapid Refresh (HRRR) data. CorrDiff US allows the prediction of high-fidelity stochastic weather phenomena over
the CONUS from low-fidelity input data that would otherwise require expensive regional
numerical simulations.

CorrDiff is a generative downscaling model trained over the contiguous United States (CONUS).

This model is ready for commercial use.

Reference(s)

Model Architecture

Architecture Type: Diffusion

Network Architecture: Patch-Based Corrector Diffusion

Input

Input Type(s):

  • Tensor (38 Surface & Atmospheric Variables + Forecast Lead Time)
  • Input data forecast lead time in hours

Input Format(s): NumPy

Input Parameters:

  • Four Dimensional (4D) (batch, variable, latitude, longitude)
  • Integer (Lead Time in Hours)

Other Properties Related to Input:

  • 0.25 degree latitude-longitude grid bounded over CONUS
  • Input resolution: [129, 301]
  • Lattitude Coordinates: [53, 52.75, 52.5, ..., 21.5, 21.25, 21]
  • Longitude Coordinates: [225, 225.25, 225.5, ..., 299.5, 299.75, 300]
  • Input weather variables: "u10m", "v10m", "t2m", "r2m", "sp", "msl","tcwv", "u1000", "u925", "u850", "u700", "u500", "u250", "v1000", "v925", "v850", "v700", "v500", "v250", "z1000", "z925", "z850", "z700", "z500", "z200", "t1000", "t925", "t850", "t700", "t500", "t100", "r1000", "r925", "r850", "r700", "r500", "r100"

Output

Output Type(s): Tensor (8 Surface & Atmospheric Variables)

Output Format: NumPy

Output Parameters: 5D (batch, samples, variable, latitude, longitude)

Other Properties Related to Output:

  • 3-km lambert-conformal projection over CONUS of resolution
  • Output resolution: [1056, 1792]
  • Output weather variables: "u10m", "v10m", "t2m", "tp", "csnow", "cicep", "cfrzr", "crain"

The output is on a cropped window of the grid used by HRRR.
Refer to the HRRR documentation for additional information on this grid.
The output coordinates can be obtained from the corrdiff_output_lat.npy and corrdiff_output_lon.npy
files in the model package.

Software Integration

Runtime Engine(s): Not Applicable

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Hopper
  • NVIDIA Turing

Supported Operating System(s):

  • Linux

Model Version(s)

Model version: v1

Training, Testing, and Evaluation Datasets:

Training Dataset

Link: GEFS

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):
GEFS data for the date range of 2020/12/02 to 2023/12/31. The Global Ensemble Forecast System (GEFS) is a weather model created by the National Centers for Environmental Prediction (NCEP) that generates 21 separate forecasts (ensemble members) to address underlying uncertainties in the input data such limited coverage, instruments or observing systems biases, and the limitations of the model itself.

Link: HRRR

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):
HRRR data for the date range of 2020/12/02 to 2023/12/31. The HRRR is a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, initialized by 3km grids with 3km radar assimilation.

Evaluation Dataset

Link: GEFS

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):
GEFS data for the date range of 2024/01/01 to 2024/07/31. The Global Ensemble Forecast System (GEFS) is a weather model created by the National Centers for Environmental Prediction (NCEP) that generates 21 separate forecasts (ensemble members) to address underlying uncertainties in the input data such limited coverage, instruments or observing systems biases, and the limitations of the model itself.

Link: HRRR

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):
HRRR data for the date range of 2024/01/01 to 2024/07/31. The HRRR is a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, initialized by 3km grids with 3km radar assimilation.

Inference:

Engine: Triton

Test Hardware:

  • A100
  • H100
  • L40S
  • RTX6000

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established
policies and practices to enable development for a wide array of AI applications.
When downloaded or used in accordance with our terms of service, developers should work
with their supporting model team to ensure this model meets requirements for the
relevant industry and use case and addresses unforeseen product misuse.
For more detailed information on ethical considerations for this model, please see the
Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards here.
Please report security vulnerabilities or NVIDIA AI Concerns here.