Prerequisites
Prior to downloading a NIM for self-hosting on your own infrastructure, ensure the following hardware and software prerequisites are met:
HARDWARE
NVIDIA GPU(s)
NVIDIA NIM runs on any NVIDIA GPU with sufficient memory, but some model/GPU configurations are optimized. Homogeneous multi-GPU systems with tensor parallelism enabled are also supported.
Important - Some models may fail to load if a GPU does not have enough memory or a high enough compute capability rating. See the Support Matrix for more information.
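To quickly see what each installed GPU offers, a query such as the following can help (the compute_cap field only exists in reasonably recent driver releases, so treat its availability as an assumption):
# List each GPU's name, total memory, and compute capability
nvidia-smi --query-gpu=name,memory.total,compute_cap --format=csv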
CPU
x86_64 architecture.
OS
Any Linux distro which:
- Is supported by the NVIDIA Container Toolkit
- Has glibc >= 2.35 (you can check this as shown below)
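You can check the installed glibc version directly; the first line of output should report 2.35 or newer:
ldd --version | head -n 1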
SOFTWARE
STEP 1 Install CUDA Drivers
We recommend the following:
- Use a network repository as part of a package manager installation, skipping the CUDA toolkit installation, since the libraries are already available inside the NIM container.
- Install the CUDA Driver for a specific version.
- Then install the open kernel modules (driver version > 550) for that specific driver version (see the sketch below).
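As a minimal sketch for Ubuntu 22.04 (the package name and version pin below are assumptions; driver metapackage names vary by distribution, driver release, and whether you use the distribution archive or the CUDA network repository):
# Install the driver with the open kernel modules only, without the CUDA toolkit
sudo apt-get update
sudo apt-get install -y nvidia-driver-550-open
# Confirm the driver loads and reports the GPU(s)
nvidia-smi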
STEP 2 Install Docker Engine
You can verify that Docker is installed and can access your GPU(s) by running the command below (this also requires the NVIDIA Container Toolkit from Step 3):
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
NOTE - NIMs are built on container technology and require a container runtime. One of the most popular container runtimes is Docker, which is what we use in this Quick Start Guide.
Important - Follow the post-installation steps for Docker Engine so that you can manage Docker as a non-root user.
STEP 3 Install NVIDIA Container Toolkit
NOTE - NIMs leverage NVIDIA GPUs for accelerated execution, so we also need to ensure that our container runtime allows workloads to access any available GPUs. The NVIDIA Container Toolkit enables this access for Docker containers.
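On Debian- or Ubuntu-based systems with NVIDIA's apt repository already configured (the repository setup itself is not shown here; follow the toolkit installation guide for your distribution), the install is a single package:
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit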
STEP 4 Configure Docker
- After installing the toolkit, follow the instructions in the Configure Docker section of the NVIDIA Container Toolkit documentation (a sketch of this follows below).
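A typical sequence uses the toolkit's nvidia-ctk helper to register the NVIDIA runtime with Docker, followed by a restart of the Docker daemon:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker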
STEP 5 NGC API Key
An NGC API key is required to download NIM container images and models. If you do not already have one, generate an API key from the Setup section of your NGC account.
STEP 6 Export API Key
Your key will need to be passed to docker run as an NGC_API_KEY environment variable so that the NIM can download the appropriate models and resources at startup (a minimal docker run sketch follows the commands below). If you're not familiar with how to do this, the simplest way is to export it in your terminal:
export NGC_API_KEY=<value>
To make the key available in future shell sessions, run one of the following:
If using bash
echo "export NGC_API_KEY=<value>" >> ~/.bashrc
If using zsh
echo "export NGC_API_KEY=<value>" >> ~/.zshrc
STEP 7 Docker login to NGC
To pull the NIM container image from NGC, first authenticate with the NVIDIA Container Registry (nvcr.io) using the following command:
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
Use $oauthtoken as the username and NGC_API_KEY as the password. The $oauthtoken username is a special name that indicates that you will authenticate with an API key and not a username and password.
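Once the login succeeds, you should be able to pull NIM images from nvcr.io. For example (the image path below is illustrative; use the path for the model you intend to deploy):
docker pull nvcr.io/nim/meta/llama3-8b-instruct:latest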