Downloadable NIMs

Models from NVIDIA’s API catalog can be downloaded for self-hosting with NVIDIA NIM, giving enterprise developers ownership of their customizations, their choice of infrastructure, and full control of their IP and AI applications. NIMs are distributed as NGC container images through the NVIDIA AI Enterprise NGC Catalog and are available with a trial license of NVIDIA AI Enterprise or an NVIDIA Developer Account.

📘

Need help?

Having trouble getting started with downloadable NIMs? Refer to our quick start guide.

The following endpoints, detailed in the additional OpenAPI spec, are available on self-hosted LLM NIMs (a request sketch follows the note below):

  • /v1/health/ready - Returns a 200 status when the service is ready to receive inference requests.
  • /v1/version - The release attribute corresponds to the product release version of the NIM; the api attribute is the API version running inside the NIM.
  • /v1/models - Lists the models available for inference. When the NIM is set up to serve customizations (e.g. LoRAs), these customizations are also returned as models.
  • /v1/chat/completions - Chat completions endpoint.
  • /v1/completions - Completions endpoint.

NOTE: The /v1/completions and /v1/chat/completions endpoints are documented in the LLM Model OpenAPI Schema.
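
The sketch below is a minimal illustration of calling these endpoints from Python. It assumes a NIM container is already running and serving on http://localhost:8000 (the port and the absence of an auth token are assumptions, not taken from this page), and it uses whatever model IDs /v1/models reports rather than a hard-coded name.

```python
# Minimal sketch of probing and querying a self-hosted LLM NIM.
# Assumptions: the NIM listens on http://localhost:8000 and no auth
# token is required for local requests.
import requests

BASE_URL = "http://localhost:8000"

# 1. Readiness check: /v1/health/ready returns 200 when the service
#    is ready to receive inference requests.
ready = requests.get(f"{BASE_URL}/v1/health/ready")
print("ready:", ready.status_code == 200)

# 2. Version info: 'release' is the NIM product release version,
#    'api' is the API version running inside the NIM.
version = requests.get(f"{BASE_URL}/v1/version").json()
print("release:", version.get("release"), "api:", version.get("api"))

# 3. List models available for inference; LoRA customizations served
#    by the NIM also appear here as models.
models = requests.get(f"{BASE_URL}/v1/models").json()
model_ids = [m["id"] for m in models.get("data", [])]
print("models:", model_ids)

# 4. Chat completion request against the first reported model.
payload = {
    "model": model_ids[0],
    "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
    "max_tokens": 64,
}
chat = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload)
print(chat.json()["choices"][0]["message"]["content"])
```

The chat payload follows the OpenAI-compatible schema that LLM NIMs expose; the /v1/completions endpoint accepts a similar payload with a prompt field instead of messages.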


What’s Next