Retrieval APIs

Overview

NeMo Retriever NIM API endpoints give access to models that perform semantic search over enterprise data and return precise answers. The APIs are organized as a collection of NIMs, which developers can use to build copilots, chatbots, and AI assistants end to end. NeMo Retriever NIMs improve retrieval for text question answering and raise accuracy by reranking candidate passages.
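The retrieve-then-rerank flow described above can be sketched locally as follows. This is a minimal illustration, not the NIM implementation: `toy_embed` and `toy_rerank` are hypothetical stand-ins for an embedding endpoint and a reranking endpoint, and `VOCAB`, `cosine`, and `retrieve_then_rerank` are names introduced only for this example.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical fixed vocabulary used by the toy embedding below.
VOCAB = ["gpu", "memory", "driver", "install", "cake"]


def toy_embed(text: str) -> List[float]:
    """Stand-in for an embedding model: counts of each VOCAB word in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def toy_rerank(query: str, passage: str) -> float:
    """Stand-in for a reranking model: number of distinct query tokens in the passage."""
    passage_tokens = set(passage.lower().split())
    return float(len(set(query.lower().split()) & passage_tokens))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query: str,
    passages: List[str],
    embed: Callable[[str], Sequence[float]],
    rerank_score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[str]:
    """Stage 1: keep the top_k passages by embedding similarity.
    Stage 2: reorder those candidates by the reranker score."""
    query_vec = embed(query)
    by_similarity = sorted(
        passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True
    )
    candidates = by_similarity[:top_k]
    return sorted(candidates, key=lambda p: rerank_score(query, p), reverse=True)
```

In a real deployment the two callables would wrap calls to an embedding model and a reranking model; the two-stage structure stays the same, with the cheaper embedding search narrowing the candidate set before the reranker scores each query-passage pair.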

Models

baai

nvidia

Model | Endpoint
nvidia / embed-qa-4 | Create embedding vector (embed-qa-4)
nvidia / llama-3.2-nemoretriever-300m-embed-v1 | Creates an embedding vector from the input text (llama-3.2-nemoretriever-300m-embed-v1)
nvidia / llama-3.2-nemoretriever-300m-embed-v2 | Creates an embedding vector from the input text (llama-3.2-nemoretriever-300m-embed-v2)
nvidia / llama-3.2-nemoretriever-500m-rerank-v2 | Rank passages by their relation to a query (llama-3.2-nemoretriever-500m-rerank-v2)
nvidia / llama-3.2-nv-embedqa-1b-v1 | Creates an embedding vector from the input text (llama-3.2-nv-embedqa-1b-v1)
nvidia / llama-3.2-nv-embedqa-1b-v2 | Creates an embedding vector from the input text (llama-3.2-nv-embedqa-1b-v2)
nvidia / llama-3.2-nv-rerankqa-1b-v1 | Rank passages by their relation to a query (llama-3.2-nv-rerankqa-1b-v1)
nvidia / llama-3.2-nv-rerankqa-1b-v2 | Rank passages by their relation to a query (llama-3.2-nv-rerankqa-1b-v2)
nvidia / llama-nemotron-embed-vl-1b-v2 | Creates an embedding vector from the input text (llama-nemotron-embed-vl-1b-v2)
nvidia / nvclip | Creates an embedding vector representing the input text or image (nvclip)
nvidia / nv-embed-v1 | Creates an embedding vector from the input text (nv-embed-v1)
nvidia / nv-embedcode-7b-v1 | Creates an embedding vector from the input text (nv-embedcode-7b-v1)
nvidia / nv-embedqa-e5-v5 | Creates an embedding vector from the input text (nv-embedqa-e5-v5)
nvidia / nv-rerankqa-mistral-4b-v3 | Rank passages by their relation to a query (nv-rerankqa-mistral-4b-v3)
nvidia / rerank-qa-mistral-4b | Create ranking (rerank-qa-mistral-4b)
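The listing covers two kinds of endpoints: ones that create an embedding vector from input text and ones that rank passages by their relation to a query. The sketch below shows what request bodies for each kind might look like. Every field name here (`model`, `input`, `input_type`, `query`, `passages`, `text`) and the slash-joined model identifier form are assumptions for illustration, not taken from this page; check each endpoint's own reference for the actual schema. The helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def build_embedding_request(
    model: str, texts: List[str], input_type: str = "query"
) -> Dict[str, Any]:
    """Assumed body for an embedding endpoint: a model id plus the texts to embed.
    The `input_type` field (query vs. passage) is an assumption."""
    return {"model": model, "input": texts, "input_type": input_type}


def build_rerank_request(
    model: str, query: str, passages: List[str]
) -> Dict[str, Any]:
    """Assumed body for a reranking endpoint: one query and the candidate passages."""
    return {
        "model": model,
        "query": {"text": query},
        "passages": [{"text": passage} for passage in passages],
    }


if __name__ == "__main__":
    # Model ids below reuse names from the listing; the "publisher/name" form is assumed.
    embed_body = build_embedding_request(
        "nvidia/nv-embedqa-e5-v5", ["How do I reset my password?"]
    )
    rerank_body = build_rerank_request(
        "nvidia/nv-rerankqa-mistral-4b-v3",
        "How do I reset my password?",
        ["Open Settings and choose Reset password.", "Billing is monthly."],
    )
    print(json.dumps(embed_body, indent=2))
    print(json.dumps(rerank_body, indent=2))
```

Keeping payload construction in small pure functions like these makes it easy to swap models from the listing without touching the code that sends the request.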

snowflake
