Creates an embedding vector from the input text.

Body Params
required

Input text to embed. Max length is 32k tokens.

string
required

ID of the embedding model.

string
enum

nvidia/llama-3.2-nemoretriever-300m-embed-v2 operates in either passage or query mode and therefore requires the input_type parameter. Use passage when generating embeddings during indexing, and query when generating embeddings during querying. It is very important to use the correct input_type; failure to do so will result in large drops in retrieval accuracy.

Allowed: passage, query
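The two modes above can be sketched as request bodies built client-side. This is a minimal sketch: only model, input, and input_type come from this reference; the helper name and surrounding code are illustrative assumptions.

```python
import json

MODEL = "nvidia/llama-3.2-nemoretriever-300m-embed-v2"

def embed_body(text: str, input_type: str) -> str:
    """Build a JSON request body for the embeddings endpoint (sketch).

    input_type must be "passage" (index time) or "query" (query time);
    mixing them up degrades retrieval accuracy.
    """
    if input_type not in ("passage", "query"):
        raise ValueError("input_type must be 'passage' or 'query'")
    return json.dumps({"model": MODEL, "input": text, "input_type": input_type})

# Index-time embedding of a document chunk:
doc_body = embed_body("The Eiffel Tower is in Paris.", "passage")
# Query-time embedding of a user question:
query_body = embed_body("Where is the Eiffel Tower?", "query")
```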
string
enum
Defaults to float

The format to return the embeddings in.

string
enum
Defaults to NONE

Specifies how inputs longer than the maximum token length of the model are handled. Passing START discards the start of the input, and END discards the end; in both cases, input is discarded until what remains is exactly the maximum input token length for the model. If NONE is selected, an error is returned when the input exceeds the maximum input token length.

Allowed: NONE, START, END
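The truncation behavior described above can be sketched over a plain token list. This is a simplified client-side model of the documented semantics; the actual tokenization and truncation happen server-side.

```python
def truncate_tokens(tokens, max_len, mode="NONE"):
    """Apply the documented truncate semantics to a token sequence.

    START drops tokens from the start and END from the end, until exactly
    max_len tokens remain; NONE raises when the input is too long.
    """
    if len(tokens) <= max_len:
        return tokens  # within the limit: nothing to discard
    if mode == "START":
        return tokens[len(tokens) - max_len:]   # discard the start
    if mode == "END":
        return tokens[:max_len]                 # discard the end
    raise ValueError(
        f"input has {len(tokens)} tokens, exceeding the {max_len}-token limit"
    )
```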
string

Not implemented, but provided for API compliance. This field is ignored.
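Putting the body params together, a full request might look like the following sketch. The base URL is a placeholder and the exact auth header format is an assumption (the page lists Bearer credentials); the body fields are the ones documented above.

```python
import json
import urllib.request

payload = {
    "model": "nvidia/llama-3.2-nemoretriever-300m-embed-v2",
    "input": "What is retrieval-augmented generation?",
    "input_type": "query",        # use "passage" at index time
    "encoding_format": "float",   # the documented default
    "truncate": "NONE",           # the documented default: error on over-long input
}

# Placeholder endpoint URL; substitute your deployment's embeddings route.
req = urllib.request.Request(
    "https://example.invalid/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer $API_KEY",  # Bearer token, per this page
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # not executed in this sketch
```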

Responses

application/json