Creates a model response for the given chat conversation.

Given a list of messages comprising a conversation, the model returns a response. The endpoint is compatible with the OpenAI Chat Completions API. See https://platform.openai.com/docs/api-reference/chat/create

Body Params
model (string)
Defaults to nvidia/usdcode-llama-3.1-70b-instruct
max_tokens (integer, 1 to 2048)
Defaults to 1024

The maximum number of tokens to generate in any given call. Note that the model is not aware of this value, and generation will simply stop at the number of tokens specified.

stream (boolean)
Defaults to false

If set, partial message deltas will be sent. Tokens will be sent as data-only server-sent events (SSE) as they become available (JSON responses are prefixed by data: ), with the stream terminated by a data: [DONE] message.
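When streaming is enabled, the client consumes data-only SSE lines until the `data: [DONE]` sentinel. A minimal sketch of that loop, using canned lines in place of a live HTTP response (the helper name is illustrative):

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON payloads from data-only SSE lines.

    Stops when the terminating 'data: [DONE]' sentinel arrives.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

# Canned delta events standing in for a streamed response body:
lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_sse_chunks(lines))
# text == "Hello"
```

In a real client the same loop would iterate over the lines of the HTTP response body instead of a hard-coded list.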

temperature (number, 0 to 1)
Defaults to 0.1

The sampling temperature to use for text generation. Higher temperature values produce less deterministic output. It is not recommended to modify both temperature and top_p in the same call.
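To illustrate why lower temperatures make output more deterministic, here is a generic temperature-scaled softmax sketch (not NVIDIA's implementation; sampling happens server-side):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before normalizing; as the
    # temperature shrinks, the distribution sharpens toward the argmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.1)  # near one-hot
warm = softmax_with_temperature(logits, 1.0)  # closer to uniform
# cool[0] > warm[0]: the low-temperature distribution is more peaked.
```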

top_p (number, ≤ 1)
Defaults to 1

The top-p sampling mass used for text generation: at each step, sampling is restricted to the smallest set of most-likely tokens whose cumulative probability reaches top_p. For example, if top_p = 0.2, only the most likely tokens summing to 0.2 cumulative probability will be sampled. It is not recommended to modify both temperature and top_p in the same call.
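The nucleus selection described above can be sketched as follows (illustrative only; the actual sampling is performed server-side):

```python
def nucleus(probs, top_p):
    """Keep the most likely tokens whose cumulative probability
    reaches top_p; everything else is excluded from sampling."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "an": 0.15, "xyzzy": 0.05}
print(nucleus(probs, 0.2))  # ['the'] -- the top token alone already covers 0.2
```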

expert_type (enum)
Allowed: knowledge, code, helperfunction, auto
Defaults to auto

The type of expert to use for the completion. When knowledge is passed, the model answers as a USD knowledge expert. When code is selected, the model responds with vanilla OpenUSD code. If helperfunction is chosen, it uses high-level helper functions to produce the code response. When auto is set, the LLM determines which expert type to use. If not specified, the model's default expert is used.
messages (array of objects, required)

A list of messages comprising the conversation so far.
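A minimal request body using the parameters above might be built like this. This is a sketch only: the `expert_type` field name is an assumption and should be checked against the live API schema, and the example prompt is hypothetical.

```python
import json

payload = {
    "model": "nvidia/usdcode-llama-3.1-70b-instruct",
    "max_tokens": 1024,
    "temperature": 0.1,     # leave top_p at its default; avoid tuning both
    "stream": False,
    "expert_type": "auto",  # assumed name of the expert-selection field
    "messages": [
        {"role": "user", "content": "How do I create a Sphere prim in OpenUSD?"},
    ],
}
body = json.dumps(payload)
# Send `body` as the JSON request body, with an Authorization: Bearer <key> header.
```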

Responses

Authentication: Bearer token credentials.
Response content type: application/json.