Creates a model response for the given chat conversation.

Given a list of messages comprising a conversation, the model returns a response. The endpoint is compatible with the OpenAI Chat Completions API. See https://platform.openai.com/docs/api-reference/chat/create

Body Params

model
string
Defaults to nvidia/llama-3.1-nemotron-safety-guard-8b-v3

The ID of the model to use.

messages
array of objects
required

A list of messages comprising the conversation so far. The roles of the messages must alternate between user and assistant, and the last message must have the role user. A message with the system role is optional and, if present, must be the very first message; context is also optional, but must come before a user question.

temperature
number
0 to 1
Defaults to 0

The sampling temperature to use for text generation. The higher the temperature, the less deterministic the output text.

stream
boolean
Defaults to false

If set, partial message deltas are sent. Tokens are delivered as data-only server-sent events (SSE) as they become available (each JSON chunk is prefixed by "data: "), and the stream is terminated by a "data: [DONE]" message.
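The body parameters above can be assembled into a request as in the following sketch. The endpoint URL, API key handling, and the helper names `build_payload` and `send` are assumptions for illustration; adjust them for your deployment.

```python
# Sketch of a non-streaming request to the OpenAI-compatible chat
# completions endpoint described above. API_URL is an assumed value.
import json
import urllib.request

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed endpoint


def build_payload(user_text, temperature=0.0, stream=False):
    """Assemble a request body using the documented body params."""
    return {
        # Default model from the docs above.
        "model": "nvidia/llama-3.1-nemotron-safety-guard-8b-v3",
        "messages": [
            # Optional system message must come first.
            {"role": "system", "content": "You are a content safety classifier."},
            # The last message must have the role "user".
            {"role": "user", "content": user_text},
        ],
        "temperature": temperature,  # 0 to 1, defaults to 0
        "stream": stream,            # defaults to false
    }


def send(payload, api_key):
    """POST the payload with the documented headers (Bearer auth)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_payload("Is this prompt safe to answer?")
```

Calling `send(payload, api_key)` with a valid key returns the parsed JSON completion.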

Headers

Accept
string
enum
Defaults to application/json

The response content type. Allowed values: application/json, text/event-stream.
Responses

Response content types: application/json (non-streaming), text/event-stream (streaming).

Authentication: Bearer token.
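When stream is true, the text/event-stream response consists of data-only SSE lines as described above. A minimal sketch of decoding them, stopping at the terminating "data: [DONE]" message (the chunk payloads shown are fabricated examples of the OpenAI-style delta shape, not captured output):

```python
# Parse data-only SSE lines from a streaming chat completion response.
import json


def parse_sse_lines(lines):
    """Yield decoded JSON chunks from "data: ..." lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separator / keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)


# Illustrative chunk payloads in the OpenAI-style streaming delta shape:
chunks = list(parse_sse_lines([
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]))
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
```

In a real client the lines would come from iterating over the HTTP response body rather than a hard-coded list.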