Creates a model response for the given chat conversation.

Given a list of messages comprising a conversation, the model will return a response. Compatible with OpenAI. See https://platform.openai.com/docs/api-reference/chat/create

Log in to see full request history
Body Params
string
Defaults to microsoft/phi-3.5-moe-instruct
messages
array of objects
required

A list of messages comprising the conversation so far. The roles of the messages must be alternating between user and assistant. The last input message should have role user. A message with the the system role is optional, and must be the very first message if it is present; context is also optional, but must come before a user question.

Messages*
string
required

The role of the message author.

string
required

The contents of the message.

number
≤ 1
Defaults to 0.2

The sampling temperature to use for text generation. The higher the temperature value is, the less deterministic the output text will be. It is not recommended to modify both temperature and top_p in the same call.

number
≤ 1
Defaults to 0.7

The top-p sampling mass used for text generation. The top-p value determines the probability mass that is sampled at sampling time. For example, if top_p = 0.2, only the most likely tokens (summing to 0.2 cumulative probability) will be sampled. It is not recommended to modify both temperature and top_p in the same call.

integer
1 to 4096
Defaults to 1024

The maximum number of tokens to generate in any given call. Note that the model is not aware of this value, and generation will simply stop at the number of tokens specified.

Defaults to null

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

boolean
Defaults to false

If set, partial message deltas will be sent. Tokens will be sent as data-only server-sent events (SSE) as they become available (JSON responses are prefixed by data: ), with the stream terminated by a data: [DONE] message.

A string or a list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence.

A word or list of words not to use. The words are case sensistive.

Responses

Language
Credentials
Click Try It! to start a request and see the response here! Or choose an example:
application/json
text/event-stream