Create Completion

Completion API that mimics the OpenAI Completion API.

See https://platform.openai.com/docs/api-reference/completions/create
for the API specification.

Body Params
string
Defaults to nvidia/mistral-nemo-minitron-8b-base

Name of target model.

User prompt.

1 to 1024

Maximum number of tokens to generate.

0 to 2

Controls randomness by scaling the logits: a higher value increases variety, and a lower value makes sampling less diverse.
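As an illustration of the logit scaling described above, here is a minimal sketch of the usual way such a value is applied: dividing the logits by it before the softmax. The function name and the example logits are hypothetical, and the page does not say how the server handles a value of 0, so this sketch rejects it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities after dividing the logits by temperature."""
    if temperature <= 0:
        # Assumption: a value of 0 is not covered by this sketch.
        raise ValueError("temperature must be positive for this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]                      # hypothetical logits for three tokens
cool = softmax_with_temperature(logits, 0.5)  # sharper distribution
warm = softmax_with_temperature(logits, 2.0)  # flatter distribution
```

With the lower value, more probability mass lands on the top token; with the higher value, the mass spreads out across the candidates.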

0 to 1

Also known as nucleus sampling: the cumulative probability cutoff for token selection. Using a lower value means sampling from a smaller set of candidates. This is done by sorting the candidate tokens from most to least likely and collecting them one by one until their cumulative probability exceeds the top_p value.
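The selection step described above can be sketched as follows. This is an illustrative sketch, not the server's implementation: the function name and probabilities are hypothetical, and the handling of a cumulative sum that lands exactly on the cutoff is an assumption.

```python
def nucleus_candidates(probs, top_p):
    """Return indices of the most likely tokens whose cumulative probability reaches top_p."""
    # Sort token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # Assumption: stop as soon as the running sum reaches the cutoff.
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]            # hypothetical next-token probabilities
small = nucleus_candidates(probs, 0.7)    # keeps the two most likely tokens
large = nucleus_candidates(probs, 0.9)    # keeps the three most likely tokens
```

Sampling then happens only among the kept indices, which is why a lower top_p narrows the candidate set.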

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

-2 to 2

Indicates how much to penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. Higher values increase the penalty. See https://platform.openai.com/docs/api-reference/completions/create for details.

-2 to 2

Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
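To make the difference between the two penalties concrete, here is a minimal sketch that follows the adjustment described in the OpenAI documentation: the frequency penalty scales with how many times a token has already appeared, while the presence penalty is applied once to any token that has appeared at all. Whether this server uses the same formula is an assumption, and the function and argument names are hypothetical.

```python
from collections import Counter

def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated_ids)
    adjusted = list(logits)
    for token_id, count in counts.items():
        # Frequency term grows with the count; presence term is a flat one-time cost.
        adjusted[token_id] -= count * frequency_penalty + presence_penalty
    return adjusted

logits = [1.0, 1.0, 1.0]   # hypothetical logits for token ids 0, 1, 2
adjusted = apply_penalties(logits, [0, 0, 1],
                           frequency_penalty=0.5, presence_penalty=0.25)
```

Token 0, seen twice, is penalized more than token 1, seen once, and token 2 is untouched. Negative values would raise the logits of seen tokens instead.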

0 to 18446744073709552000

The model generates random results. Changing the input seed alone will produce a different response with similar characteristics. It is possible to reproduce results by fixing the input seed (assuming all other hyperparameters are also fixed).
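The following sketch assembles a request body that respects the ranges listed above. The field names (`model`, `prompt`, `max_tokens`, `temperature`, `top_p`, `stop`, `frequency_penalty`, `presence_penalty`, `seed`) are assumptions taken from the OpenAI Completion API that this endpoint mimics; only `top_p` is named on this page. The helper name is hypothetical, and the seed upper bound is assumed to be the unsigned 64-bit maximum, which the page shows in rounded form.

```python
DEFAULT_MODEL = "nvidia/mistral-nemo-minitron-8b-base"

def build_completion_request(prompt, model=DEFAULT_MODEL, max_tokens=None,
                             temperature=None, top_p=None, stop=None,
                             frequency_penalty=None, presence_penalty=None,
                             seed=None):
    """Build a JSON-serializable body, checking each value against its documented range."""
    ranges = {
        "max_tokens": (max_tokens, 1, 1024),
        "temperature": (temperature, 0, 2),
        "top_p": (top_p, 0, 1),
        "frequency_penalty": (frequency_penalty, -2, 2),
        "presence_penalty": (presence_penalty, -2, 2),
        "seed": (seed, 0, 2**64 - 1),  # assumed unsigned 64-bit maximum
    }
    body = {"model": model, "prompt": prompt}
    for name, (value, low, high) in ranges.items():
        if value is None:
            continue  # omit unset fields so the server applies its defaults
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        body[name] = value
    if stop is not None:
        if len(stop) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
        body["stop"] = list(stop)
    return body

payload = build_completion_request("Hello", max_tokens=16, temperature=0.7,
                                   stop=["\n"], seed=42)
```

The resulting dictionary can be serialized with `json.dumps` and sent as the POST body. The endpoint URL and authentication details are not given in this section, so they are left out of the sketch.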
