Max Tokens

Max tokens is a parameter that caps the length of a model's output, measured in tokens. It prevents overly long answers and helps control cost and latency.

The max tokens parameter sets the maximum number of tokens the model may generate before it stops. When the limit is reached, generation ends immediately, possibly mid-sentence. The value must fit within the model's context window, because input tokens and output tokens together cannot exceed that window. A sensible cap reduces cost and latency, but it should be large enough for the model to finish a complete answer.
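
The context-window constraint can be sketched as simple arithmetic. The snippet below is a minimal illustration, not any provider's API: the function name clamp_max_tokens, its parameters, and the window size and token counts in the example are all hypothetical, and it assumes the input token count is already known.

```python
def clamp_max_tokens(context_window: int, input_tokens: int, requested: int) -> int:
    """Return the largest output cap that still fits the context window.

    All names and numbers here are illustrative assumptions.
    """
    if input_tokens >= context_window:
        # No room is left for any output at all.
        raise ValueError("input alone fills or exceeds the context window")
    remaining = context_window - input_tokens  # tokens left for the output
    return min(requested, remaining)


# Hypothetical 8,192-token window with a 6,000-token prompt:
# a requested cap of 4,096 has to shrink to the 2,192 tokens that remain.
print(clamp_max_tokens(8192, 6000, 4096))  # 2192

# A short prompt leaves plenty of room, so the requested cap is kept.
print(clamp_max_tokens(8192, 1000, 512))   # 512
```

If the clamped value turns out to be much smaller than the requested cap, the answer is more likely to be cut off, which may be a signal to shorten the input instead.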
