Tokenization breaks text into small building blocks: common words become a single token, while rare or long words split into several. As a rule of thumb, 1,000 tokens of English text correspond to about 750 words. Both the input (prompt) and the output count toward token usage, and API costs are billed per token. The number of tokens a model can handle in one request is limited by its context window.
Token
A token is the smallest unit of text an AI model processes, usually a word or word fragment averaging about four characters in English. Models count, bill, and limit text in tokens rather than words.
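The arithmetic behind the rule of thumb can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: it counts whitespace-separated words instead of running a real tokenizer, and the function names, the price, and the context window size are hypothetical values chosen for illustration, not figures from any particular provider.

```python
# Rule of thumb from the text: 1,000 tokens correspond to about 750 English words.
RULE_TOKENS = 1000
RULE_WORDS = 750


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the word count (rounded up).

    Assumption: words are split on whitespace; a real tokenizer will differ.
    """
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises.
    return (words * RULE_TOKENS + RULE_WORDS - 1) // RULE_WORDS


def estimate_cost(prompt_tokens: int, output_tokens: int,
                  price_per_1k_tokens: float) -> float:
    """Input and output tokens both count toward the bill.

    Assumption: one hypothetical price applies to input and output alike.
    """
    return (prompt_tokens + output_tokens) / 1000 * price_per_1k_tokens


def fits_context(prompt_tokens: int, output_tokens: int,
                 context_window: int) -> bool:
    """Prompt and output together must fit inside the context window."""
    return prompt_tokens + output_tokens <= context_window


if __name__ == "__main__":
    prompt = "word " * 750  # a 750-word placeholder prompt
    n = estimate_tokens(prompt)
    print(n)
    print(estimate_cost(n, 1000, price_per_1k_tokens=0.5))  # hypothetical price
    print(fits_context(n, 1000, context_window=8192))  # hypothetical window
```

For exact counts, the tokenizer that belongs to the model in question should be used, since the number of tokens per word varies with language and vocabulary.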
