Tokens and Tokenisation
A practical guide to understanding how AI models process text and why tokens matter for product teams.
What tokens are, how tokenisation affects AI behaviour and cost, and what designers and product teams need to know when building AI features.
What it is
Tokens are the units of text that AI language models work with. Rather than processing whole words, models break text down into smaller pieces called tokens before processing it.
A token is roughly equivalent to three or four characters of text. Common words like "the" or "is" are typically a single token. Longer or unusual words may be split into multiple tokens. Spaces, punctuation, and formatting also consume tokens.
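The rule of thumb above can be turned into a quick estimate. The sketch below is only a heuristic based on the three-to-four-characters figure; `estimate_tokens` and its `chars_per_token` parameter are illustrative names, not part of any real tokeniser, and actual counts will differ by model.

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a real tokeniser)."""
    if not text:
        return 0
    # Spaces and punctuation are included in len(text), as they also consume tokens.
    return math.ceil(len(text) / chars_per_token)

sample = "Tokens are the units of text that AI language models work with."
print(estimate_tokens(sample))        # assuming 4 characters per token
print(estimate_tokens(sample, 3.0))   # assuming 3 characters per token
```

Treat the result as a planning figure only. For billing or hard limits, count with the tokeniser of the model you actually use.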
Tokenisation is the process of converting text into this sequence of tokens before it is passed to the model.
Every part of an interaction (the system prompt, the conversation history, the user message, and the model's response) is counted in tokens. This affects both the cost of using AI and the amount of content that fits within a context window.
Understanding tokens helps you reason about cost, context limits, and why AI sometimes handles unusual words or languages differently.
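As a rough illustration of how the parts of an interaction add up against a context window, the sketch below sums four token counts and checks them against a limit. The `Interaction` class, the example counts, and the 4,096-token window are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    system_prompt_tokens: int
    history_tokens: int
    user_message_tokens: int
    response_tokens: int

    def total(self) -> int:
        # Every part of the interaction counts towards the total.
        return (self.system_prompt_tokens + self.history_tokens
                + self.user_message_tokens + self.response_tokens)

def fits_context(interaction: Interaction, context_window: int) -> bool:
    """True if the whole interaction fits within the given context window."""
    return interaction.total() <= context_window

# Hypothetical token counts and window size.
example = Interaction(system_prompt_tokens=800, history_tokens=2500,
                      user_message_tokens=150, response_tokens=500)
print(example.total(), fits_context(example, context_window=4096))
```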
When to use it
Understand when token awareness is practically useful. It matters most when:
It matters less when:
Key takeaway
Tokens are the currency of AI usage. Understanding them helps you manage cost and design features that behave reliably at scale.
How it works
Understand the basic mechanism. Before processing any text, a language model passes it through a tokeniser, a component that splits the text into the token sequences the model understands.
Different models use different tokenisers, so the same text can produce different token counts depending on which model is used.
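To make the idea concrete, here is a toy sketch with two deliberately simple splitting rules. Neither `word_tokenise` nor `chunk_tokenise` resembles a production tokeniser (real ones learn subword vocabularies from data); they only show that the splitting rule determines the count.

```python
import re

def word_tokenise(text: str) -> list[str]:
    """Toy rule A: one token per word or punctuation mark."""
    return re.findall(r"\w+|[^\w\s]", text)

def chunk_tokenise(text: str, size: int = 4) -> list[str]:
    """Toy rule B: fixed-size character chunks, spaces included."""
    return [text[i:i + size] for i in range(0, len(text), size)]

text = "Tokenisation matters."
print(len(word_tokenise(text)), word_tokenise(text))
print(len(chunk_tokenise(text)), chunk_tokenise(text))
```

The same sentence yields a different number of tokens under each rule, which is why counts measured for one model should not be assumed for another.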
When AI providers charge for usage, they typically charge per token, both for the input (everything sent to the model) and the output (the response generated). Understanding this helps you estimate and control costs.
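A minimal sketch of the arithmetic, assuming input and output tokens are priced separately per million tokens. The prices and request sizes below are hypothetical placeholders, not any provider's actual rates.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Cost of one request, with input and output tokens priced separately."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000

# Hypothetical prices in USD per million tokens; check your provider's price list.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

per_request = estimate_cost(1_200, 300, INPUT_PRICE, OUTPUT_PRICE)
print(f"One request: ${per_request:.4f}")
print(f"One million requests: ${per_request * 1_000_000:,.2f}")
```

Running the same calculation at expected production volume is a quick way to see whether a cost that looks negligible per request stays negligible at scale.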
What this means for designers and product teams. Long system prompts, large documents, and verbose responses all cost more and consume more of the context window. Concise, well-structured prompts are both cheaper and more effective.
Multilingual content tokenises differently. Languages written in non-Latin scripts, such as Chinese or Arabic, often produce more tokens for the same content than English, largely because most tokenisers are trained on predominantly English text. This affects both cost and context usage.
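One contributing factor can be seen without a tokeniser at all. Many tokenisers work on UTF-8 bytes before merging them into tokens, and non-Latin characters take more bytes each. The sketch below only counts bytes; it assumes a byte-level tokeniser and does not predict actual token counts, which depend on the merges a given tokeniser has learned.

```python
def utf8_byte_count(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# Short greetings in three scripts (illustrative samples).
samples = {"English": "hello", "Arabic": "مرحبا", "Chinese": "你好"}

for language, word in samples.items():
    print(language, "characters:", len(word), "bytes:", utf8_byte_count(word))
```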
What to look for
Focus on:
Where it goes wrong
Most issues come from: Token costs at small scale look trivial — at production scale, they are not.
What you get from it
Understanding tokens gives you:
Key takeaway
Tokens are not just a technical detail — they are a cost and design constraint that should be factored in from the start.
FAQ
Common questions
A few practical answers to the questions that usually come up around this topic.
What is a token in AI?
A token is the basic unit of text that an AI model processes. Rather than reading whole words, models work with tokens of roughly three or four characters each. A sentence of ten words might contain around twelve to fifteen tokens, depending on the words used.
Why do AI companies charge per token?
Because processing tokens is the main computational cost of running a language model. The more tokens in an input and output, the more computation required. Charging per token is a direct proxy for the cost of generating a response.
How many tokens is a typical page of text?
A standard page of English text contains roughly 500 words, which translates to approximately 600 to 750 tokens. This varies depending on the complexity of the vocabulary, punctuation, and formatting.
Does it cost more to process long conversations?
Yes. Every part of the conversation, including the full history, is passed to the model with each new message. As a conversation grows, the token count for each interaction increases, meaning each subsequent message costs more to process than the one before it.
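The growth can be sketched as follows, assuming the full history is resent on every turn and that every user message and reply has the same size. The function name and all token counts are hypothetical.

```python
def input_tokens_per_turn(system_tokens: int, user_tokens: int,
                          reply_tokens: int, turns: int) -> list[int]:
    """Input tokens sent on each turn when the full history is resent."""
    per_turn = []
    history = 0
    for _ in range(turns):
        # Each request carries the system prompt, all prior turns, and the new message.
        per_turn.append(system_tokens + history + user_tokens)
        # After the reply, both the message and the reply join the history.
        history += user_tokens + reply_tokens
    return per_turn

growth = input_tokens_per_turn(system_tokens=200, user_tokens=50,
                               reply_tokens=150, turns=4)
print(growth)        # input tokens per turn
print(sum(growth))   # total input tokens across the conversation
```

Because each turn resends everything before it, total input tokens grow faster than the number of turns, which is why long conversations are worth trimming or summarising.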
Can I control how many tokens a model uses?
To a degree. You can set maximum output length limits to prevent the model from generating unnecessarily long responses. You can also write concise prompts and manage what context is included. Beyond that, the model determines how many tokens it needs to generate a response.
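One way to manage what context is included is to keep only the most recent messages that fit a token budget. The sketch below is one possible approach, not a standard method; `trim_history` and `rough_count` are hypothetical helpers, and the four-characters-per-token estimate is an assumption that a real tokeniser should replace.

```python
import math

def rough_count(text: str) -> int:
    """Heuristic token estimate: about four characters per token (assumption)."""
    return math.ceil(len(text) / 4)

def trim_history(messages: list[str], max_input_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for message in reversed(messages):
        cost = rough_count(message)
        if used + cost > max_input_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = ["a" * 40, "b" * 40, "c" * 40]   # about 10 tokens each
print(len(trim_history(history, max_input_tokens=25)))
```

Dropping the oldest messages is the simplest policy. Summarising older turns is an alternative when early context still matters.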
Quick take
Understanding tokens helps you understand AI costs, context limits, and occasionally why AI behaves unexpectedly with certain inputs.