Token Limit
Overview:
A Token Limit defines the maximum amount of text an AI model can process within a single request. This includes both the input (prompt, instructions, conversation history) and the output (model response). Token limits determine how much context the model can consider at once and directly affect performance, cost, and usability.
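Because the input and the output share one limit, a request only fits if their combined token count stays under it. The sketch below illustrates that rule; the 8,000-token limit and the counts are made-up example values, not figures from any specific model.

```python
# Hypothetical check: input tokens plus reserved output tokens must
# both fit inside a single shared limit. All numbers are illustrative.
TOKEN_LIMIT = 8000

def fits_within_limit(input_tokens: int, max_output_tokens: int,
                      limit: int = TOKEN_LIMIT) -> bool:
    """Input and output both count against the same per-request limit."""
    return input_tokens + max_output_tokens <= limit

print(fits_within_limit(6000, 1500))  # True: 7500 <= 8000
print(fits_within_limit(7000, 1500))  # False: 8500 > 8000
```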
Understanding Tokens
A Token is the smallest unit of information an AI model processes. Tokens are used to interpret inputs and generate outputs across different modalities such as text, speech, and images. Token usage helps determine how much content the model can handle within a single interaction.
Token consumption is calculated after the system completes the following actions:
Chat Response Generation: Tokens used by the AI to understand the prompt and produce a text reply.
Speech‑to‑Text Processing: Tokens generated when audio input is converted into text.
Text‑to‑Speech Output: Tokens consumed when the system converts text into spoken audio.
Conversation Title Generation: Tokens used when automatically creating a short, descriptive title for the conversation.
Image Generation: Tokens required when creating images based on user prompts.
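The actions above can be sketched as a running tally: each completed action contributes its token cost to the user's total consumption. The action names and per-event counts below are illustrative, not values from the product.

```python
# Illustrative tally of token consumption per action type, mirroring the
# list above. The counts recorded here are made-up example values.
from collections import Counter

usage = Counter()

def record_usage(action: str, tokens: int) -> None:
    """Add a completed action's token cost to the running totals."""
    usage[action] += tokens

record_usage("chat_response", 450)
record_usage("speech_to_text", 120)
record_usage("conversation_title", 15)

print(dict(usage))
print(sum(usage.values()))  # total tokens consumed so far: 585
```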
Why Token Limits Matter
Token limits define how much information an AI system can process at once. Understanding these limits helps users get smoother, more accurate, and more efficient interactions. Here’s why token limits are important:
Maintain Consistent and Accurate Responses: When your input stays within the token limit, the AI can fully understand the context and generate complete, relevant answers. This helps avoid missing details or producing cut‑off responses.
Prevent Errors and Interruptions: If a request exceeds the token limit, the system may return errors or automatically remove earlier parts of the conversation. Managing token usage helps keep interactions stable and prevents unexpected behavior.
Improve Overall Quality: Shorter, well‑structured prompts allow the AI to focus on what matters most. This leads to clearer, more accurate, and more useful responses.
Enable Better Application Design: For developers or teams building on top of AI systems, token limits guide decisions like how much conversation history to keep, when to summarize content, and how to structure workflows.
Optimize Usage and Cost: Many AI platforms calculate costs based on tokens used. Staying efficient with tokens reduces unnecessary consumption and helps manage usage more effectively.
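One common history-management strategy mentioned above is keeping only as much recent conversation as fits the token budget. This is a hedged sketch of that idea; `count_tokens` is a crude stand-in, as real systems use a model-specific tokenizer.

```python
# Sketch of one history-management strategy: drop the oldest turns until
# the conversation fits a token budget. `count_tokens` is a rough
# placeholder (about one token per word), not a real tokenizer.

def count_tokens(text: str) -> int:
    return len(text.split())

def trim_history(turns: list[str], budget: int) -> list[str]:
    """Keep the most recent turns whose combined token count fits the budget."""
    kept: list[str] = []
    total = 0
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))

history = ["hello there friend", "how are you", "tell me about tokens"]
print(trim_history(history, budget=7))  # keeps the two most recent turns
```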
How Token Usage and Limits Are Shown in the User Interface
The token limit indicator is displayed in the application header, located on the right side next to the Help and Status icons. It provides users with a quick view of their remaining token allocation.
When users hover over the percentage value, a tooltip appears with additional details explaining their current token usage.
Token usage is refreshed automatically every 5 minutes and is also updated immediately after the following user actions:
Chat response completion
Speech‑to‑text processing
Text‑to‑speech output
Conversation title generation

Balance Threshold
When a user's token usage reaches the configured percentage threshold, a notification appears to inform them that they are approaching their token limit.

Token Limit Exceeded
When a user surpasses their allocated token limit, the system displays an error snackbar notification informing them that their token balance has been exhausted. At this point, the user will no longer be able to continue chatting with the assistant until their balance is reset.
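The two behaviors above (a warning near the threshold, and a block once the limit is exhausted) can be sketched as a simple state check. The 80% threshold below is a made-up example of the configurable value.

```python
# Illustrative state check combining the threshold warning and the
# exhausted-balance block described above. The 0.80 threshold is an
# assumed example of the configurable percentage.

def usage_state(used: int, limit: int, warn_threshold: float = 0.80) -> str:
    """Return 'ok', 'warning', or 'exhausted' based on token usage."""
    if used >= limit:
        return "exhausted"   # show error snackbar, chatting is blocked
    if used >= limit * warn_threshold:
        return "warning"     # show approaching-limit notification
    return "ok"

print(usage_state(500, 1000))   # ok
print(usage_state(850, 1000))   # warning
print(usage_state(1000, 1000))  # exhausted
```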


Token request process
Prerequisite: Only users with the Team Admin or Team Owner role can request additional tokens for a team member. Alternatively, a Support team member can assist users who are not part of any team.
Steps to raise a token request:
Open the Admin Center.
Navigate to Teams and select the relevant team.
Click the + symbol next to the member who requires an additional token.

Click Request in the popup to submit the request.

A success message appears confirming that the request has been submitted.
