Tokens, credits and the new economics of AI
Seat-based SaaS pricing assumed something simple: more users, more value, more cost. AI broke that assumption. The minute copilots and agents started doing the work, value stopped scaling with logins and started scaling with activity. Tokens, credits, context windows, retry loops, agent chains: every click now has a meter behind it.
Analyst commentary lands in roughly the same place. AI decouples value from headcount and pushes vendors toward usage, agent and outcome-based monetization, usually through hybrid models that protect a little predictability for both sides.
This is part two of our series on AI economics (you can read part one here). Here’s what we’ll cover:
- The core AI consumption metrics vendors actually bill on
- Why enterprise AI bills spike without warning
- How SaaS vendors monetize AI under the hood
- The pricing patterns showing up across the market
- What leaders need to do now to stay in control
AI consumption metrics: the new billing units
AI brings a stack of new billable units, and they behave nothing like seats. Gartner’s framing is useful here: AI agents shift value delivery from access to actions and outcomes, which forces a new pricing logic on vendors and a new buying logic on customers.
Here are the categories you’ll see most often.
Tokens: the base unit of LLM cost
Tokens are the atomic billing unit for most LLM-based services, and the economics are anything but linear. Prompt design, context length and model choice all swing the cost in different directions.
What drives token spend:
- Input and output tokens are both metered
- Larger context windows balloon consumption
- Agent systems often fire multiple calls per user action
- Reasoning models and multi-step workflows multiply token use fast
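To make those drivers concrete, here is a minimal sketch of the arithmetic. The prices, token counts and the ModelPrice, call_cost and action_cost names are all illustrative assumptions, not any vendor's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical per-million-token prices in USD (not real vendor rates)."""
    input_per_million: float
    output_per_million: float


def call_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one model call; input and output are metered separately."""
    return (input_tokens * price.input_per_million
            + output_tokens * price.output_per_million) / 1_000_000


def action_cost(price: ModelPrice, input_tokens: int, output_tokens: int,
                calls_per_action: int = 1) -> float:
    """Cost of one user action that fans out into several identical calls."""
    return calls_per_action * call_cost(price, input_tokens, output_tokens)


# Illustrative numbers only.
standard = ModelPrice(input_per_million=2.0, output_per_million=8.0)

short_prompt = action_cost(standard, input_tokens=1_000, output_tokens=500)
long_context = action_cost(standard, input_tokens=50_000, output_tokens=500)
agent_action = action_cost(standard, input_tokens=50_000, output_tokens=500,
                           calls_per_action=4)

print(f"short prompt: ${short_prompt:.3f}")
print(f"long context: ${long_context:.3f}")
print(f"agent action: ${agent_action:.3f}")
```

Under these made-up numbers, stuffing the context window makes the same answer roughly 17 times more expensive, and an agent that fires four such calls per click multiplies that again.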
Credits: the AI currency layer
Many SaaS vendors wrap tokens or compute into a credit abstraction. It simplifies packaging, smooths adoption and supports hybrid pricing by blending predictable subscription value with flexible consumption.
Why vendors lean on credits:
- They hide the messy infrastructure math
- They support bundled allowances with monetized overages
- They buy time while AI value (and AI pricing) stabilizes
The enterprise risk is transparency. Credit expirations, resets and rollover rules can silently create overages or strand value, and finance teams often can’t map a credit back to a workload.
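A small sketch shows how reset and rollover rules change the bill for identical usage. The plan terms, the CreditPlan fields and the settle function are hypothetical; real contracts define allowances, expiry and overage in their own ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditPlan:
    """Hypothetical plan terms, for illustration only."""
    monthly_allowance: int   # credits included each month
    overage_price: float     # USD per credit beyond what is available
    rollover: bool = False   # whether unused credits carry into next month


def settle(plan: CreditPlan, monthly_usage: list[int]) -> tuple[float, int]:
    """Return (overage cost in USD, stranded credits) over the term."""
    carried = 0
    overage_cost = 0.0
    stranded = 0
    for used in monthly_usage:
        available = plan.monthly_allowance + carried
        carried = 0
        if used > available:
            overage_cost += (used - available) * plan.overage_price
        elif plan.rollover:
            carried = available - used
        else:
            stranded += available - used   # unused credits expire at reset
    # Anything still carried when the term ends is lost as well.
    return overage_cost, stranded + carried


usage = [4_000, 16_000, 10_000]  # total equals three months of allowance

reset_plan = CreditPlan(monthly_allowance=10_000, overage_price=0.02)
rollover_plan = CreditPlan(monthly_allowance=10_000, overage_price=0.02,
                           rollover=True)

print(settle(reset_plan, usage))
print(settle(rollover_plan, usage))
```

In this example total usage exactly matches the total allowance, yet the monthly-reset plan both strands credits in the quiet month and charges overage in the busy one, while the rollover plan does neither.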
Model execution time: compute minutes and seconds
Some AI capabilities bill on execution time or compute intensity. This is most common in multimodal workloads such as audio, video and embedding pipelines, where cost tracks with how hard the model worked, not how many words it produced.
API calls and workflow chaining
AI-enabled workflows are rarely a single transaction. A “simple” answer can involve:
- Retrieval
- Model call
- Validation
- Rewrite
- Scoring
- Output generation
Each step looks cheap in isolation. At enterprise scale, the chain becomes the cost. Cloud teams who lived through the early IaaS bills will find this déjà vu uncomfortable.
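As a rough illustration of how the chain becomes the cost, the sketch below sums the steps listed above and multiplies by volume. Every per-step cost, the user count and the usage rate are assumed figures chosen for the example.

```python
# Hypothetical per-step costs in USD for producing one answer.
STEP_COST = {
    "retrieval": 0.0004,
    "model_call": 0.0060,
    "validation": 0.0010,
    "rewrite": 0.0030,
    "scoring": 0.0008,
    "output_generation": 0.0040,
}


def cost_per_answer(step_costs: dict[str, float]) -> float:
    """One answer pays for every step in the chain."""
    return sum(step_costs.values())


def monthly_cost(step_costs: dict[str, float], users: int,
                 answers_per_user_per_day: int, working_days: int = 22) -> float:
    """Scale the per-answer cost by users, daily usage and working days."""
    answers = users * answers_per_user_per_day * working_days
    return cost_per_answer(step_costs) * answers


per_answer = cost_per_answer(STEP_COST)
per_month = monthly_cost(STEP_COST, users=5_000, answers_per_user_per_day=40)

print(f"per answer: ${per_answer:.4f}")
print(f"per month:  ${per_month:,.0f}")
```

With these assumptions a chain that costs about a cent and a half per answer adds up to roughly $66,880 a month across 5,000 users.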
AI actions: pricing at the UX layer
Vendors are increasingly pricing on UX-level units like actions, assists, conversations or generations. This abstracts tokens entirely and tries to align price with perceived value.
Why AI costs spiral out of control
Hidden model chains
Behind a single answer, systems may run multiple LLM calls: retrieval, rewriting, safety, formatting and evaluation. Users see one click; finance sees multiple metered events.
Unpredictable output size
Output variance drives cost. Extended or reasoning-heavy responses can increase token use and introduce spend volatility.
Context window inflation
Large context windows allow massive inputs. Even if only part is useful, the entire input is billed, which inflates and obscures consumption.
AI agents
Agent loops, retries and tool-calling amplify usage. As agents handle more autonomous tasks, spend becomes more volatile and shifts further away from seat-based models.
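One way to reason about that volatility is to compare expected and worst-case call counts for a single agent task. The sketch assumes each attempt fails independently at a fixed rate and is retried up to a limit; the step count, failure rate and function names are illustrative, and real agents rarely behave this neatly.

```python
def expected_attempts(failure_rate: float, max_attempts: int) -> float:
    """Expected attempts for one step.

    Attempt k+1 only happens if the first k attempts all failed, which has
    probability failure_rate ** k under the independence assumption.
    """
    return sum(failure_rate ** k for k in range(max_attempts))


def agent_task_calls(steps: int, failure_rate: float,
                     max_attempts: int) -> tuple[float, int]:
    """Return (expected calls, worst-case calls) for one agent task."""
    expected = steps * expected_attempts(failure_rate, max_attempts)
    worst = steps * max_attempts
    return expected, worst


expected, worst = agent_task_calls(steps=8, failure_rate=0.2, max_attempts=3)
print(f"expected calls: {expected:.2f}, worst case: {worst}")
```

The gap between the two numbers is the forecasting problem: an eight-step task averages about ten calls here but can legitimately bill for 24.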
Credit abstraction hides true drivers
Credits make it difficult to answer essential questions like:
- What did a workflow cost?
- Which team consumed the budget?
- Which model is responsible for the spike?
The result is spend that grows faster than governance.
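If a vendor does expose usage records with enough detail, answering those questions is a simple aggregation. The sketch below assumes a hypothetical export with team, workflow, model and credits fields; many credit reports do not provide this granularity, which is exactly the problem.

```python
from collections import defaultdict

# Hypothetical usage export; field names and values are assumptions.
usage_log = [
    {"team": "sales", "workflow": "email_draft", "model": "small", "credits": 120},
    {"team": "sales", "workflow": "account_research", "model": "large", "credits": 900},
    {"team": "support", "workflow": "ticket_summary", "model": "small", "credits": 300},
    {"team": "support", "workflow": "ticket_summary", "model": "large", "credits": 450},
    {"team": "ops", "workflow": "account_research", "model": "large", "credits": 700},
]


def credits_by(records: list[dict], key: str) -> dict[str, int]:
    """Total credits grouped by one dimension, largest consumer first."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record[key]] += record["credits"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


for dimension in ("workflow", "team", "model"):
    print(dimension, credits_by(usage_log, dimension))
```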
How SaaS vendors monetize AI features
Seat plus AI credit pool
The dominant transitional model. A familiar subscription anchors the deal, and a credit or outcome layer rides on top to capture usage.
Workflow-based pricing
Charges align with the work performed. This pattern is showing up fast as agents start replacing manual effort across sales, service and operations.
Premium AI add-ons
Standalone AI SKUs typically bundle credits and define overage terms in the fine print. The usage risk shifts to the buyer, and negotiation gets more complicated.
API billing
The most transparent model and also the most volatile at scale. The dynamics mirror early cloud consumption: a great option for builders, a tough one for forecasters.
Why enterprises lose control of AI spend
A few patterns show up in nearly every customer conversation:
- Fragmented visibility across vendors
- No standard metrics across tokens, credits and actions
- Users lack cost-per-click awareness
- Chargeback models break when usage is nonlinear
- Procurement lacks precise definitions for terms like action, conversation or credit
What leaders must do now
For executives
Treat AI consumption as a strategic financial risk, not an experimental line item. Push for usage definitions, reporting clarity and audit rights inside every AI contract you sign.
For FinOps and procurement
Model best- and worst-case consumption scenarios before signing. Put caps, alerts and guardrails in place from day one, and monitor continuously. AI spend that surfaces at renewal is AI spend you’ve already lost control of.
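A minimal sketch of both ideas, scenario modeling and a cap with alerts, might look like this. The user counts, per-action costs, thresholds and function names are assumptions for illustration; a production guardrail would hook into the vendor's actual usage reporting.

```python
def scenario_cost(users: int, actions_per_user: int,
                  cost_per_action: float) -> float:
    """Monthly spend for one consumption scenario."""
    return users * actions_per_user * cost_per_action


def budget_status(spend_to_date: float, monthly_cap: float,
                  alert_thresholds_pct: tuple[int, ...] = (50, 80)) -> str:
    """Classify current spend against a cap and its alert thresholds."""
    if spend_to_date >= monthly_cap:
        return "cap_reached"
    pct_used = 100 * spend_to_date / monthly_cap
    crossed = [t for t in alert_thresholds_pct if pct_used >= t]
    return f"alert_{max(crossed)}" if crossed else "ok"


# Illustrative scenarios: light adoption vs. heavy, agent-driven usage.
best_case = scenario_cost(users=500, actions_per_user=200, cost_per_action=0.01)
worst_case = scenario_cost(users=500, actions_per_user=1_200, cost_per_action=0.04)
print(f"best case:  ${best_case:,.0f} per month")
print(f"worst case: ${worst_case:,.0f} per month")

for spend in (4_000, 8_500, 10_000):
    print(spend, budget_status(spend, monthly_cap=10_000))
```

The spread between the two scenarios is what the contract has to absorb, so it is a reasonable basis for deciding where to set the cap and how early the first alert should fire.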
For practitioners
Educate teams on the cost mechanics of prompt size, context windows, model choice and agent loops. The behavior change starts with awareness.
Conclusion
AI has reshaped SaaS economics from predictable seat-based billing to volatile consumption patterns driven by tokens, credits and agentic workflows. This is a structural shift, not a temporary phase. Organizations that succeed will treat AI consumption as something that can be measured, forecasted, negotiated and optimized.