The $6,000 OpenAI bill that appears two weeks after a product launch is becoming a startup cliché — and I say that having talked to multiple founders who went through exactly that. The feature worked great, users kept using it, and nobody had done the math on what "users keep using it" actually costs per month. The AI bill doesn't care about your runway.
The frustrating thing is that the math is not hard. It's just math most teams skip because they're focused on shipping. This guide is about doing that math before you commit to an architecture, a model, or a pricing plan — not after the invoice arrives.
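Before you can budget, you need to know what a typical request looks like in token terms. For each AI-powered feature you're planning, estimate the token count of each part of a request: the system prompt, any context you attach, the user's input, and the model's output.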
Total input = system prompt + context + user input. This number, multiplied by your expected request volume, drives most of your cost.
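A minimal sketch of that per-request profile in Python. The class name, field names, and the 400 / 3,000 / 100 split are illustrative assumptions; only the rule that total input is system prompt plus context plus user input comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestProfile:
    """Estimated token counts for one typical request (all values are estimates)."""
    system_prompt_tokens: int
    context_tokens: int
    user_input_tokens: int
    output_tokens: int

    @property
    def total_input_tokens(self) -> int:
        # Total input = system prompt + context + user input
        return (
            self.system_prompt_tokens
            + self.context_tokens
            + self.user_input_tokens
        )


def monthly_input_tokens(profile: RequestProfile, monthly_requests: int) -> int:
    """Input tokens consumed per month at a given request volume."""
    return profile.total_input_tokens * monthly_requests


# Hypothetical numbers for a summarization-style feature.
profile = RequestProfile(
    system_prompt_tokens=400,
    context_tokens=3_000,
    user_input_tokens=100,
    output_tokens=300,
)

print("Input tokens per request:", profile.total_input_tokens)
print("Input tokens per month:", monthly_input_tokens(profile, 300_000))
```

Keeping the components separate makes it easier to see which one to trim later: a long system prompt is paid for on every request, while context usually varies with the document or conversation.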
Don't model one scenario. Model three: conservative, expected, and stressed. AI costs scale linearly with usage, and the difference between your expected and stressed scenario matters more than the expected number itself.
| Scenario | Monthly active users | AI requests / user / day | Monthly requests (30-day month) |
|---|---|---|---|
| Conservative | 500 | 3 | 45,000 |
| Expected | 2,000 | 5 | 300,000 |
| Stressed (viral / press) | 10,000 | 8 | 2,400,000 |
The stressed scenario is what matters most for planning. If your AI costs are fine at expected but catastrophic at stressed, you don't have a budget — you have a time bomb.
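The three scenarios can be reproduced with a few lines of Python. This is a sketch: the `Scenario` class and the 30-day month are assumptions chosen to match the arithmetic in the table, not a prescribed tool.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # assumption: matches the arithmetic in the table above


@dataclass(frozen=True)
class Scenario:
    name: str
    monthly_active_users: int
    requests_per_user_per_day: int

    @property
    def monthly_requests(self) -> int:
        return (
            self.monthly_active_users
            * self.requests_per_user_per_day
            * DAYS_PER_MONTH
        )


conservative = Scenario("Conservative", 500, 3)
expected = Scenario("Expected", 2_000, 5)
stressed = Scenario("Stressed", 10_000, 8)

for scenario in (conservative, expected, stressed):
    print(f"{scenario.name}: {scenario.monthly_requests:,} requests/month")

# How much larger the stressed month is than the expected one.
stress_multiple = stressed.monthly_requests / expected.monthly_requests
print(f"Stressed is {stress_multiple:.0f}x expected")
```

Because cost scales linearly with requests, the same multiple applies to the bill: with these inputs, a stressed month costs eight times an expected one.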
Once you have your per-request token profile and your usage scenarios, calculate cost per user per month. This is the number that connects AI costs to your business model.
Example: A document summarization feature on GPT-4o.
| Component | Tokens | Rate | Cost per request |
|---|---|---|---|
| Input (doc + prompt) | 3,500 | $2.50/M | $0.00875 |
| Output (summary) | 300 | $10.00/M | $0.00300 |
| Total per request | — | — | $0.01175 |
At 5 requests/user/day over a 30-day month, each user makes 150 requests. At $0.01175 per request, that comes to about $1.76 in AI cost per user per month.
Now ask: what are you charging? If your plan is $10/month per user, AI is 17.6% of revenue — workable but tight. If you're on a freemium model with no immediate monetization, $1.76/user/month is expensive at scale. That's $17,600/month at 10,000 users.
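The same calculation as a small Python sketch, using the GPT-4o rates and token counts from the example above. The function names are hypothetical, and rates change, so treat the constants as inputs to check against current pricing rather than fixed values.

```python
def cost_per_request(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_m: float,
    output_rate_per_m: float,
) -> float:
    """Dollar cost of one request, given rates in dollars per million tokens."""
    return (
        input_tokens * input_rate_per_m + output_tokens * output_rate_per_m
    ) / 1_000_000


def cost_per_user_per_month(
    request_cost: float,
    requests_per_user_per_day: float,
    days_per_month: int = 30,
) -> float:
    """Dollar cost of one active user for one month."""
    return request_cost * requests_per_user_per_day * days_per_month


# Values from the document summarization example.
request_cost = cost_per_request(3_500, 300, 2.50, 10.00)
per_user = cost_per_user_per_month(request_cost, 5)

plan_price = 10.00  # dollars per user per month
share_of_revenue = per_user / plan_price
bill_at_10k_users = per_user * 10_000

print(f"Cost per request: ${request_cost:.5f}")
print(f"Cost per user per month: ${per_user:.2f}")
print(f"Share of a ${plan_price:.0f} plan: {share_of_revenue:.1%}")
print(f"Monthly bill at 10,000 users: ${bill_at_10k_users:,.0f}")
```

The unrounded per-user figure is $1.7625, so the 10,000-user bill prints slightly above the $17,600 quoted in the text, which uses the rounded $1.76.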
Once you have your overall model, set a token budget for each AI feature. This becomes an engineering constraint, not just a financial one. Developers should know the target token envelope for each feature the same way they know the target latency.
A simple budget table for a product with multiple AI features:
| Feature | Model | Input budget | Output budget | Cost/request |
|---|---|---|---|---|
| Search answer | GPT-4o mini | 800 tokens | 150 tokens | $0.00021 |
| Doc summary | GPT-4o | 4,000 tokens | 400 tokens | $0.01400 |
| Chat assistant | Claude 3.5 Haiku | 1,500 tokens | 300 tokens | $0.00240 |
| Code review | GPT-4o | 3,000 tokens | 800 tokens | $0.01550 |
With budgets in place, engineers can make informed decisions: "We could add this context to the prompt, but it pushes us over budget — is the quality improvement worth it?"
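One way to make the budget an engineering constraint is to check measured token counts against it in tests or in request logging. The sketch below uses the budgets from the table; the feature keys, the `TokenBudget` class, and `check_budget` are hypothetical names, not part of any provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    input_tokens: int
    output_tokens: int


# Budgets copied from the table above; the keys are illustrative.
BUDGETS = {
    "search_answer": TokenBudget(800, 150),
    "doc_summary": TokenBudget(4_000, 400),
    "chat_assistant": TokenBudget(1_500, 300),
    "code_review": TokenBudget(3_000, 800),
}


def check_budget(feature: str, input_tokens: int, output_tokens: int) -> list[str]:
    """Return a list of budget violations for one request (empty if within budget)."""
    budget = BUDGETS[feature]
    violations = []
    if input_tokens > budget.input_tokens:
        over = input_tokens - budget.input_tokens
        violations.append(f"input over budget by {over} tokens")
    if output_tokens > budget.output_tokens:
        over = output_tokens - budget.output_tokens
        violations.append(f"output over budget by {over} tokens")
    return violations


# A doc summary request whose prompt grew after extra context was added.
print(check_budget("doc_summary", input_tokens=4_600, output_tokens=380))
# A search answer that sits exactly on its budget.
print(check_budget("search_answer", input_tokens=800, output_tokens=150))
```

Returning the violations instead of raising keeps the check usable in both places: a unit test can assert the list is empty, while production code can log it without failing the request.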
Estimates drift. Production behavior is almost always different from what you modeled — usually worse. A few things to track from day one:
Set a Slack alert when daily AI spend exceeds a threshold. Most providers offer spend notifications — use them. A 10× spike in AI spend should wake someone up, not show up in the end-of-month invoice review.
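A rough sketch of the alerting logic, assuming you already collect daily spend figures from your provider's usage data. The function name, the thresholds, and the message wording are all illustrative; delivery to Slack (for example through an incoming webhook) is deliberately left out.

```python
from statistics import mean


def spend_alerts(
    recent_daily_spend: list[float],
    today_spend: float,
    daily_threshold: float,
    spike_multiple: float = 10.0,
) -> list[str]:
    """Return alert messages for today's spend (empty list if nothing is wrong)."""
    alerts = []

    # Absolute check: today's spend crossed the fixed daily threshold.
    if today_spend > daily_threshold:
        alerts.append(
            f"Daily AI spend ${today_spend:.2f} exceeds threshold ${daily_threshold:.2f}"
        )

    # Relative check: today's spend is a large multiple of the recent average.
    if recent_daily_spend:
        baseline = mean(recent_daily_spend)
        if baseline > 0 and today_spend >= spike_multiple * baseline:
            alerts.append(
                f"Daily AI spend is {today_spend / baseline:.1f}x the recent average"
            )

    return alerts


# Hypothetical figures: a normal day, then a day with a runaway feature.
history = [20.0, 22.0, 18.0]
print(spend_alerts(history, today_spend=25.0, daily_threshold=50.0))
print(spend_alerts(history, today_spend=210.0, daily_threshold=50.0))
```

The two checks catch different failures: the fixed threshold protects the budget, while the multiple-of-baseline check flags a sudden change in behavior even when the absolute number is still small.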
Before any AI feature goes to production, every startup founder should be able to answer this: what does my AI bill look like at 10,000 active users?
If the answer is "I'm not sure," do the math before launch. The features that get built on vague cost assumptions are the ones that cause CFOs to ask hard questions about AI spend six months later.