Last updated: May 2026
Calculate the gross margin of your AI-powered product. See your cost per query, margin per user, recommended price points, and the exact volume needed to hit your target margin.
LLM Cost Inputs
Additional Costs & Revenue
Margin Results
Pricing Sensitivity Table
How margin changes at different price points for your current usage
| Charge/Query | Monthly Revenue | Gross Margin | Net Profit/Loss | Verdict |
|---|---|---|---|---|
LLM Cost Per Query = (Input Tokens × Input Price/MTok + Output Tokens × Output Price/MTok) ÷ 1,000,000
Total Variable Cost Per Query = LLM cost + other per-query costs (infra, embeddings, DB reads, etc.)
Gross Margin = (Revenue per query − Variable cost per query) ÷ Revenue per query × 100. This is the margin before fixed costs — the pure unit economics of your AI product.
Net Monthly Profit = (Revenue per query − Variable cost per query) × monthly queries − fixed monthly costs.
Break-Even Volume = Fixed monthly costs ÷ Gross profit per query. The minimum query volume needed to cover your infrastructure and operations before turning profitable.
Target Price for Margin = Variable cost per query ÷ (1 − target margin %). For a 70% margin target: price = cost ÷ 0.30.
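The formulas above translate directly into code. A minimal sketch (function and variable names are my own; the sample prices are the Sonnet 4 rates used elsewhere on this page, so verify current rates before relying on them):

```python
def llm_cost_per_query(in_tok, out_tok, in_price, out_price):
    """API cost for one query; prices are $ per million tokens (MTok)."""
    return (in_tok * in_price + out_tok * out_price) / 1_000_000

def gross_margin_pct(revenue, variable_cost):
    """Gross margin before fixed costs, as a percentage of revenue."""
    return (revenue - variable_cost) / revenue * 100

def break_even_volume(fixed_monthly_costs, gross_profit_per_query):
    """Minimum monthly queries needed to cover fixed costs."""
    return fixed_monthly_costs / gross_profit_per_query

def target_price(variable_cost, target_margin_pct):
    """Price per query needed to hit a target gross margin."""
    return variable_cost / (1 - target_margin_pct / 100)

cost = llm_cost_per_query(800, 400, 3.00, 15.00)          # $0.0084
print(f"cost/query:       ${cost:.4f}")
print(f"70%-margin price: ${target_price(cost, 70):.4f}")  # $0.0084 / 0.30
```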
⚠️ LLM pricing changes frequently. Verify current rates with your provider before making pricing decisions. Gross margin shown is on query revenue only and excludes R&D, sales, and marketing costs.
Building AI products on top of LLM APIs means your COGS (Cost of Goods Sold) includes API costs. This calculator computes gross margin, break-even pricing, and pricing sensitivity for AI-powered products.
Worked example: a SaaS product charges $49/month, and users average 500 queries/month. Each query uses 800 input tokens + 400 output tokens on Claude Sonnet 4 ($3/$15 per MTok). API cost per query = (800 × $0.000003) + (400 × $0.000015) = $0.0024 + $0.006 = $0.0084. Cost per user per month = $0.0084 × 500 = $4.20. Revenue per user: $49. Gross margin: ($49 − $4.20) ÷ $49 = 91.4%.
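The worked example's arithmetic can be checked in a few lines (all figures come from the example above; the $3/$15 Sonnet 4 rates should be verified against current provider pricing):

```python
# Reproduce the worked example: 800 in / 400 out tokens at $3/$15 per MTok.
in_cost   = 800 * 3.00 / 1_000_000     # $0.0024
out_cost  = 400 * 15.00 / 1_000_000    # $0.0060
per_query = in_cost + out_cost         # $0.0084
per_user  = per_query * 500            # $4.20 per user per month
margin    = (49 - per_user) / 49 * 100 # 91.4% gross margin
print(f"${per_query:.4f}/query, ${per_user:.2f}/user, {margin:.1f}% margin")
```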
Traditional SaaS targets 70-85% gross margin. AI SaaS with LLM inference costs typically achieves 60-80% depending on query volume, model choice, and pricing. The key risk is usage-based cost variability — a power user consuming 10x average tokens can erode margin significantly. Best practices: (1) Set usage limits or tiers. (2) Use cheaper models for simpler tasks. (3) Build prompt caching for repeated system prompts. (4) Price per seat + usage overages rather than flat per-seat to protect margins at high usage.
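The power-user risk is easy to quantify. A sketch reusing the worked example's numbers ($49/month flat price, $0.0084 variable cost per query, both assumptions carried over from above):

```python
# Margin erosion from a 10x power user at a flat monthly price.
price, cost_per_query = 49.0, 0.0084
for queries in (500, 5_000):             # average user vs. 10x power user
    cost = cost_per_query * queries
    margin = (price - cost) / price * 100
    print(f"{queries:>5} queries/mo: ${cost:5.2f} cost, {margin:5.1f}% margin")
```

At 10x usage the same user drops from a ~91% margin to roughly 14%, which is why usage limits or overage pricing matter.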
The difference between frontier and economy models is dramatic. Claude Haiku costs $0.80/$4 per MTok; Claude Sonnet 4 costs $3/$15, roughly 4x the price. For a product with $5 monthly API cost per user on Sonnet, switching routine queries to Haiku reduces that to ~$1.25/user, adding $3.75/month in gross profit per user. At 1,000 users, that is $3,750/month in additional margin, a compelling reason to implement model tiering based on task complexity.
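The tiering arithmetic can be sketched as follows. Note the clean 4x ratio is an approximation; the exact ratio between the listed Haiku and Sonnet prices is 3.75x on both input and output:

```python
# Gross-profit gain from routing routine queries to a cheaper model.
# Uses the ~4x price ratio quoted above (exact: 3.00/0.80 = 15/4 = 3.75x).
sonnet_cost   = 5.00                        # $/user/month, all-Sonnet baseline
haiku_cost    = sonnet_cost / 4             # ~$1.25 after routing to Haiku
gain_per_user = sonnet_cost - haiku_cost    # $3.75/user/month
users = 1_000
print(f"extra gross profit: ${gain_per_user * users:,.0f}/month")
```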
Prompt caching (available on both the Anthropic and OpenAI APIs; discount rates differ by provider) stores frequently repeated content so it is not re-billed at full price. On Claude, if your application sends a 2,000-token system prompt with every request, cache hits bill those tokens at 10% of the full input price. For Claude Sonnet 4: uncached system prompt = $0.006 per request; cached: $0.0006. At 10,000 daily requests, that saves $54/day ($1,620/month) on system prompt costs alone.
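A sketch of the caching savings. It assumes cache reads bill at 10% of the $3/MTok Sonnet 4 input price, ignores the cache-write surcharge, and treats every request after the first as a cache hit, so it is an upper bound:

```python
# Savings from caching a 2,000-token system prompt on Sonnet 4 ($3/MTok input).
tokens, input_price = 2_000, 3.00
uncached = tokens * input_price / 1_000_000      # $0.006 per request
cached   = uncached * 0.10                       # $0.0006 on a cache hit
daily_requests = 10_000
daily_savings  = (uncached - cached) * daily_requests
print(f"${daily_savings:.0f}/day, ${daily_savings * 30:,.0f}/month")
```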
Three models dominate: (1) Per-seat subscription — predictable revenue, usage variability risk on costs; works well when usage is consistent. (2) Usage-based — charges scale with API costs, no margin compression; but customers dislike unpredictable bills. (3) Hybrid — base seat fee + usage overages above a generous included limit. The hybrid model is increasingly common: a flat fee covers 80% of users; heavy users pay overages that protect your margin. Analyze your user usage distribution before committing to a pricing model.
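The three models can be compared side by side. This sketch uses assumed numbers (a $49 seat, a $0.02/query usage rate, and a hybrid plan with 1,000 included queries then $0.01/query overage; none of these figures come from the calculator):

```python
# Compare per-seat, usage-based, and hybrid revenue for one user.
def revenue(model, queries):
    if model == "per_seat":
        return 49.0
    if model == "usage":
        return 0.02 * queries
    if model == "hybrid":                      # seat fee + overage above 1,000
        return 49.0 + max(0, queries - 1_000) * 0.01

cost_per_query = 0.0084                         # worked-example variable cost
for q in (500, 5_000):                          # typical user vs. power user
    for m in ("per_seat", "usage", "hybrid"):
        r = revenue(m, q)
        margin = (r - cost_per_query * q) / r * 100
        print(f"{m:>8} @ {q:>5} queries: ${r:6.2f} revenue, {margin:5.1f}% margin")
```

The output shows the hybrid plan's point: at 5,000 queries the flat seat's margin collapses while the overage fee keeps the hybrid plan's margin healthy.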