LLM Margin Calculator

Last updated: May 2026

Calculate the gross margin of your AI-powered product. See your cost per query, margin per user, recommended price points, and the exact volume needed to hit your target margin.

LLM Cost Inputs


Additional Costs & Revenue


Margin Results

LLM Cost / Query (API cost only)
Total Cost / Query (all-in variable cost)
Gross Margin (on query revenue)
Net Monthly Profit (after fixed costs)
Monthly Revenue
Break-Even Volume (queries/mo to cover fixed costs)

Gross margin gauge: 0–100% scale with bands at 25% (thin), 50% (ok), and 75% (healthy)

Cost vs Revenue per query

Pricing Sensitivity Table

How margin changes at different price points for your current usage

Charge/Query | Monthly Revenue | Gross Margin | Net Profit/Loss | Verdict

LLM Cost Per Query = (Input Tokens × Input Price/MTok + Output Tokens × Output Price/MTok) ÷ 1,000,000

Total Variable Cost Per Query = LLM cost + other per-query costs (infra, embeddings, DB reads, etc.)
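The two formulas above can be sketched as a pair of Python helpers (hypothetical names; rates are in dollars per million tokens):

```python
def llm_cost_per_query(input_tokens, output_tokens,
                       input_price_per_mtok, output_price_per_mtok):
    """API cost for one query, given per-million-token (MTok) rates."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000

def total_variable_cost_per_query(llm_cost, other_per_query_costs=0.0):
    """All-in variable cost: LLM cost plus infra, embeddings, DB reads, etc."""
    return llm_cost + other_per_query_costs

# 800 input + 400 output tokens at $3/$15 per MTok -> $0.0084
cost = llm_cost_per_query(800, 400, 3.0, 15.0)
```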

Gross Margin = (Revenue per query − Variable cost per query) ÷ Revenue per query × 100. This is the margin before fixed costs — the pure unit economics of your AI product.

Net Monthly Profit = (Revenue per query − Variable cost per query) × monthly queries − fixed monthly costs.

Break-Even Volume = Fixed monthly costs ÷ Gross profit per query. The minimum query volume needed to cover your infrastructure and operations before turning profitable.

Target Price for Margin = Variable cost per query ÷ (1 − target margin %). For a 70% margin target: price = cost ÷ 0.30.
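The four result metrics follow directly from those definitions. A minimal sketch (function names are illustrative; `target_margin_pct` is expressed as a percentage, matching the formula above):

```python
def gross_margin_pct(revenue_per_query, variable_cost_per_query):
    """Margin before fixed costs, as a percentage of query revenue."""
    return (revenue_per_query - variable_cost_per_query) / revenue_per_query * 100

def net_monthly_profit(revenue_per_query, variable_cost_per_query,
                       monthly_queries, fixed_monthly_costs):
    """Gross profit on all queries minus fixed monthly costs."""
    return ((revenue_per_query - variable_cost_per_query) * monthly_queries
            - fixed_monthly_costs)

def break_even_volume(fixed_monthly_costs, revenue_per_query,
                      variable_cost_per_query):
    """Queries/month needed to cover fixed costs."""
    return fixed_monthly_costs / (revenue_per_query - variable_cost_per_query)

def target_price(variable_cost_per_query, target_margin_pct):
    """Price per query that yields the target gross margin."""
    return variable_cost_per_query / (1 - target_margin_pct / 100)

# A 70% margin target on a $0.009 all-in cost implies a $0.03 price
price = target_price(0.009, 70)
```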

⚠️ LLM pricing changes frequently. Verify current rates with your provider before making pricing decisions. Gross margin shown is on query revenue only and excludes R&D, sales, and marketing costs.

How the LLM Margin Calculator Works

Building AI products on top of LLM APIs means your COGS (Cost of Goods Sold) includes API costs. This calculator computes gross margin, break-even pricing, and pricing sensitivity for AI-powered products.

Gross Margin % = (Revenue per query − API cost per query) ÷ Revenue per query × 100
Break-even price = API cost per query ÷ (1 − target margin %)
Monthly API cost = (Input tokens × input rate + Output tokens × output rate) × monthly queries ÷ 1,000,000

Worked example: SaaS product charging $49/month with users averaging 500 queries/month. Each query: 800 input tokens + 400 output tokens using Claude Sonnet 4 ($3/$15 per M). API cost per query = (800 x $0.000003) + (400 x $0.000015) = $0.0024 + $0.006 = $0.0084. Cost per user/month = $0.0084 x 500 = $4.20. Revenue per user: $49. Gross margin: (49 - 4.20) / 49 = 91.4%.
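The worked example's arithmetic, reproduced step by step in Python:

```python
# Claude Sonnet 4 rates from the example: $3/$15 per MTok
in_cost = 800 * 3.0 / 1_000_000     # $0.0024 input cost per query
out_cost = 400 * 15.0 / 1_000_000   # $0.0060 output cost per query
per_query = in_cost + out_cost      # $0.0084 API cost per query
per_user_month = per_query * 500    # $4.20 at 500 queries/month
margin = (49 - per_user_month) / 49 * 100  # ~91.4% gross margin
```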

Frequently Asked Questions

What gross margin should an AI SaaS product target?

Traditional SaaS targets 70-85% gross margin. AI SaaS with LLM inference costs typically achieves 60-80% depending on query volume, model choice, and pricing. The key risk is usage-based cost variability — a power user consuming 10x average tokens can erode margin significantly. Best practices: (1) Set usage limits or tiers. (2) Use cheaper models for simpler tasks. (3) Build prompt caching for repeated system prompts. (4) Price per seat + usage overages rather than flat per-seat to protect margins at high usage.

How does model choice affect my product's unit economics?

The difference between frontier and economy models is dramatic. Claude Haiku costs $0.80/$4 per M tokens; Claude Sonnet 4 costs $3/$15 — roughly 3.75x more expensive on both input and output. For a product with $5 monthly API cost per user on Sonnet, routing routine queries to Haiku would cut that to about $1.33/user, adding roughly $3.67/month in gross profit per user. At 1,000 users, that is about $3,670/month in additional margin — a compelling reason to implement model tiering based on task complexity.
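Model tiering can be sketched as a simple routing rule. This is an illustrative example, not a production router: the "routine" flag on each query is assumed to come from your own task classifier, and the rates are those quoted above.

```python
RATES = {  # $ per million tokens (input, output), as quoted above
    "haiku": (0.80, 4.0),
    "sonnet": (3.0, 15.0),
}

def query_cost(model, input_tokens, output_tokens):
    """API cost of one query on the given model."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

def routed_cost(queries):
    """Total cost for a batch of (is_routine, input_tokens, output_tokens),
    sending routine queries to the cheaper model."""
    return sum(
        query_cost("haiku" if routine else "sonnet", i, o)
        for routine, i, o in queries
    )
```

For an 800-in/400-out query, Sonnet costs $0.0084 and Haiku $0.00224 — the 3.75x gap that makes tiering worthwhile.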

How do I account for prompt caching in my margin calculation?

Prompt caching (available on Claude and GPT-4o) stores frequently repeated content so it is not re-billed at full price. If your application has a 2,000-token system prompt sent with every request, enabling caching reduces those tokens from full price to 10% of full price on cache hits. For Claude Sonnet 4: uncached system prompt = $0.006 per request. Cached: $0.0006. At 10,000 daily requests, that saves $54/day ($1,620/month) on system prompt costs alone.
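The caching arithmetic above can be checked with a small model of system-prompt cost. This assumes, as the paragraph does, that cached reads bill at 10% of the full input rate; `cache_hit_rate` is a hypothetical parameter for partial hit rates.

```python
SONNET_INPUT = 3.0  # $/MTok input rate; cache hits assumed billed at 10%

def system_prompt_cost(tokens, daily_requests, cache_hit_rate=0.0):
    """Daily cost of re-sending a system prompt, with optional caching."""
    full = tokens * SONNET_INPUT / 1_000_000
    cached = full * 0.10
    per_request = cache_hit_rate * cached + (1 - cache_hit_rate) * full
    return per_request * daily_requests

uncached = system_prompt_cost(2000, 10_000)       # $60/day
all_hits = system_prompt_cost(2000, 10_000, 1.0)  # $6/day
savings = uncached - all_hits                     # $54/day, ~$1,620/month
```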

What pricing model works best for AI products?

Three models dominate: (1) Per-seat subscription — predictable revenue, usage variability risk on costs; works well when usage is consistent. (2) Usage-based — charges scale with API costs, no margin compression; but customers dislike unpredictable bills. (3) Hybrid — base seat fee + usage overages above a generous included limit. The hybrid model is increasingly common: a flat fee covers 80% of users; heavy users pay overages that protect your margin. Analyze your user usage distribution before committing to a pricing model.
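A hybrid bill reduces to one formula: base fee plus per-query overage above the included quota. The numbers below ($29 base, 1,000 included queries, $0.02 overage) are purely illustrative:

```python
def hybrid_bill(queries_used, base_fee=29.0, included=1000,
                overage_per_query=0.02):
    """Monthly bill: flat seat fee plus overage beyond the included quota."""
    extra = max(0, queries_used - included)
    return base_fee + extra * overage_per_query

light = hybrid_bill(800)    # under quota: pays only the base fee
heavy = hybrid_bill(3000)   # 2,000 overage queries on top of the base fee
```

Light users get a predictable flat bill, while heavy users' overage charges track your API costs — which is exactly the margin protection the hybrid model is meant to provide.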