Cost Realism
Evaluates whether the total cost of ownership (build + run + govern) is sustainable beyond the pilot phase.
AI's cost structure differs from traditional software: expenses rise with every prompt, every additional token of context, and every step up in model capability, even as unit prices fall. Many promising pilots fail to scale because runtime costs weren't modeled realistically. Organizations that actively manage AI spend report ~40% year-over-year growth in usage as they scale. Without clear TCO projections and controls to prevent runaway spending, a successful pilot can become financially unsustainable in production.
What Good Looks Like
✓ Total Cost of Ownership (TCO) broken into clear line items: build, run (tokens/inference/storage), and govern (monitoring/updates)
✓ Usage-based forecasting that models costs against expected volume with best/base/worst-case scenarios (see the sketch after this list)
✓ Controls to prevent runaway spend: budget caps, per-user/request quotas, alerts when thresholds exceeded
✓ Cost/quality trade-off plan defining when to switch to cheaper models or optimize prompts
✓ Named cost owner with tools and process to monitor AI spending
✓ Confirmed pricing tiers/discounts in writing with fallback plans if vendor prices change
✓ Scale math showing how costs change at 10× volume
✓ Sustainable post-pilot funding path (operating budget, payer, or pricing model identified)
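To make the usage-based forecasting and 10× scale-math items concrete, here is a minimal sketch of a scenario-based cost model in Python. Every rate and volume in it is an assumed placeholder for illustration, not a figure from this guide or any vendor's price list.

```python
# Hypothetical usage-based cost forecast with best/base/worst scenarios.
# All prices and volumes below are illustrative placeholders, not real vendor rates.

SCENARIOS = {
    "best": {"conversations_per_month": 5_000, "tokens_per_conversation": 2_000},
    "base": {"conversations_per_month": 10_000, "tokens_per_conversation": 3_000},
    "worst": {"conversations_per_month": 25_000, "tokens_per_conversation": 5_000},
}

PRICE_PER_1K_TOKENS = 0.01   # assumed blended input/output rate (USD)
FIXED_MONTHLY = 2_500        # assumed monitoring, storage, and staff time (USD)

def monthly_cost(conversations: int, tokens_per_conversation: int) -> float:
    """Total monthly cost: variable token spend plus fixed run/govern costs."""
    variable = conversations * tokens_per_conversation / 1_000 * PRICE_PER_1K_TOKENS
    # Fixed costs are held flat here for simplicity; in practice monitoring
    # and staff time also grow with volume.
    return variable + FIXED_MONTHLY

for name, s in SCENARIOS.items():
    now = monthly_cost(s["conversations_per_month"], s["tokens_per_conversation"])
    at_10x = monthly_cost(10 * s["conversations_per_month"], s["tokens_per_conversation"])
    print(f"{name:>5}: ${now:,.0f}/month now, ${at_10x:,.0f}/month at 10x volume")
```

Even a rough model like this surfaces the scale question the checklist asks for: whether the cost curve at 10× volume still fits the post-pilot funding path.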
What to Watch Out For
✗ Lump sum "AI costs" without line-item breakdown
✗ No usage-based forecasting (e.g., cost per conversation, per user, per query)
✗ Missing costs for monitoring, maintenance, or expertise needed to run the system
✗ No controls to prevent runaway spending (budget caps, alerts)
✗ Assuming current unit prices will stay stable as usage scales
✗ No post-pilot funding plan (what happens when grant money runs out?)
✗ Only accounting for API/inference costs, missing storage, egress, monitoring tools
Tests To Apply
□ Is Total Cost of Ownership (TCO) broken into: build, run (API/inference/storage), and govern (monitoring/updates)?
□ Do they model costs based on expected volume with best/base/worst-case scenarios?
□ Are there hard spending caps and alerts when thresholds are hit? (A minimal control sketch follows this list.)
□ Have they calculated how costs change if adoption doubles or triples?
□ Is there a named cost owner and process for tracking spend?
□ Do they have a sustainable funding plan beyond pilot phase (revenue, budget allocation, or payer identified)?
□ Have they defined when they'll switch to smaller/cheaper models to manage costs?
□ Are pricing agreements confirmed in writing with fallback if vendor changes prices?
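As one way to picture the "hard caps and alerts" test, here is a minimal spend-control sketch. The dollar thresholds and the alert/halt actions are assumptions; a real deployment would wire them into whatever billing data and escalation process the named cost owner actually uses.

```python
# Hypothetical spend-control check: alert at a soft threshold, halt at a hard cap.
# Thresholds and the downstream notify/pause actions are placeholders.

MONTHLY_SOFT_LIMIT = 4_000   # assumed alert threshold (USD)
MONTHLY_HARD_CAP = 6_000     # assumed hard cap (USD)

def check_spend(month_to_date_spend: float) -> str:
    """Classify current spend against the soft alert threshold and hard cap."""
    if month_to_date_spend >= MONTHLY_HARD_CAP:
        return "halt"    # stop serving new requests; page the named cost owner
    if month_to_date_spend >= MONTHLY_SOFT_LIMIT:
        return "alert"   # notify the cost owner; review usage before the cap is hit
    return "ok"

assert check_spend(3_000) == "ok"
assert check_spend(4_500) == "alert"
assert check_spend(6_500) == "halt"
```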
Key Questions to Ask
- What's your total monthly cost at current usage? At 10× usage?
- How do costs break down: API calls, data storage, staff time, monitoring tools?
- What controls prevent spending from spiraling if usage grows faster than expected?
- When pilot funding ends, where does ongoing operating budget come from?
- What happens if your AI vendor raises prices 3× or shuts down?
Apply the Cross-Cutting Lenses
After evaluating the core criteria above, apply these two additional lenses to assess equity outcomes and evidence quality.
Equity & Safety Check
When evaluating Cost Realism through the equity and safety lens, assess whether cost pressures could force corner-cutting that harms vulnerable users.
Gate Assessment:
🟢 CONTINUE: Safety and monitoring fully funded, cost controls don't compromise equity
🟡 ADJUST: Budget tight but safety protected, watching closely for squeeze points
🔴 STOP: No safety/monitoring budget, or cost pressures already forcing harmful trade-offs
Check for:
□ Are monitoring and safety costs explicitly budgeted (not the first thing cut when over budget)?
□ Could cost overruns lead to reducing support for languages, accessibility features, or underserved geographies?
□ Is there a named owner who can halt operations if costs exceed budget (preventing "sunk cost" pressure to continue)?
□ Are rollback triggers tied to cost thresholds (e.g., "if costs exceed X, we pause to reassess")?
□ Do they have contingency budget for incident response and user support (not just inference costs)?
□ Are there safeguards against choosing cheaper models that perform worse for certain subgroups?
Evidence & Uncertainty Check
When evaluating Cost Realism through the evidence and uncertainty lens, assess whether cost projections are backed by data and whether uncertainty ranges are realistic.
Quality Grade:
🅰️ A (Strong): Evidence-based cost model with scenarios, sensitivity analysis, independent validation, secure post-pilot funding
🅱️ B (Moderate): Reasonable cost estimates with some scenarios, plan to monitor and adjust
🅲 C (Weak): Single-point cost estimate with no uncertainty, no post-pilot plan—high financial risk
Check for:
□ Are costs modeled based on expected volume with best/base/worst-case scenarios (not just one number)?
□ Are uncertainty bands shown (e.g., "costs could range from $X to $Y depending on adoption")?
□ Have they calculated sensitivity to key drivers (if usage doubles, if model prices rise 50%, etc.)? (See the sensitivity sketch after this list.)
□ Is there evidence from similar deployments to validate cost assumptions (not just vendor quotes)?
□ Do they acknowledge what they DON'T know about future costs (e.g., model deprecation, API changes)?
□ Are there independent cost estimates (not just the vendor's projection)?
□ Is post-pilot funding path documented with evidence it's secure (not just "we'll find more grants")?
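To ground the sensitivity-to-key-drivers check, here is a minimal one-way sensitivity sketch. The baseline volume, token rate, and fixed costs are illustrative assumptions only; the point is to show how each driver shifts the total, not to suggest real figures.

```python
# Hypothetical sensitivity check: how the base-case monthly cost shifts if a
# single driver moves. Baseline figures are illustrative assumptions only.

BASE_MONTHLY_TOKENS = 30_000_000   # assumed base-case volume
BASE_PRICE_PER_1K = 0.01           # assumed blended token rate (USD)
FIXED_MONTHLY = 2_500              # assumed fixed run/govern costs (USD)

def cost(tokens: float, price_per_1k: float) -> float:
    """Monthly cost = variable token spend + fixed run/govern costs."""
    return tokens / 1_000 * price_per_1k + FIXED_MONTHLY

baseline = cost(BASE_MONTHLY_TOKENS, BASE_PRICE_PER_1K)
shocks = {
    "usage doubles": cost(2 * BASE_MONTHLY_TOKENS, BASE_PRICE_PER_1K),
    "model price +50%": cost(BASE_MONTHLY_TOKENS, 1.5 * BASE_PRICE_PER_1K),
    "both at once": cost(2 * BASE_MONTHLY_TOKENS, 1.5 * BASE_PRICE_PER_1K),
}

print(f"baseline: ${baseline:,.0f}/month")
for shock, value in shocks.items():
    print(f"{shock}: ${value:,.0f}/month ({value / baseline - 1:+.0%} vs baseline)")
```

A proposal that can produce this kind of table, with uncertainty bands and independently validated inputs, sits at the A end of the quality grade; a single-point estimate with no drivers tested sits at C.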
