Everyone knows “cloud scales.” How its costs scale is less well understood.
10x users doesn’t mean 10x costs. Sometimes it’s 3x (economies of scale). Sometimes it’s 15x (you hit a cliff). The shape of your cost curve determines whether growth is profitable or ruinous.
The Myth of Linear Scaling
The naive model:
Users: 1x 10x 100x
Costs: 1x 10x 100x
Cost/user: Same Same Same
This is almost never true. Real cost curves have:
- Economies of scale: Costs grow slower than usage
- Step functions: Costs jump at certain thresholds
- Diseconomies: Costs grow faster than usage
Understanding which regime you’re in changes everything.
Economies of Scale
Fixed Cost Amortization
Some costs don’t increase with scale:
Platform team: $1.5M/year (fixed)
Base infrastructure: $200K/year (fixed)
Licensing (site): $100K/year (fixed)
Total fixed: $1.8M/year
At 10K users: $180/user
At 100K users: $18/user
At 1M users: $1.80/user
Fixed costs spread across more users = lower cost per user.
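The arithmetic above is a fixed total divided by a growing user count. A minimal sketch using the figures from this example; the names `FIXED_COSTS` and `fixed_cost_per_user` are illustrative, not from any library:

```python
# Annual fixed costs in dollars, taken from the example above.
FIXED_COSTS = {
    "platform_team": 1_500_000,
    "base_infrastructure": 200_000,
    "site_licensing": 100_000,
}


def fixed_cost_per_user(users, fixed_costs=FIXED_COSTS):
    """Annual fixed cost carried by each user."""
    if users <= 0:
        raise ValueError("users must be positive")
    return sum(fixed_costs.values()) / users


for users in (10_000, 100_000, 1_000_000):
    print(f"{users:>9,} users: ${fixed_cost_per_user(users):,.2f}/user")
```

This only covers the fixed portion. Variable costs per user would be added on top and do not shrink the same way.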
Volume Discounts
Cloud providers reward scale:
AWS compute (example):
First 1M requests: $0.20 per 1K
Next 9M requests: $0.15 per 1K (-25%)
Over 10M requests: $0.10 per 1K (-50%)
1M requests: $200
10M requests: $1,550 (not $2,000)
100M requests: $10,550 (not $20,000)
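The totals above assume graduated pricing: each slice of usage is billed at its own tier's rate, rather than the whole volume at the lowest rate. A small sketch of that calculation, using the example tiers above (the `TIERS` structure and `tiered_cost` name are assumptions for illustration):

```python
# Example tiers from above: (requests in tier, price per 1K requests).
# None marks the final, unbounded tier.
TIERS = [
    (1_000_000, 0.20),
    (9_000_000, 0.15),
    (None, 0.10),
]


def tiered_cost(requests, tiers=TIERS):
    """Bill each slice of usage at the rate of the tier it falls in."""
    remaining = requests
    total = 0.0
    for size, price_per_1k in tiers:
        if remaining <= 0:
            break
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier / 1_000 * price_per_1k
        remaining -= in_tier
    return total


for requests in (1_000_000, 10_000_000, 100_000_000):
    print(f"{requests:>11,} requests: ${tiered_cost(requests):,.0f}")
```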
Committed use discounts amplify this:
On-demand: $1.00/hour
1-year commit: $0.60/hour (-40%)
3-year commit: $0.40/hour (-60%)
At scale, baseline usage is predictable enough to commit confidently and capture the deeper discounts.
Shared Services
One monitoring system serves all teams:
Monitoring cost: $100K/year
5 services: $20K per service
50 services: $2K per service
500 services: $200 per service
Shared services have massive economies of scale.
Efficiency Gains
Scale enables efficiency investments:
At small scale:
Manual deployments (cheap, but don't scale)
At medium scale:
Basic automation ($50K to build)
Saves $100K/year at current size
At large scale:
Advanced automation ($200K to build)
Saves $1M/year at current size
Investments that don’t make sense at small scale become obvious at large scale.
The Economy Curve
[Figure: cost per user (y-axis) against users (x-axis, 10K to 1M). The curve falls steeply at first, then flattens as users grow.]
This is the good scenario. Growth is self-funding.
Step Functions
Not all costs scale smoothly. Some jump at thresholds.
Database Tiers
Small database: $500/month (handles 1K QPS)
Medium database: $2,000/month (handles 5K QPS)
Large database: $10,000/month (handles 20K QPS)
Cluster: $50,000/month (handles 100K QPS)
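Cost per QPS falls as you fill a tier, then jumps when you outgrow it:
| QPS | Monthly cost | Cost per QPS | Note |
|---|---|---|---|
| 1K | $500 | $0.50 | Small tier at capacity |
| 2K | $2,000 | $1.00 | Step! |
| 5K | $2,000 | $0.40 | Medium tier at capacity |
| 6K | $10,000 | $1.67 | Step! |
| 20K | $10,000 | $0.50 | Large tier at capacity |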
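The step behaviour can be sketched as a lookup over the tier list above. This assumes each tier's rated QPS is a hard ceiling and that you always pick the cheapest tier that fits; real systems usually need headroom, so the jump tends to arrive earlier. `DB_TIERS`, `db_tier_for`, and `cost_per_qps` are illustrative names:

```python
# Tiers from the list above: (name, max QPS, monthly cost in dollars).
DB_TIERS = [
    ("small", 1_000, 500),
    ("medium", 5_000, 2_000),
    ("large", 20_000, 10_000),
    ("cluster", 100_000, 50_000),
]


def db_tier_for(qps):
    """Return (name, monthly cost) of the cheapest tier that covers the load."""
    for name, max_qps, monthly_cost in DB_TIERS:
        if qps <= max_qps:
            return name, monthly_cost
    raise ValueError(f"{qps} QPS exceeds the largest tier")


def cost_per_qps(qps):
    """Monthly dollars paid per unit of QPS actually served."""
    _, monthly_cost = db_tier_for(qps)
    return monthly_cost / qps


# One extra query per second past a ceiling moves the whole bill up a tier.
for qps in (1_000, 1_001, 5_000, 5_001, 20_000):
    name, cost = db_tier_for(qps)
    print(f"{qps:>6} QPS -> {name:<7} ${cost:>6,}/month  ${cost_per_qps(qps):.2f} per QPS")
```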
Team Size Jumps
1-5 engineers: Self-organizing, minimal overhead
6-10 engineers: Need team lead (+1 person)
11-20 engineers: Need manager, processes (+2 people)
21-50 engineers: Need multiple teams, directors (+5 people)
51+ engineers: Need org structure, VPs (+10 people)
Management overhead grows in steps, not linearly.
Architecture Rewrites
Monolith: Handles up to 10K concurrent users
Cost: What you have now
Distributed: Handles up to 100K concurrent users
Cost: 6-month rewrite + operational complexity
Global: Handles 1M+ concurrent users
Cost: Another 6-month project + more complexity
Architecture transitions are expensive step functions.
Infrastructure Tiers
Single region: $X (simple)
Multi-region: $3X (redundancy + networking)
Global: $10X (everywhere, all the time)
Each tier is a step change in cost and complexity.
The Step Function Curve
[Figure: total cost (y-axis) against users (x-axis). Cost rises in steps, with jumps at the database upgrade, team growth, and architecture rewrite.]
Steps create “cliffs” where small growth triggers large costs.
Diseconomies of Scale
Sometimes bigger means more expensive per unit.
Coordination Costs
2 people: 1 communication path
5 people: 10 communication paths
10 people: 45 communication paths
50 people: 1,225 communication paths
With n people there are n(n-1)/2 possible pairs, so communication overhead grows as O(n²). Meetings, syncs, documentation, and alignment all get more expensive.
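The path counts above are the number of distinct pairs in a group. A quick check, with `communication_paths` as an illustrative helper name:

```python
def communication_paths(people):
    """Number of distinct pairs among `people`: n * (n - 1) / 2."""
    return people * (people - 1) // 2


for people in (2, 5, 10, 50):
    print(f"{people:>3} people: {communication_paths(people):,} paths")
```

Going from 10 to 50 people is a 5x increase in headcount but roughly a 27x increase in possible pairs, which is why team structure tends to be introduced to cut the number of paths anyone has to maintain.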
Complexity Costs
5 services: Simple dependency graph
50 services: Complex interactions
500 services: Nobody understands the full system
Debugging time, incident resolution, and cognitive load all increase non-linearly.
Blast Radius
Small system: Incident affects 1K users
Large system: Incident affects 1M users
Impact scales with size, requiring more investment in reliability.
The Diseconomy Curve
[Figure: cost per user (y-axis) against users (x-axis, 10K to 1M). The curve rises steadily as users grow.]
This is the dangerous scenario. Growth becomes unprofitable.
Real Cost Curves
Most systems have all three patterns:
[Figure: cost per user (y-axis) against users (x-axis). The curve falls with economies of scale, jumps at a step function (architecture), falls again with economies of scale, then rises as diseconomies (complexity) set in.]
The art is:
- Extend economies of scale as long as possible
- Prepare for step functions before you hit them
- Avoid diseconomies through smart architecture
Leverage Points
Leverage points are investments that change the shape of your cost curve.
Automation
Before automation:
Cost to deploy: $100 (manual process)
Deploys/month: 100
Monthly cost: $10,000
After automation ($50K investment):
Cost to deploy: $1 (automated)
Deploys/month: 1,000
Monthly cost: $1,000
Payback: about 5.6 months ($50K / $9K saved per month)
Automation converts variable costs to fixed costs, enabling economies of scale.
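Payback is the investment divided by the monthly saving. A minimal sketch with the figures above, which come out between five and six months; `payback_months` is an illustrative name:

```python
def payback_months(investment, monthly_cost_before, monthly_cost_after):
    """Months until cumulative savings cover the up-front investment."""
    monthly_savings = monthly_cost_before - monthly_cost_after
    if monthly_savings <= 0:
        raise ValueError("no monthly savings, so the investment never pays back")
    return investment / monthly_savings


before = 100 * 100    # $100 per deploy x 100 deploys/month
after = 1 * 1_000     # $1 per deploy x 1,000 deploys/month
print(f"Payback: {payback_months(50_000, before, after):.1f} months")
```

This compares total monthly spend, even though the automated process ships ten times as many deploys. Comparing at equal deploy volume would make the saving look larger still.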
Caching
Before caching:
Database queries: 1M/day
Cost per query: $0.001
Daily cost: $1,000
After caching (90% hit rate):
Database queries: 100K/day
Cache cost: $100/day
Total daily cost: $200
Savings: 80%
Caching shifts load from expensive resources to cheap resources.
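The same numbers can be written as a small model, which also gives the hit rate below which the cache costs more than it saves. This assumes the cache has a flat daily cost and that only misses reach the database; both function names are illustrative:

```python
def daily_cost_with_cache(queries_per_day, cost_per_query, hit_rate, cache_cost_per_day):
    """Only cache misses reach the database; the cache itself has a flat daily cost."""
    misses = queries_per_day * (1 - hit_rate)
    return misses * cost_per_query + cache_cost_per_day


def break_even_hit_rate(queries_per_day, cost_per_query, cache_cost_per_day):
    """Hit rate at which the avoided query cost equals the cache cost."""
    return cache_cost_per_day / (queries_per_day * cost_per_query)


before = 1_000_000 * 0.001
after = daily_cost_with_cache(1_000_000, 0.001, 0.90, 100)
print(f"before ${before:,.0f}/day, after ${after:,.0f}/day, savings {1 - after / before:.0%}")
print(f"break-even hit rate: {break_even_hit_rate(1_000_000, 0.001, 100):.0%}")
```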
Architecture
Monolith at 10K users:
Single large instance: $5,000/month
Scales vertically: $$$ per increment
Microservices at 10K users:
Multiple small instances: $6,000/month
Scales horizontally: $ per increment
Microservices cost more initially but scale more efficiently.
Multi-tenancy
Single-tenant:
Cost per customer: $500/month (dedicated resources)
100 customers: $50,000/month
Multi-tenant:
Base cost: $10,000/month
Per-customer increment: $50/month
100 customers: $15,000/month
Savings: 70%
Multi-tenancy is a leverage point for SaaS businesses.
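Because multi-tenancy trades a fixed base cost for a lower per-customer cost, there is a customer count below which single-tenant is still cheaper. A sketch using the example figures above as defaults (function names are illustrative):

```python
def single_tenant_cost(customers, per_customer=500):
    """Monthly cost when every customer gets dedicated resources."""
    return customers * per_customer


def multi_tenant_cost(customers, base=10_000, per_customer=50):
    """Monthly cost of a shared platform: fixed base plus a small increment."""
    return base + customers * per_customer


def break_even_customers(base=10_000, single_per_customer=500, multi_per_customer=50):
    """Smallest customer count at which multi-tenant is strictly cheaper."""
    return base // (single_per_customer - multi_per_customer) + 1


n = break_even_customers()
print(f"Multi-tenant is cheaper from {n} customers")
print(f"At 100 customers: ${single_tenant_cost(100):,} vs ${multi_tenant_cost(100):,}")
```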
Planning for Scale
Map Your Cost Curve
For each major cost driver, understand the shape:
| Component | Current | 10x Scale | Growth | Shape |
|---|---|---|---|---|
| Compute | $10K | $40K | 4x | Economy |
| Database | $5K | $50K | 10x | Step function |
| Storage | $2K | $20K | 10x | Linear |
| Bandwidth | $1K | $5K | 5x | Economy |
| Support | $20K | $250K | 12.5x | Diseconomy |
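One rough way to label a component's shape is the scaling exponent implied by two points: below 1 is an economy, about 1 is linear, above 1 is a diseconomy. The sketch below is an assumption-laden shortcut, not a standard method: the `tolerance` band is arbitrary, and two points cannot reveal a step function, only the average slope between them.

```python
import math


def scaling_exponent(cost_now, cost_at_scale, scale_factor=10.0):
    """Exponent k such that cost grows like usage**k between the two points."""
    return math.log(cost_at_scale / cost_now) / math.log(scale_factor)


def classify(exponent, tolerance=0.05):
    """Label the curve; `tolerance` is an arbitrary band around linear."""
    if exponent < 1 - tolerance:
        return "economy"
    if exponent > 1 + tolerance:
        return "diseconomy"
    return "linear"


# Compute and bandwidth figures from the table above.
for name, now, at_10x in (("compute", 10_000, 40_000), ("bandwidth", 1_000, 5_000)):
    k = scaling_exponent(now, at_10x)
    print(f"{name}: exponent {k:.2f} -> {classify(k)}")
```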
Identify Upcoming Step Functions
Current state: 5K QPS
Database tier threshold: 10K QPS
Time to threshold: 6 months
Cost impact: 5x database cost
Lead time to prepare: 3 months
Action: Start planning now
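The time to a threshold can be estimated from the current load and a growth rate. The sketch below assumes compound monthly growth; the 12% rate is a hypothetical input chosen to land near the six months above, not a figure from the example.

```python
import math


def months_until(current, threshold, monthly_growth):
    """Months until `current` reaches `threshold` under compound monthly growth."""
    if current >= threshold:
        return 0.0
    if monthly_growth <= 0:
        raise ValueError("load never reaches the threshold without growth")
    return math.log(threshold / current) / math.log(1 + monthly_growth)


def months_of_slack(current, threshold, monthly_growth, lead_time_months):
    """Time left before preparation must start; negative means you are late."""
    return months_until(current, threshold, monthly_growth) - lead_time_months


# 5K QPS today, 10K QPS threshold, 3 months of lead time, assumed 12% growth.
print(f"Threshold in {months_until(5_000, 10_000, 0.12):.1f} months")
print(f"Slack before work must start: {months_of_slack(5_000, 10_000, 0.12, 3):.1f} months")
```

Re-running this each month with the observed growth rate gives an early warning when growth accelerates and the slack shrinks.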
Calculate Unit Economics at Scale
Current:
Revenue per user: $10/month
Cost per user: $3/month
Margin: 70%
At 10x (with economies):
Revenue per user: $10/month
Cost per user: $1.50/month
Margin: 85%
At 100x (hitting diseconomies):
Revenue per user: $10/month
Cost per user: $4/month
Margin: 60%
Know where your margin peaks and where it starts declining.
Invest in Leverage Points
Prioritize investments that improve the cost curve:
Option A: New feature
Revenue impact: +$500K/year
Cost impact: +$100K/year
Net: +$400K/year
Option B: Caching layer
Revenue impact: $0
Cost curve impact: Reduce slope by 30%
Current trajectory: $200K/year cost growth
New trajectory: $140K/year cost growth
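10-year impact: annual costs $600K lower by year 10 (about $3.3M saved in total)
Option A is worth more at first, but its net stays flat while Option B's savings grow every year you keep scaling. On these numbers, B's annual savings pass A's net in year 7.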
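One way to compare the two options is year by year. The sketch below reads "cost growth" as annual costs rising by that many dollars each year, which is an interpretation of the example rather than something it states; under that reading the caching saving widens every year while the feature's net stays flat.

```python
def caching_savings_in_year(year, baseline_growth=200_000, reduced_growth=140_000):
    """Annual saving in a given year if costs rise by a fixed amount each year."""
    return (baseline_growth - reduced_growth) * year


def first_year_caching_wins(feature_net_per_year=400_000, horizon=30):
    """First year in which the caching saving exceeds the feature's annual net."""
    for year in range(1, horizon + 1):
        if caching_savings_in_year(year) > feature_net_per_year:
            return year
    return None


total_saved = sum(caching_savings_in_year(y) for y in range(1, 11))
print(f"Saving in year 10: ${caching_savings_in_year(10):,}")
print(f"Total saved over 10 years: ${total_saved:,}")
print(f"Caching beats the feature's annual net from year {first_year_caching_wins()}")
```

Faster growth steepens the baseline trajectory, which pulls the crossover year earlier.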
Unit Economics
The ultimate test: unit economics at scale.
Revenue per user: $10/month
Cost per user (all-in): $?
Contribution margin = Revenue - Variable costs
Gross margin = Revenue - (Variable + Allocated fixed costs)
Track unit economics as you scale:
| Users | Rev/User | Cost/User | Margin | Note |
|---|---|---|---|---|
| 1K | $10 | $5.00 | 50% | |
| 10K | $10 | $3.00 | 70% | |
| 100K | $10 | $2.00 | 80% | |
| 500K | $10 | $2.50 | 75% | Diseconomy kicking in |
| 1M | $10 | $3.50 | 65% | Need to address |
When margins start declining at scale, you’re hitting diseconomies. Time to invest in leverage points.
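Tracking this can be as simple as recomputing margin at each scale point and flagging the first decline. A minimal sketch over the table above; `ROWS`, `margin`, and `margin_report` are illustrative names:

```python
# (users, revenue per user, cost per user), copied from the table above.
ROWS = [
    (1_000, 10.0, 5.00),
    (10_000, 10.0, 3.00),
    (100_000, 10.0, 2.00),
    (500_000, 10.0, 2.50),
    (1_000_000, 10.0, 3.50),
]


def margin(revenue, cost):
    """Margin as a fraction of revenue."""
    return (revenue - cost) / revenue


def margin_report(rows):
    """Return the peak (users, margin) and the scale points where margin fell."""
    margins = [(users, margin(rev, cost)) for users, rev, cost in rows]
    peak = max(margins, key=lambda pair: pair[1])
    declining = [
        users
        for (users, current), (_, previous) in zip(margins[1:], margins)
        if current < previous
    ]
    return peak, declining


peak, declining = margin_report(ROWS)
print(f"Margin peaks at {peak[0]:,} users ({peak[1]:.0%})")
print("Margin declining at:", ", ".join(f"{u:,}" for u in declining))
```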
Summary
Infrastructure costs don’t scale linearly:
| Pattern | What Happens | Example |
|---|---|---|
| Economy of scale | Costs grow slower than users | Fixed costs, volume discounts |
| Step function | Costs jump at thresholds | Database tiers, team size |
| Diseconomy | Costs grow faster than users | Coordination, complexity |
Planning for scale:
1. Map your cost curve by component
2. Identify upcoming step functions
3. Calculate unit economics at scale
4. Invest in leverage points (automation, caching, architecture)
5. Monitor margins as you grow
The companies that scale profitably understand their cost curve shape and invest to improve it.
10x users should mean 5x costs, not 15x.
That’s the difference between a business that scales and one that doesn’t.