Almost every cloud bill we audit is too high, and almost always for the same reason: nobody ever went back to check. A team picks an instance size during a launch crunch, the app ships, traffic settles into a pattern, and the infrastructure quietly stays sized for a day that never came. Months later the invoice is a line item nobody questions.
Right-sizing isn't a heroic re-architecture. It's a habit: measure what you actually use, match the resources to it, and leave a sensible margin. Here's the process we run.
Measure before you touch anything
You cannot right-size from a feeling. Pull two to four weeks of real metrics before changing a single resource: CPU and memory utilization, p95 and p99 latency, request volume by hour, and storage growth. Most teams discover their "busy" servers idle at 5 to 15 percent CPU all week, with a brief peak they over-provisioned the whole fleet to cover.
Look at the shape of the load, not just the average. A flat, predictable line and a spiky, bursty one call for completely different answers, and averaging them hides both.
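To make "shape, not average" concrete, here is a minimal sketch that summarizes a series of CPU samples and labels it steady or bursty. The function names and the 3x peak-to-average threshold are our own illustrative assumptions, not a standard; tune them to your own metrics.

```python
from statistics import mean, quantiles


def summarize_load(cpu_percent):
    """Summarize CPU utilization samples (0-100). Needs at least two samples."""
    avg = mean(cpu_percent)
    # quantiles(n=100) returns 99 cut points; index 94 is p95, index 98 is p99.
    cuts = quantiles(cpu_percent, n=100, method="inclusive")
    peak = max(cpu_percent)
    return {
        "avg": avg,
        "p95": cuts[94],
        "p99": cuts[98],
        "peak": peak,
        "peak_to_avg": peak / avg if avg else float("inf"),
    }


def classify_shape(summary, bursty_ratio=3.0):
    """Label the load. bursty_ratio is an assumed threshold, not a standard."""
    return "bursty" if summary["peak_to_avg"] >= bursty_ratio else "steady"


# Two synthetic series with similar averages but very different shapes.
flat = [10.0] * 100                 # constant 10 percent
spiky = [5.0] * 95 + [80.0] * 5     # mostly idle, with a brief peak

flat_summary = summarize_load(flat)
spiky_summary = summarize_load(spiky)
print(classify_shape(flat_summary), classify_shape(spiky_summary))
```

The two series average 10 and 8.75 percent, so a dashboard that shows only the mean would treat them as the same workload. The peak-to-average ratio is what separates them.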
Compute: pay for the shape of your load
Once you can see the load, match it:
The most common win is the most boring one: drop one instance size. A service idling at 10 percent CPU does not need that headroom. Step it down, watch p99 latency for a week, and repeat until you have a comfortable margin rather than a luxurious one.
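The step-down loop can be sketched as a small decision function that you run once per observation week. Everything here is a simplifying assumption for illustration: the size ladder is hypothetical, and we assume each step down halves capacity and so roughly doubles peak CPU, which you should verify against your provider's instance family.

```python
# Hypothetical size ladder, smallest first. Real instance names will differ.
SIZES = ["small", "medium", "large", "xlarge"]


def next_size(current, p99_ms, p99_budget_ms, peak_cpu_percent,
              target_peak_cpu=60.0):
    """Pick the size to run for the next observation week.

    Assumes one step down halves capacity, so peak CPU roughly doubles.
    target_peak_cpu is the margin you chose, not a universal constant.
    """
    i = SIZES.index(current)

    # Latency is over budget: the last step went too far, so step back up.
    if p99_ms > p99_budget_ms:
        return SIZES[min(i + 1, len(SIZES) - 1)]

    # Step down only if the projected peak still fits under the target.
    projected_peak = peak_cpu_percent * 2
    if i > 0 and projected_peak <= target_peak_cpu:
        return SIZES[i - 1]

    # Otherwise the margin is already comfortable: hold.
    return current


print(next_size("large", p99_ms=120, p99_budget_ms=200, peak_cpu_percent=10))
```

In this sketch a "large" instance peaking at 10 percent CPU with healthy latency steps down to "medium". Run the same check after a week of data at the new size, and stop when the function returns the current size.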
Storage and bandwidth: the quiet line items
Compute gets the attention; storage and egress quietly pile up.
Leave a margin, not a moat
Right-sizing is not about running everything at 95 percent until the first traffic spike takes you down. The goal is a deliberate margin instead of an accidental one. Headroom should be a decision with a number behind it: enough to absorb a normal spike and a node failure, not enough to host a second copy of your business by accident.
This is where reliability and cost stop being opposites. A well-sized system with autoscaling and a real margin is usually *more* resilient than an oversized static one, because it's designed for the load instead of guessing at it.
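One way to put a number behind the margin is to size for your observed peak plus a normal spike, then add one spare node for failure. The sketch below does that arithmetic; the 1.3x spike factor, the per-instance throughput, and the function names are assumptions to replace with your own measurements.

```python
import math


def required_instances(peak_rps, per_instance_rps, spike_factor=1.3,
                       spare_nodes=1):
    """Instances needed to absorb a normal spike and survive node failures.

    spike_factor and spare_nodes are assumed values; set them deliberately.
    """
    needed = math.ceil(peak_rps * spike_factor / per_instance_rps)
    return needed + spare_nodes


def headroom_percent(instances, peak_rps, per_instance_rps):
    """Share of total capacity left unused at the observed peak."""
    capacity = instances * per_instance_rps
    return 100.0 * (capacity - peak_rps) / capacity


# Example: 900 requests/s at peak, each instance handles about 200 requests/s.
sized = required_instances(peak_rps=900, per_instance_rps=200)
print(sized, round(headroom_percent(sized, 900, 200), 1))
```

With these example numbers the deliberate answer is 7 instances and about 36 percent headroom at peak. A fleet of 12 left over from launch would carry 62.5 percent headroom, which is the accidental moat rather than the margin.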
Make the cost visible
Waste survives in the dark. The teams with the lowest bills are the ones who can see them.
Estimate before you commit
The best time to right-size is before you provision anything. If you're planning a build or a migration and want a ballpark for compute, storage, bandwidth and the AI or DevOps pieces around it, run the numbers first with our cost calculator, then adjust to your real traffic.
For the bigger picture on moving workloads without nasty surprises, see our Cloud Migration Strategies playbook. For why this kind of discipline belongs to the team rather than to a one-time cleanup, see DevOps Culture: The Engine of Modern Engineering.
The takeaway
Cloud overspend is rarely one big mistake. It's a hundred small defaults nobody revisited. Measure what you use, match resources to the shape of the load, tier your storage, watch egress, and keep a deliberate margin. Done as a habit, right-sizing routinely takes 30 to 50 percent off a bill while making the system *more* reliable, not less.
If your bill keeps climbing and nobody can quite say why, that's the signal. Talk to us and we'll help you find the waste and right-size around it.