Cloud Cost Optimization: How to Cut AWS, GCP, and Azure Bills Without Hurting Performance
Cloud bills are one of the most controllable costs in a technology business, yet most engineering teams leave 20–40% of their spend on the table because optimization feels like plumbing work nobody owns. This guide gives you the specific levers to pull, in order of impact and implementation effort, without architectural risk.
Why Cloud Overspend Is So Common
The root cause is almost never waste from a single bad decision. It is accumulated drift: an engineer provisions an m5.4xlarge for a load test and forgets to downsize it; a team spins up a dev environment and never terminates it; a database runs Multi-AZ in a staging environment because the Terraform module defaults to it. Each decision is $15–$200/month. Multiply across a 50-person engineering organization making these decisions dozens of times per quarter and you get a $40,000/month bill that nobody explicitly authorized.
The second cause is the absence of ownership. Cloud costs are shared infrastructure. No individual engineer feels the pain of an oversized instance the way they would feel the pain of a slow API endpoint affecting their users. Unless someone's job includes watching the bill, the bill grows.
The good news: the savings are real and largely non-disruptive. You do not need to re-architect your system to cut 25% of your bill. You need to do the plumbing work.
The Four Categories of Cloud Spend
Before optimizing anything, understand where the money goes. In a typical B2B SaaS company, cloud spend breaks down roughly as follows:
| Category | Typical % of Bill | Optimization Potential | Effort |
|---|---|---|---|
| Compute (EC2, GCE, Azure VMs, Fargate) | 40–55% | 25–50% | Low–Medium |
| Managed databases (RDS, Cloud SQL, Aurora) | 20–30% | 20–40% | Low–Medium |
| Storage (S3, GCS, Azure Blob) | 5–15% | 30–60% | Low |
| Data transfer and networking | 5–15% | 20–40% | Medium–High |
| Managed services (EKS, CloudFront, SQS, etc.) | 10–20% | 10–30% | Variable |
Start with compute and databases. They represent 60–85% of most bills and have the clearest, lowest-risk optimization paths.
Compute Optimization: The Biggest Lever
Reserved Instances and Savings Plans: Zero-Risk 20–30% Savings
If you are running consistent compute workloads on on-demand pricing and have not purchased Reserved Instances or Savings Plans, you are leaving 20–30% of your compute spend on the table with zero operational change required.
The mechanics: cloud providers discount long-term capacity commitments. AWS Compute Savings Plans at the 1-year no-upfront tier deliver roughly a 20% discount vs on-demand; at 3-year all-upfront, up to 66%. GCP Committed Use Discounts (CUDs) provide 37% savings on compute for a 1-year commitment and up to 55% for 3-year. Azure Reserved VMs offer 36–45% over 1–3 years.
Practical approach: run your cloud provider's recommendations engine (AWS Cost Explorer Savings Plans recommendations, GCP Cost Optimization recommendations, Azure Advisor). The tool analyzes your actual usage over the prior 30 days and recommends a commitment level. Start conservatively — commit to 70–80% of your baseline, not 100%. This gives you headroom to downsize workloads without stranded commitment.
Expected savings: $3,000–$15,000/month for a company spending $50,000/month on compute. Timeline to savings: visible on next invoice.
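The "commit to 70–80% of baseline" rule above can be sketched as a small helper. This is an illustrative calculation, not a replacement for the provider's recommendation engines: the percentile choice and coverage fraction are assumptions you should tune against your own usage history.

```python
def recommended_commitment(hourly_spend, coverage=0.75, baseline_pct=10):
    """Suggest an hourly Savings Plan commitment from on-demand spend history.

    hourly_spend: list of hourly on-demand compute costs (USD/hour),
    e.g. 30 days of Cost Explorer data.
    baseline_pct: low percentile treated as the always-on floor (assumption).
    coverage: fraction of that floor to commit -- 0.70-0.80 per the text.
    """
    s = sorted(hourly_spend)
    # Nearest-rank index of the chosen percentile.
    idx = max(0, int(len(s) * baseline_pct / 100) - 1)
    baseline = s[idx]
    return round(baseline * coverage, 2)
```

Using a low percentile rather than the mean keeps burst capacity on on-demand (or Spot), so a downsized workload never leaves you with stranded commitment.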
Rightsizing: Match Instance Size to Actual Usage
The most common waste pattern in cloud infrastructure is instances running at 5–15% average CPU utilization because they were sized for peak load that never materializes. An m5.2xlarge at $0.384/hour running at 8% average CPU is delivering the performance of an m5.large ($0.096/hour) at a 4x cost premium.
How to identify candidates: AWS Compute Optimizer, GCP Recommender, and Azure Advisor all provide rightsizing recommendations based on actual utilization metrics. Set the analysis window to 14 days minimum to capture weekly patterns. Flag any instance with p99 CPU below 40% as a candidate for downsizing.
The risk: rightsizing requires understanding burst patterns. An instance that averages 10% CPU but spikes to 85% for 5 minutes each hour must be sized for the spike, not the average. Always check p99 and max CPU, not just average. For web servers behind load balancers, scale out horizontally (more small instances) rather than scaling up — this also improves availability.
Expected savings: 15–25% of compute spend. Timeline: 2–4 weeks for analysis and rollout.
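The screening rule above — flag on p99, not average — can be expressed directly. A minimal sketch, assuming CPU samples pulled from CloudWatch or an equivalent metrics store; the 40% threshold comes from the text:

```python
import statistics

def rightsizing_candidate(cpu_samples, p99_threshold=40.0):
    """Flag an instance as a downsize candidate: p99 CPU below the
    threshold. Averages alone hide hourly bursts, so report all three."""
    s = sorted(cpu_samples)
    p99 = s[min(len(s) - 1, int(len(s) * 0.99))]  # nearest-rank p99
    return {
        "avg": round(statistics.mean(cpu_samples), 1),
        "p99": p99,
        "max": s[-1],
        "downsize": p99 < p99_threshold,
    }
```

The second case below is the trap described in the text: a 13.8% average looks like waste, but the hourly spike to 85% means the instance is sized correctly.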
Spot and Preemptible Instances for Batch Workloads
Spot Instances (AWS) and Preemptible VMs (GCP) offer 60–90% discounts vs on-demand pricing in exchange for the provider's right to reclaim the instance with 2-minute notice. For workloads that tolerate interruption, this discount is substantial.
Excellent Spot candidates: CI/CD build runners, ML training jobs, data ETL pipelines, nightly batch reports, video transcoding queues, and dev/test environments. Poor candidates: production databases, real-time API servers without graceful shutdown, anything with sessions that cannot be checkpointed.
For AWS, use Spot Instance pools across multiple instance families and availability zones to minimize interruption probability. Spot interruption rates vary by region and instance type — check the AWS Spot Instance Advisor before committing to a pool. In practice, well-designed multi-pool Spot configurations see interruption rates below 5% of instance-hours.
Expected savings: 60–80% on eligible workloads, which typically represent 15–30% of total compute spend. Net effect: 10–20% reduction in total compute bill.
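The net-effect arithmetic above is worth making explicit, since the headline "60–90% off" only applies to the eligible slice of the bill. A minimal sketch, where the eligible fraction and discount are the assumptions:

```python
def blended_compute_savings(total_compute, spot_eligible_frac, spot_discount=0.70):
    """Estimate the net monthly bill reduction from moving eligible
    batch work to Spot.

    spot_eligible_frac: share of compute spend that tolerates interruption
    (15-30% per the text); spot_discount: 0.60-0.80 is typical.
    """
    savings = round(total_compute * spot_eligible_frac * spot_discount, 2)
    return {"monthly_savings": savings,
            "pct_of_total": round(100 * savings / total_compute, 1)}
```

For a $50,000/month compute bill with 20% Spot-eligible workloads at a 70% discount, this lands at a 14% total reduction — inside the 10–20% range quoted above.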
Database Optimization: Often the Biggest Surprise
RDS and Aurora: The Oversizing Problem
Managed databases are frequently over-provisioned because database administrators default to "better safe than sorry" sizing at launch, then never revisit. An RDS db.r6g.4xlarge (128GB RAM, $1.04/hour) runs about $760/month. If your actual peak memory usage is 40GB, a db.r6g.xlarge (32GB RAM, $0.26/hour) plus a read replica handles the load at about $380/month — a 50% savings.
RDS Reserved Instances deliver the same discount structure as EC2 Reserved Instances: 28–40% on 1-year commitments. Apply these after rightsizing — buy reservations for the right size, not the current size.
Multi-AZ in non-production environments is one of the cleanest savings opportunities. Multi-AZ doubles your database cost for standby failover capability. In dev, staging, and QA environments, this is rarely warranted. Auditing and disabling Multi-AZ on non-production databases typically saves $200–$2,000/month depending on instance size and environment count.
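The Multi-AZ audit described above amounts to filtering database instances on two attributes. A minimal sketch — the input dicts mirror a simplified subset of `aws rds describe-db-instances` output joined with an environment tag; the field names here are illustrative, not the exact API shape:

```python
def nonprod_multi_az(instances, prod_envs=("prod", "production")):
    """Return IDs of databases running Multi-AZ outside production --
    candidates for disabling standby failover to halve their cost."""
    return [i["id"] for i in instances
            if i.get("multi_az") and i.get("env", "").lower() not in prod_envs]
```

Anything this returns is paying double for failover capability that dev, staging, and QA environments rarely need.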
Manual Snapshots: Silent Storage Cost
AWS RDS automated snapshots expire per retention window. Manual snapshots do not — they persist until explicitly deleted. Many teams have dozens of manual snapshots taken for point-in-time recovery or pre-deployment safety, never cleaned up. A 500GB database with 20 manual snapshots can hold up to 10TB of snapshot storage (snapshots are incremental, so actual usage is often lower, but full-size copies are the worst case) at $0.095/GB/month in US-East — up to $950/month in forgotten snapshots.
Run `aws rds describe-db-snapshots --snapshot-type manual` and review creation dates. Delete snapshots older than your recovery window requirement. This is immediate savings with zero risk if you understand what the snapshot was for.
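The age filter described above is easy to script once the CLI output is parsed. A minimal sketch, assuming the snapshot list has already been reduced to `(snapshot_id, created_at)` pairs — in practice parsed from the JSON that `aws rds describe-db-snapshots --snapshot-type manual` emits:

```python
from datetime import datetime, timedelta, timezone

def stale_snapshots(snapshots, max_age_days=90, now=None):
    """Return snapshot IDs older than the recovery window.

    snapshots: list of (snapshot_id, created_at) tuples with
    timezone-aware datetimes. max_age_days should match your
    documented recovery requirement, not a guess.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [sid for sid, created in snapshots if created < cutoff]
```

Review the returned list against change history before deleting — the "zero risk" caveat above depends on knowing why each snapshot was taken.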
Storage Optimization: S3, GCS, and Azure Blob
Intelligent Tiering and Lifecycle Policies
S3 Standard storage costs $0.023/GB/month. S3 Glacier Instant Retrieval costs $0.004/GB/month — an 83% reduction. S3 Glacier Deep Archive is $0.00099/GB/month, a 96% reduction. For data that is accessed infrequently — audit logs, old backups, archived media — moving to appropriate storage classes is pure savings with no access impact.
S3 Intelligent-Tiering automatically moves objects between Standard and Infrequent Access tiers based on access patterns, with no retrieval fees. For mixed-access patterns where you cannot predict which objects will be needed, it is the safest automatic optimization. Cost: $0.0025 per 1,000 objects monitored — negligible for most workloads.
Implement lifecycle policies to transition old objects automatically. A typical effective policy: objects over 30 days → Standard-IA; over 90 days → Glacier Instant Retrieval; over 365 days → Glacier Deep Archive (for compliance archives). For a bucket holding 50TB of mixed data, this commonly reduces storage cost by 40–60% within 90 days.
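The tiering schedule above can be expressed in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The storage-class identifiers below are the real S3 names, but treat the rest as a sketch and verify against current AWS documentation before applying it to a production bucket:

```python
# Lifecycle configuration implementing the 30/90/365-day schedule above.
# An empty Filter applies the rule to every object in the bucket --
# scope it to a prefix if the bucket mixes hot and cold data.
LIFECYCLE_CONFIG = {
    "Rules": [{
        "ID": "archive-tiering",
        "Status": "Enabled",
        "Filter": {},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER_IR"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
        ],
    }]
}
```

Applied via `s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=LIFECYCLE_CONFIG)`, transitions take effect asynchronously; expect the bill impact to phase in over the following billing cycles rather than immediately.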
EBS Volume Cleanup
When an EC2 instance is terminated, its root EBS volume may or may not be deleted depending on how the instance was configured. Unattached EBS volumes accumulate silently — gp3 storage costs $0.08/GB/month regardless of attachment status. A 500GB volume lingering after an instance was terminated costs $40/month and provides no value.
Run `aws ec2 describe-volumes --filters Name=status,Values=available` to list unattached volumes. Review creation dates and any Name tags. Volumes with no name and creation dates over 30 days in the past are almost always safe to delete. Take a final snapshot first if there is any uncertainty.
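Pricing out the findings makes the cleanup case concrete. A minimal sketch, assuming gp3 volumes at the us-east-1 list price — other volume types and regions differ, and the input pairs would in practice come from the `describe-volumes` output above:

```python
GP3_PER_GB_MONTH = 0.08  # us-east-1 gp3 list price (assumption; verify)

def unattached_volume_cost(volumes):
    """Monthly carrying cost of unattached EBS volumes.

    volumes: iterable of (volume_id, size_gib) pairs.
    """
    return {vid: round(size * GP3_PER_GB_MONTH, 2) for vid, size in volumes}
```

This reproduces the example in the text: a single forgotten 500GB volume is $40/month of pure waste.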
Data Transfer and Networking Costs
Data transfer is the cost category most engineers underestimate because it is invisible until the bill arrives. Key sources of transfer cost:
Cross-AZ Traffic
AWS charges $0.01/GB in each direction for data transferred between Availability Zones — effectively $0.02 for every GB moved. This sounds trivial, but a microservices architecture with chatty inter-service communication can generate hundreds of GB per day of cross-AZ traffic. If your services are running in multiple AZs and making frequent RPC calls to each other, consider deploying services in a single AZ for non-HA workloads, or colocating communicating services in the same AZ.
For services that must be multi-AZ, examine whether service mesh or internal load balancer placement can reduce cross-AZ hops. Disabling cross-zone load balancing — configurable per target group on Application Load Balancers, and off by default on Network Load Balancers — keeps traffic flowing to targets in the same AZ as the load balancer node, reducing cross-AZ transfer at the cost of slightly less even load distribution.
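Because each GB is billed on both sides of the transfer, the effective rate is $0.02/GB, and the monthly total adds up faster than teams expect. A minimal sketch of the estimate, assuming the standard AWS cross-AZ charge:

```python
CROSS_AZ_PER_GB = 0.02  # $0.01/GB billed on each side of the transfer

def cross_az_monthly_cost(gb_per_day, days=30):
    """Monthly cross-AZ transfer cost for a given daily volume."""
    return round(gb_per_day * days * CROSS_AZ_PER_GB, 2)
```

At 300 GB/day of chatty inter-service traffic — easily reached by a mid-sized microservices fleet — that is $180/month from a cost line most dashboards never surface.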
Data Egress to the Internet
AWS charges $0.09/GB for the first 10TB of data transferred out to the internet per month. GCP charges $0.08–$0.12/GB. Azure charges $0.087/GB for the first 10TB. If your product delivers large media files, exports, or reports directly from S3 or GCS, a CDN dramatically reduces origin egress costs.
CloudFront delivery starts at $0.085/GB in North America and tiers down with volume, and data transferred from S3 to CloudFront within the same region is free. For a product delivering 100TB/month of content: roughly $7,800/month in tiered S3 egress versus roughly $7,050/month in list-price CloudFront delivery plus $0 in S3-to-CloudFront transfer. The list-price gap is modest, but CloudFront's committed-spend discounts (such as the CloudFront Security Savings Bundle) and negotiated private pricing widen it considerably at sustained volumes — and the CDN improves delivery latency for end users at the same time.
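Because both S3 egress and CloudFront delivery are tiered, flat-rate mental math overstates or understates the comparison. A minimal sketch of the tiered arithmetic, using us-east-1 / North America list prices as of writing — verify current pricing before relying on the figures:

```python
# (tier_size_gb, price_per_gb) buckets, filled in order.
S3_EGRESS_TIERS = [(10_000, 0.09), (40_000, 0.085), (100_000, 0.07)]
CF_DELIVERY_TIERS = [(10_000, 0.085), (40_000, 0.080), (100_000, 0.060)]

def tiered_cost(gb, tiers):
    """Apply volume-tiered pricing: each bucket fills before the next."""
    cost, remaining = 0.0, gb
    for size, price in tiers:
        used = min(remaining, size)
        cost += used * price
        remaining -= used
        if remaining <= 0:
            break
    return round(cost, 2)
```

For 100TB/month this yields about $7,800 in S3 egress against about $7,050 in CloudFront delivery at list price; committed-spend or private pricing is where the larger savings come from.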
Cross-Cloud Optimization Patterns
| Optimization | AWS Tool | GCP Tool | Azure Tool | Typical Savings |
|---|---|---|---|---|
| Compute commitments | Savings Plans / RIs | Committed Use Discounts | Reserved VMs | 20–55% |
| Rightsizing recommendations | Compute Optimizer | GCP Recommender | Azure Advisor | 10–25% |
| Spot/preemptible compute | Spot Instances | Preemptible VMs | Azure Spot VMs | 60–90% on eligible |
| Storage tiering | S3 Intelligent-Tiering | Cloud Storage classes | Azure Blob tiers | 40–80% on cold data |
| Cost visibility | Cost Explorer + Anomaly Detection | Cloud Billing Reports + Budget Alerts | Cost Management + Advisor | Indirect (prevents future waste) |
Building a FinOps Practice: Making Savings Stick
One-time cloud cost audits produce one-time savings. The bill drifts back up within 6–12 months without ongoing ownership. Building a lightweight FinOps practice — even without a dedicated team — prevents recidivism.
Tag Everything, Budget Per Team
Cloud cost tags (AWS resource tags, GCP labels, Azure tags) enable per-team, per-product, or per-environment cost attribution. Without tags, cost is a single undifferentiated number that nobody owns. With tags, the data team sees their Redshift and Glue costs, the product team sees their ECS and RDS costs, and each team is accountable for their own line item.
Set up budget alerts at 80% of expected monthly spend per tag group. A $500 anomaly alert catches the forgotten dev cluster before it runs for a month unnoticed.
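The 80%-of-budget alert rule above reduces to a comparison per tag group. A minimal sketch, assuming month-to-date spend has already been aggregated by tag (from Cost Explorer, BigQuery billing export, or Azure Cost Management — the dict shapes here are illustrative):

```python
def over_budget(costs_by_tag, budgets, threshold=0.80):
    """Flag tag groups whose month-to-date spend has crossed the alert
    threshold (80% of expected monthly spend, per the text).

    Tags missing from budgets are never flagged -- in practice,
    untagged spend deserves its own alert.
    """
    return {tag: spend for tag, spend in costs_by_tag.items()
            if spend >= budgets.get(tag, float("inf")) * threshold}
```

In practice you would let the cloud provider's native budget alerts do this continuously; the value of sketching it is seeing that the whole mechanism is one comparison, so there is no excuse to skip setting it up.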
Monthly Cost Review as an Engineering Ritual
Assign one engineer per team to own cloud costs for a quarter. Their job: review the monthly bill, run Trusted Advisor/Recommender, flag any new high-cost resources, and present a 10-minute summary at the monthly all-hands. Rotating ownership distributes cost awareness across the engineering organization without a dedicated FinOps hire, which is rarely justified below roughly $500,000/year in cloud spend.
Automate the basics: set up Cost Anomaly Detection (AWS), Budget Alerts (GCP/Azure), and weekly spending reports emailed to engineering leadership. These are 30-minute setup tasks that eliminate the most common scenario — discovering a $15,000 overage at month end that was running for 3 weeks.
Where to Start: A Prioritized Action List
- Week 1 — Idle resource cleanup: Run Trusted Advisor, Compute Optimizer, or Azure Advisor. Delete unattached EBS volumes, unused Elastic IPs, idle load balancers, and manual snapshots older than 90 days. Immediate savings, zero risk.
- Week 2 — Purchase Savings Plans: Run Cost Explorer's Savings Plans recommendations. Purchase 1-year Compute Savings Plans for 70–80% of baseline compute. Zero operational change, visible savings next billing cycle.
- Week 3 — Enable storage lifecycle policies: Set S3 lifecycle rules on the top 5 buckets by size. Enable Intelligent-Tiering on mixed-access buckets. Transition log and archive data to Glacier.
- Week 4 — Rightsize top 10 instances: Use Compute Optimizer to identify the highest-waste instances. Rightsize during the next maintenance window. Test at off-peak hours first.
- Month 2 — Implement tagging and budget alerts: Tag all resources by team, environment, and product. Set budget alerts. Assign quarterly cost owners per team.
- Month 3 — Migrate batch workloads to Spot: Identify CI/CD runners, batch jobs, and ML training pipelines. Migrate to Spot Instance groups with multi-pool configurations. Target 60–70% cost reduction on these workloads.
Frequently Asked Questions
How much can a typical B2B company realistically save on cloud costs?
In our experience, most companies spending $20,000–$200,000/month on cloud infrastructure can achieve 25–40% savings within 3–6 months without architectural changes. The easiest wins come from Reserved Instance or Committed Use Discount purchasing (15–30% savings on compute with zero operational change), rightsizing oversized instances (10–20% savings), and eliminating idle resources such as unused EBS volumes, unattached Elastic IPs, and orphaned snapshots (5–10% savings). Companies willing to make architectural investments — containerization, spot instance usage for batch workloads, S3 intelligent tiering — can achieve 40–60% reductions, though these require 3–6 months of engineering work.
What is the fastest way to reduce AWS costs today?
The fastest zero-risk AWS cost reduction is purchasing Savings Plans or Reserved Instances for your baseline compute. Run AWS Cost Explorer's Savings Plans recommendations (takes 5 minutes), review the 1-year no-upfront option, and purchase. This typically produces 20–30% savings on EC2 and RDS with no operational change, visible on your next monthly bill. The second fastest is deleting idle resources: run AWS Trusted Advisor, sort by cost, and terminate unattached EBS volumes, unused Elastic IPs, and NAT Gateways in regions you no longer use. This takes 1–2 hours and commonly yields $500–$5,000/month in immediate savings.
Is using Spot Instances safe for production workloads?
Spot Instances (AWS) and Preemptible VMs (GCP) are safe for certain production workloads if your application is designed to handle interruptions. Stateless web servers, batch processing jobs, CI/CD runners, ML training jobs, and data pipeline workers are all excellent candidates — AWS provides a 2-minute warning before reclaiming a Spot instance, which is sufficient time for graceful shutdown in most stateless designs. Databases, stateful message brokers, and anything requiring persistent connection state are poor candidates. A common architecture: run your baseline capacity on Reserved Instances and add Spot capacity for burst handling, using Auto Scaling groups that mix instance types from multiple pools to minimize interruption probability.
How do Reserved Instances differ from Savings Plans, and which should I buy?
Reserved Instances (RIs) lock you to a specific instance type, region, and sometimes OS. They offer the highest discount (up to 72% vs on-demand for 3-year all-upfront) but require you to forecast your exact capacity needs precisely. Savings Plans are more flexible: Compute Savings Plans apply across EC2, Fargate, and Lambda regardless of instance type or region, at up to 66% discount. EC2 Instance Savings Plans are slightly less flexible but offer up to 72% discount. The recommendation for most teams: buy Compute Savings Plans to cover 70–80% of your baseline compute spend. They accommodate instance family changes and region shifts without forfeiture. Only use RIs if you know with certainty that you will run the same instance type in the same region for 3 years.
Need Help Deciding? Get a Free Estimate from TechConcepts
Cloud cost optimization is one of the highest-ROI engineering investments a B2B company can make. If you want a no-strings-attached assessment of your cloud spend — where the waste is, what to fix first, and what the realistic savings look like — reach out and we will take a look.
Get a free estimate →