Scope (Cost Optimization)
What is in scope?
The Cost Optimization pillar of WAF++ addresses:
| Area | Scope |
|---|---|
| Cloud Infrastructure Costs | Compute, storage, network, database, and managed services across all major cloud providers (AWS, Azure, GCP) and on-premises hybrid setups. |
| Observability Costs | Logging, tracing, metrics, APM tools. Often disproportionately expensive due to uncontrolled retention. |
| Data Transfer & Egress Costs | Cross-region, cross-AZ, internet egress, CDN. Often underestimated and only visible on the bill. |
| Software License Costs (cloud-bound) | Bring-Your-Own-License (BYOL), provider marketplace licenses, enterprise agreements tied to cloud usage. |
| Operational Effort (FTE) | Operating hours directly attributable to infrastructure. Part of the TCO model. |
| Architectural Cost Debt | Cost structures locked in by past architecture decisions. Documented in the cost debt register. |
| FinOps Governance | Review cycles, ownership structures, budget processes, ADR cost impact assessments. |
What is not in scope?
| Area | Reason |
|---|---|
| General IT operational costs | On-premises data center, hardware, and network infrastructure with no cloud relation; covered by the Operations pillar. |
| HR and personnel costs (total) | FTE costs as a TCO component are in scope; general salary planning belongs to HR/Finance. |
| Business case creation for projects | ROI decisions lie with Product/Finance. WAF++ provides the cost data basis, not the decision. |
| Cloud provider contract negotiations | Enterprise agreements and discount negotiations belong to procurement/purchasing. |
| Security costs as savings | Security investments are assessed in the Security pillar, not as a cost optimization measure. |
Brownfield vs. Greenfield: Fundamental Differences
The application of cost optimization measures differs fundamentally depending on the starting situation.
| Dimension | Brownfield (existing infrastructure) | Greenfield (new build) |
|---|---|---|
| Tagging enforcement | Retroactive tagging requires discovery, inventory, and a coordinated rollout campaign. Risk: untagged legacy resources remain invisible. | Tagging from day 0 as an IaC standard. The CI gate blocks any deployment without mandatory tags. |
| Architectural Cost Debt | High probability of existing debt: uncontrolled retention, over-sized reservations, inefficient patterns. | Cost debt arises only from new decisions. Cost impact assessments prevent accumulation. |
| Budget control | Existing budgets have often grown historically without clear workload attribution. Restructuring is politically complex. | Budgets can be structured from the start by workload, environment, and team. |
| Rightsizing | Many resources have gone unreviewed for years. Quick wins through simple downsizing are often realistic. | Resources are initially sized based on measured load. First reviews after 30–90 days of operation. |
| Reservations | Existing reservations may not match current workloads. Analysis and restructuring required. | Commitments only after 30 days of baseline measurement. No premature long-term reservations. |
| Compliance effort | Higher: discovery phase, inventory, political coordination with affected teams. | Lower: standards implemented from the start; no legacy debt. |
| Time horizon | Structural improvements: 6–18 months. Quick wins (idle shutdown, rightsizing): 30–90 days. | Full compliance: 0–90 days (if implemented consistently). |
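The greenfield rule for reservations (no commitments before 30 days of baseline measurement) can be expressed as a simple date check. The following is a minimal sketch; the function name `ready_for_commitment` and its parameters are hypothetical and not part of WAF++.

```python
from datetime import date

# From the table above: commitments only after 30 days of baseline measurement.
BASELINE_DAYS = 30


def ready_for_commitment(first_measurement: date, today: date,
                         baseline_days: int = BASELINE_DAYS) -> bool:
    """Return True once at least `baseline_days` of measurements exist.

    `first_measurement` is the day load metrics were first collected for the
    workload (hypothetical input; how it is obtained is out of scope here).
    """
    return (today - first_measurement).days >= baseline_days
```

A check like this could run in a review script before anyone purchases a reservation or savings plan; it only guards the timing and says nothing about how large the commitment should be.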
Brownfield Decision Tree
Brownfield Cost Assessment Start
│
├── Step 1: Check tagging compliance
│ ├── < 50% tagged → Tagging campaign before all other measures
│ └── > 80% tagged → Continue to step 2
│
├── Step 2: Check budget alerting
│ ├── No budget → Set up budget via IaC immediately (WAF-COST-020)
│ └── Budget exists → Check workload granularity
│
├── Step 3: Idle/Waste Discovery
│ ├── Idle instances present → Shut down or rightsize within 30 days
│ ├── Unused reserved instances → Plan for restructuring
│ └── No retention policies → Lifecycle policies as priority
│
├── Step 4: Identify Architectural Cost Debt
│ ├── High-cost services without clear SLO justification → Cost debt register entry
│ ├── No ADR history on costs → Introduce process for new ADRs
│ └── Known lock-in situations → Assessment and paydown plan
│
└── Step 5: Establish FinOps cycle
  ├── Monthly reviews not yet established → Start with simple format
  └── Architecture board without cost review → Extend with quarterly cost review
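Steps 1 to 3 of the tree above can be sketched as a small function that turns an assessment snapshot into an ordered action list. This is an illustration only: the `BrownfieldState` fields and the function name are hypothetical, and the tree does not say what happens between 50% and 80% tagging coverage, so the handling of that band below is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BrownfieldState:
    """Hypothetical snapshot of a brownfield assessment."""
    tagged_pct: float             # share of resources with mandatory tags, 0-100
    has_budget: bool              # a budget with alerting exists
    idle_instances: bool          # idle instances were discovered
    unused_reservations: bool     # unused reserved instances were discovered
    has_retention_policies: bool  # storage/log retention policies exist


def next_actions(state: BrownfieldState) -> List[str]:
    """Return the actions for steps 1-3 of the decision tree, in order."""
    # Step 1: below 50% coverage, tagging comes before everything else.
    if state.tagged_pct < 50:
        return ["Run tagging campaign before all other measures"]

    actions: List[str] = []
    if state.tagged_pct <= 80:
        # ASSUMPTION: the tree leaves the 50-80% band undefined; this sketch
        # continues with the other steps while tagging work goes on.
        actions.append("Continue raising tagging coverage (50-80% band: assumption)")

    # Step 2: budget alerting.
    if not state.has_budget:
        actions.append("Set up budget via IaC immediately (WAF-COST-020)")
    else:
        actions.append("Check workload granularity of existing budget")

    # Step 3: idle/waste discovery.
    if state.idle_instances:
        actions.append("Shut down or rightsize idle instances within 30 days")
    if state.unused_reservations:
        actions.append("Plan restructuring of unused reserved instances")
    if not state.has_retention_policies:
        actions.append("Prioritize lifecycle policies")
    return actions
```

Steps 4 and 5 are left out because they depend on organizational judgment (SLO justification, ADR history, review cadence) rather than on values that can be read from an inventory.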
Greenfield Checklist (Day 0)
Before the first terraform apply:
- Tagging taxonomy defined (cost-center, owner, environment, workload, project)
- Mandatory tag module anchored in IaC template
- Budget created as IaC resource
- Lifecycle policies for all storage and log resources as default
- ADR template with cost impact assessment section available
- Rightsizing tag requirement defined (with date field)
- FinOps review cycle anchored in team processes
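The mandatory-tag item in the checklist above could be enforced in a CI gate along the following lines. This is a sketch under assumptions: the tag keys come from the taxonomy listed in the checklist, while the function names and the input shape (a mapping of resource name to tag dictionary, for example extracted from a plan file) are hypothetical.

```python
from typing import Dict, List

# Tag keys taken from the taxonomy in the checklist.
MANDATORY_TAGS = ("cost-center", "owner", "environment", "workload", "project")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return mandatory tag keys that are absent or blank, in taxonomy order."""
    missing = []
    for key in MANDATORY_TAGS:
        value = tags.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


def failing_resources(resources: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each non-compliant resource name to its missing mandatory tags."""
    failures = {}
    for name, tags in resources.items():
        missing = missing_tags(tags)
        if missing:
            failures[name] = missing
    return failures
```

A pipeline step would fail the build whenever `failing_resources(...)` returns a non-empty mapping, so no deployment proceeds without the mandatory tags.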
Multi-Cloud vs. Single-Cloud: Cost Dynamics
Single-Cloud Cost Drivers
| Driver | Description |
|---|---|
| Enterprise Agreement dependency | EA volume discounts create switching barriers. Leaving the provider often means losing significant discounts. |
| Proprietary managed services | The more proprietary services are used (AWS Kinesis, Azure Service Bus, GCP Spanner), the higher the exit costs. |
| Skill concentration | Teams develop deep expertise in one provider. Switching requires requalification. |
Multi-Cloud Cost Drivers
| Driver | Description |
|---|---|
| Data transfer costs | Cross-provider data transfer is expensive: data moving from AWS to Azure is not free. "Data gravity" applies: workloads tend to move toward the data. |
| Duplicated operational competence | Two providers mean two toolchains, two training budgets, two certification portfolios. |
| Single-cloud workarounds | To avoid dependency, proprietary services are replaced by self-operated alternatives, which are often more expensive to operate. |
| Consistency overhead | Shared security policies, monitoring, and compliance across two providers require significant governance effort. |
Multi-cloud is not automatically cheaper or more expensive than single-cloud. The decision must be made based on genuine TCO models that include all cost drivers. See Greenfield FinOps by Design and Architectural Cost Debt.
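To illustrate what "includes all cost drivers" can mean in practice, the sketch below refuses to compute a total when a driver is missing. The driver names are an assumption derived from the in-scope areas of this pillar, and every figure in the usage note is made up for illustration.

```python
from typing import Dict

# ASSUMPTION: driver names derived from the in-scope areas of this pillar.
REQUIRED_DRIVERS = (
    "infrastructure",
    "observability",
    "egress",
    "licenses",
    "operations_fte",
)


def annual_tco(costs: Dict[str, float]) -> float:
    """Sum all cost drivers; raise if a required driver was left out.

    Extra keys (for example "training" in a multi-cloud model) are allowed
    and included in the total.
    """
    missing = [d for d in REQUIRED_DRIVERS if d not in costs]
    if missing:
        raise ValueError(f"TCO model incomplete, missing drivers: {missing}")
    return sum(costs.values())
```

With hypothetical inputs such as `{"infrastructure": 100_000, "observability": 15_000, "egress": 5_000, "licenses": 20_000, "operations_fte": 60_000}`, the function returns the sum of the five values. A model that omits `egress` raises a `ValueError` instead of returning a flattering but incomplete number.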
Cost Driver Overview
The most common cloud cost drivers by category:
| Category | Typical Drivers | Addressing Control |
|---|---|---|
| Compute | Over-provisioning, continuously running dev/test instances, missing auto-scaling configuration | WAF-COST-030 |
| Storage | Infinite retention, missing lifecycle policies, forgotten snapshots, redundant buckets | WAF-COST-040 |
| Network / Egress | Missing VPC endpoints (S3, KMS), public IPs on internal services, no CDN for assets | WAF-COST-090 |
| Observability | DEBUG logging in production, uncontrolled log retention, no sampling for traces | WAF-COST-070 |
| Database | Over-sized RDS instances, multi-AZ without SLO requirement, missing reserved instances | WAF-COST-030, WAF-COST-080 |
| Architectural Debt | HA/multi-region without SLO basis, lock-in services without exit plan, historical reservations | WAF-COST-050, WAF-COST-100 |
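The category-to-control mapping from the table can be kept as data so that a review can look up which controls apply to the categories flagged in a bill. A minimal sketch; the control IDs are taken from the table, while the dictionary and function names are hypothetical.

```python
from typing import Iterable, List

# Control IDs as listed in the cost driver overview table.
CONTROLS_BY_CATEGORY = {
    "Compute": ["WAF-COST-030"],
    "Storage": ["WAF-COST-040"],
    "Network / Egress": ["WAF-COST-090"],
    "Observability": ["WAF-COST-070"],
    "Database": ["WAF-COST-030", "WAF-COST-080"],
    "Architectural Debt": ["WAF-COST-050", "WAF-COST-100"],
}


def controls_for(categories: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated controls for the given categories.

    Raises KeyError for a category that is not in the table.
    """
    found = set()
    for category in categories:
        found.update(CONTROLS_BY_CATEGORY[category])
    return sorted(found)
```

For example, `controls_for(["Compute", "Database"])` yields WAF-COST-030 once, followed by WAF-COST-080, because both categories share the first control.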