Cost Optimization Principles

The following seven principles form the foundation of the Cost pillar in WAF++. They are formulated in a provider- and technology-agnostic way and apply to both brownfield and greenfield scenarios.

CP1 – Transparency First

Costs that are not visible cannot be optimized.

Every cloud resource must be clearly attributed to a workload, a team and an environment. Cost transparency is not optional – it is the fundamental prerequisite for all other cost controls.

Transparency means concretely:

  • Tagging taxonomy defined and enforced via IaC (no deployment without mandatory tags)

  • Cost allocation groups configured in cloud provider billing tools

  • Chargeback or showback model established for internal cost distribution

  • Cost anomalies are detected automatically – not discovered manually in the monthly report
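
The automatic anomaly detection in the last point can be sketched as follows. This is a minimal illustration, not a prescribed WAF++ method: it flags a day whose cost lies well above the trailing average of the preceding days. The function name, the window of 7 days and the threshold of 3 standard deviations are assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_cost_anomalies(daily_costs, window=7, threshold=3.0):
    """Return the indices of days whose cost is far above the trailing window.

    A day is flagged when its cost exceeds the trailing mean by more than
    `threshold` standard deviations. Both parameters are illustrative.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Use at least 1% of the mean as spread so a flat history
        # (sigma == 0) does not flag every tiny increase.
        limit = mu + threshold * max(sigma, 0.01 * mu)
        if daily_costs[i] > limit:
            anomalies.append(i)
    return anomalies


# Hypothetical daily costs of one workload; day 7 contains a spike.
costs = [100, 102, 98, 101, 99, 100, 103, 180, 101]
print(find_cost_anomalies(costs))
```

In practice the input would come from the provider's cost export, grouped by workload tag, and a flagged day would raise an alert to the cost owner rather than print a list.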

Implication: "We see our total bill" is not transparency. A fully tagged, workload-segmented cost dashboard with alert thresholds is transparency.

Related controls: WAF-COST-010, WAF-COST-020


CP2 – Ownership

Every resource and every workload has a clear cost owner.

Without ownership, cost optimizations are not implemented, because it is never clear who is responsible for acting on them. Cost responsibility must lie where architecture decisions are made: with the engineering team.

Ownership means concretely:

  • The owner tag on every resource refers to a concrete team (not a department)

  • Cost owners are part of FinOps review cycles and receive budget alerts

  • Budget overruns trigger a direct escalation to the team owner

  • Ownership is integrated into onboarding processes for new services
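
A small sketch of how the owner tag could drive escalation. The team names, channel names and the function `escalation_target` are hypothetical; the only element taken from the text is that the owner tag must resolve to a concrete team and that an overrun is escalated to that team.

```python
# Hypothetical mapping from owner tag values to team alert channels.
TEAM_CHANNELS = {
    "team-payments": "#payments-cost-alerts",
    "team-search": "#search-cost-alerts",
}


def escalation_target(resource_tags, budget, actual):
    """Return the channel to notify on a budget overrun, or None if within budget."""
    if actual <= budget:
        return None
    owner = resource_tags.get("owner")
    if owner not in TEAM_CHANNELS:
        # A missing owner or a department-level value is treated as an error.
        raise ValueError(f"owner tag {owner!r} does not map to a concrete team")
    return TEAM_CHANNELS[owner]


print(escalation_target({"owner": "team-payments"}, budget=1000, actual=1200))
```

Raising an error for an unknown owner is a deliberate choice in this sketch: an overrun that cannot be routed to a team is itself a finding.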

Implication: "The FinOps team is responsible for costs" is an anti-pattern. FinOps supports engineering teams – cost responsibility remains with the team.

Related controls: WAF-COST-010, WAF-COST-060


CP3 – Cost-Aware Architecture

Architecture decisions have long-term economic impacts. These must be assessed.

Every decision for an infrastructure component, a managed service or a deployment pattern brings a cost structure with it: fixed costs, variable costs, transfer costs, operational costs and exit costs. These do not arise by chance – they are the result of architectural decisions.

When these economic impacts are not assessed, Architectural Cost Debt accumulates: cost structures that become embedded in the architecture and can later only be changed with significant effort.

Cost-Aware Architecture means concretely:

  • Every ADR with infrastructure impact includes a structured cost impact assessment

  • TCO, lock-in risk, data transfer costs, operational effort and exit costs are explicitly assessed

  • HA and multi-region decisions are derived from defined SLOs – not from hypothetical failure scenarios

  • Open source alternatives are evaluated on equal footing

  • Known cost debts are documented in the cost debt register
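
One way to make the cost impact assessment "structured" is to model the five dimensions named above as required fields and reject an ADR while any of them is empty. The class, field names, units and figures below are assumptions for illustration; WAF++ does not prescribe this schema here.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CostImpactAssessment:
    """Cost dimensions an ADR with infrastructure impact should assess."""
    tco_3y_eur: Optional[float] = None
    lock_in_risk: Optional[str] = None  # e.g. "low", "medium", "high"
    data_transfer_eur_month: Optional[float] = None
    ops_effort_fte_hours_month: Optional[float] = None
    exit_cost_eur: Optional[float] = None

    def missing_dimensions(self):
        """Return the names of all dimensions that have not been assessed yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Hypothetical draft ADR: only two of five dimensions are filled in.
draft = CostImpactAssessment(tco_3y_eur=120_000, lock_in_risk="high")
print(draft.missing_dimensions())
```

A review checklist or CI step could refuse to accept the ADR until `missing_dimensions()` returns an empty list.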

Implication: A decision for a high-priced managed service without an exit plan is not an architecture decision – it is cost debt that someone will have to pay later.

Related controls: WAF-COST-050, WAF-COST-100

Further details: Architectural Cost Debt


CP4 – Continuous Optimization

Cost optimization is not a one-time project. Costs are continuously reviewed and reduced.

Cloud infrastructure changes constantly: new services, growing data volumes, changed usage patterns. What is optimally sized today may be over-provisioned in six months.

Continuous Optimization means concretely:

  • Monthly engineering reviews with concrete optimization actions and owners

  • Quarterly architecture board reviews to assess structural cost drivers

  • Rightsizing tags on compute resources with the date of the last review

  • Cost debt register reviewed and updated quarterly

  • Automated idle detection and rightsizing recommendations as input for reviews
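
The review-date tag lends itself to a simple staleness check that feeds the monthly review. The sketch below assumes the tag is called rightsizing-reviewed, stores an ISO date (YYYY-MM-DD), and that a review older than 90 days counts as stale; the date format and the threshold are assumptions, as are the resource names.

```python
from datetime import date


def stale_reviews(resources, today, max_age_days=90):
    """Return names of resources with a missing or outdated rightsizing review."""
    stale = []
    for name, tags in resources.items():
        reviewed = tags.get("rightsizing-reviewed")
        if reviewed is None:
            stale.append(name)  # never reviewed
            continue
        age_days = (today - date.fromisoformat(reviewed)).days
        if age_days > max_age_days:
            stale.append(name)
    return stale


# Hypothetical inventory keyed by resource name.
inventory = {
    "api": {"rightsizing-reviewed": "2024-01-10"},
    "db": {"rightsizing-reviewed": "2024-05-20"},
    "cache": {},
}
print(stale_reviews(inventory, today=date(2024, 6, 1)))
```

The resulting list can be attached to the review agenda so that every stale resource ends up with an action item and an owner.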

Implication: A one-time rightsizing project before annual planning is not a process. A monthly review cycle with an action item tracker is a process.

Related controls: WAF-COST-030, WAF-COST-060, WAF-COST-100


CP5 – Automation First

Budget controls, alerts and optimization actions are automated – not manual.

Manual cost control is error-prone and does not scale. Budgets set manually in the cloud console UI are not reproducible, not versioned and not auditable.

Automation First means concretely:

  • All budget definitions are IaC-managed (e.g. Terraform aws_budgets_budget for AWS, azurerm_consumption_budget_subscription for Azure, google_billing_budget for GCP)

  • Alerts are automatically routed to owner channels (Slack, email, PagerDuty)

  • CI gates check tagging compliance on every pull request

  • Lifecycle policies for storage and logs are automated – no manual archiving

  • Idle resources are automatically identified and proposed for shutdown
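
The CI gate for tagging compliance can be sketched as a check over the JSON form of a Terraform plan (as produced by `terraform show -json`), which lists planned changes under `resource_changes`. The set of mandatory tag keys and the sample plan are assumptions; a real gate would also need to account for provider-level default tags, which this sketch ignores.

```python
REQUIRED_TAGS = {"workload", "owner", "environment"}  # assumed tagging taxonomy


def tag_violations(plan):
    """Map each resource address to the sorted list of mandatory tags it lacks."""
    violations = {}
    for rc in plan.get("resource_changes", []):
        after = (rc.get("change") or {}).get("after") or {}
        if "tags" not in after:
            continue  # deleted resource, or a type without a tags attribute
        missing = REQUIRED_TAGS - set(after["tags"] or {})
        if missing:
            violations[rc["address"]] = sorted(missing)
    return violations


# Hypothetical excerpt of a plan in Terraform's JSON representation.
sample_plan = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs",
         "change": {"actions": ["create"],
                    "after": {"tags": {"workload": "shop",
                                       "owner": "team-payments",
                                       "environment": "prod"}}}},
        {"address": "aws_instance.worker",
         "change": {"actions": ["create"],
                    "after": {"tags": {"workload": "shop"}}}},
        {"address": "aws_iam_role.ci",
         "change": {"actions": ["delete"], "after": None}},
    ]
}
print(tag_violations(sample_plan))
```

In a pipeline, the plan would be loaded with `json.load` from the exported file, and the job would exit with a non-zero status whenever the returned mapping is not empty.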

Implication: A budget that is only visible in the billing dashboard provides no operational control. A budget as a Terraform resource with alert notification and automatic ticket creation is control.

Related controls: WAF-COST-020, WAF-COST-040, WAF-COST-070


CP6 – Right-Size, Not Over-Size

Resources are sized according to actual demand – not hypothetical peak scenarios.

Over-provisioning is the most common and costly form of cloud waste. A system dimensioned for 10x growth that never materializes incurs the cost of that decision every month.

Right-Size, Not Over-Size means concretely:

  • Sizing decisions are based on measured utilization (P95/P99), not estimates

  • SLO/SLA requirements drive HA and redundancy decisions – not caution

  • Reservations are made based on >= 70% utilization over 30 days – not as a default

  • Spot/preemptible instances for interruption-tolerant, variable workloads; on-demand capacity only for unpredictable peaks

  • Rightsizing reviews are documented and traceable (tag rightsizing-reviewed with date)
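
The two numeric rules above can be illustrated with a short sketch: a nearest-rank percentile over measured utilization samples, and a reservation check against the 70% threshold. Interpreting "utilization" for reservations as the share of hours in which the resource was running is an assumption, as are the function names and the sample data.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile (p in 1..100) of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def reservation_candidate(hourly_running, min_utilization=0.70):
    """True if the resource was running in at least min_utilization of the hours."""
    return sum(hourly_running) / len(hourly_running) >= min_utilization


# Hypothetical CPU utilization samples in percent: mostly idle, two bursts.
cpu = [20] * 18 + [85, 90]
print(percentile(cpu, 95), percentile(cpu, 99))

# Hypothetical 30-day history (720 hours): 1 = running, 0 = stopped.
history = [1] * 540 + [0] * 180
print(reservation_candidate(history))
```

Sizing against P95 rather than the maximum accepts that the rarest peaks are absorbed by autoscaling or brief saturation; whether that is acceptable depends on the workload's SLO.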

Implication: HA across three Availability Zones without an SLO requirement that demands more than one AZ is not a resilience investment – it is Architectural Cost Debt.

Related controls: WAF-COST-030, WAF-COST-080


CP7 – Full Cost View

The total cost of a workload includes infrastructure, licenses, operational effort, skills and exit costs.

Cloud bills show only a fraction of actual costs. Operational effort (in FTE hours), license costs, vendor management, training effort and potential exit costs are systematically invisible in many cost comparisons.

Full Cost View means concretely:

  • TCO calculations include: infrastructure + licenses + FTE effort (ops + engineering) + vendor management + exit costs

  • ROI assessments refer to the value of the workload (revenue, risk reduction, compliance), not just infrastructure costs

  • Multi-cloud scenarios are assessed on actual total costs: data transfer between providers, duplicated operational competence, lost volume discounts from enterprise agreements

  • Open source alternatives are assessed on total costs: license savings vs. operational effort, support costs, missing managed service convenience

  • Lock-in costs are recorded as hidden liabilities: the higher the lock-in, the higher the notional exit costs
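
The TCO formula from the first point can be written down directly. The sketch below compares a managed service with a self-operated open source alternative; every figure, the hourly FTE rate and the class name are invented for illustration and carry no empirical weight.

```python
from dataclasses import dataclass


@dataclass
class WorkloadTco:
    """Annual total cost of ownership of one workload option, in one currency."""
    infrastructure: int
    licenses: int
    fte_hours: int          # ops + engineering effort per year
    fte_hourly_rate: int
    vendor_management: int
    exit_costs: int         # notional exit cost attributed to the period

    def total(self):
        return (self.infrastructure + self.licenses
                + self.fte_hours * self.fte_hourly_rate
                + self.vendor_management + self.exit_costs)


# Hypothetical options for the same workload.
managed = WorkloadTco(60_000, 0, 100, 80, 2_000, 15_000)
self_hosted = WorkloadTco(30_000, 0, 700, 80, 0, 2_000)
print(managed.total(), self_hosted.total())
```

With these invented numbers the self-hosted option has half the infrastructure bill but the higher total, which is the effect the principle describes: the ranking can flip once operational effort and exit costs are included.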

Implication: "AWS is more expensive than on-premises" is often wrong when you include colocation, power, hardware amortization, staff and outage costs. Equally: "Open source is free" ignores operational effort.

Related controls: WAF-COST-050, WAF-COST-060, WAF-COST-100