Evidence & Audit: Operational Excellence

This page describes the evidence required for an audit of the Operational Excellence pillar. Evidence is categorized by type.

Evidence by Type

IaC Evidence (Infrastructure as Code)

| Description | Required | Related Control | Format |
|---|---|---|---|
| Pipeline definitions (.github/workflows/, .gitlab-ci.yml, buildspec.yml) | Required | WAF-OPS-010 | File link in Git |
| Terraform remote state configuration | Required | WAF-OPS-020 | Terraform code with backend block |
| S3 state bucket with versioning (or Azure/GCP equivalent) | Required | WAF-OPS-020 | Terraform resource |
| CloudWatch Log Groups with retention (or Azure/GCP equivalent) | Required | WAF-OPS-030 | Terraform resource |
| Load balancer configuration for Blue/Green or Canary | Optional | WAF-OPS-080 | Terraform resource |
| AWS Config Recorder / Azure Policy Assignment | Required | WAF-OPS-090 | Terraform resource |

Config Evidence (System Configuration)

| Description | Required | Related Control | Format |
|---|---|---|---|
| Branch protection configuration (min. reviewer, CODEOWNERS) | Required | WAF-OPS-010, WAF-OPS-050 | GitHub/GitLab Settings screenshot or API output |
| Alert definitions with symptom-based metrics | Required | WAF-OPS-040 | Alert rule YAML or Terraform code |
| Alert definitions with runbook URLs | Required | WAF-OPS-060 | Alert rule YAML with runbook_url annotation |
| AWS AppConfig / Feature Flag service configuration | Optional | WAF-OPS-080 | Terraform resource or API export |
| AWS CloudTrail configuration (multi-region, validated) | Required | WAF-OPS-090 | Terraform resource |

Process Evidence (Process Records)

| Description | Required | Related Control | Format |
|---|---|---|---|
| DORA metrics report (Deployment Frequency, Lead Time, MTTR, CFR) | Optional | WAF-OPS-010 | Dashboard screenshot or CSV export |
| Runbook directory with all service runbooks | Required | WAF-OPS-060 | Wiki link or Git directory |
| Runbook Coverage Report (services with runbooks / total) | Required | WAF-OPS-060 | Percentage report or table |
| Postmortem archive (last 3 months) | Required | WAF-OPS-070 | Wiki link or document list |
| Action item tracking from postmortems | Required | WAF-OPS-070 | JIRA filter or GitHub Issues export |
| Quarterly Operational Debt Review minutes | Required | WAF-OPS-100 | Meeting notes or ticket history |
| Alert noise report (pages/week, actionability rate) | Optional | WAF-OPS-040 | PagerDuty/OpsGenie analytics or CSV |
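
The Runbook Coverage Report listed above is a simple ratio (services with runbooks / total services). A minimal sketch of how such a report could be produced is shown here; the service names, the runbook paths, and the `runbook_coverage` helper are all hypothetical and stand in for whatever service catalog and runbook directory a team actually uses.

```python
def runbook_coverage(services, runbooks):
    """Return (covered, missing, percent) for a list of service names.

    `runbooks` maps a service name to the location of its runbook.
    Both inputs are illustrative; a real report would read them from
    a service catalog and the runbook directory.
    """
    covered = [s for s in services if s in runbooks]
    missing = [s for s in services if s not in runbooks]
    percent = 100.0 * len(covered) / len(services) if services else 0.0
    return covered, missing, percent


# Hypothetical inventory used only for illustration.
services = ["checkout", "search", "billing", "auth"]
runbooks = {
    "checkout": "runbooks/checkout.md",
    "auth": "runbooks/auth.md",
}

covered, missing, percent = runbook_coverage(services, runbooks)
print(f"Runbook coverage: {len(covered)}/{len(services)} ({percent:.0f}%)")
print("Services without a runbook:", ", ".join(missing))
```

Listing the uncovered services next to the percentage tends to make the report more useful as audit evidence, because it shows which gaps remain rather than only how many.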

Governance Evidence (Policies and Decision Records)

| Description | Required | Related Control | Format |
|---|---|---|---|
| Change management policy (categories, approval requirements, freezes) | Required | WAF-OPS-050 | Document link (Wiki, Confluence, PDF) |
| Post-Incident Review policy (trigger, timeline, template, publication) | Required | WAF-OPS-070 | Document link |
| Operational Debt Register (version-controlled) | Required | WAF-OPS-100 | Git file (ops-debt-register.yml) |
| SLO definitions for all critical services | Required | WAF-OPS-040 | Document link or YAML file |
| Deployment freeze policy (critical business periods) | Optional | WAF-OPS-050 | Calendar configuration or policy document |

Metrics Evidence (Measurement Records)

| Description | Required | Related Control | Format |
|---|---|---|---|
| Drift detection log (last 90 days with resolution times) | Optional | WAF-OPS-090 | CSV export or ticket history |
| Toil hours report (weekly per engineer) | Optional | WAF-OPS-100 | Table or survey results |
| Repeat incident rate (same incident class recurring) | Optional | WAF-OPS-070 | Incident management system report |
| Sprint capacity allocation for debt reduction | Optional | WAF-OPS-100 | Sprint planning export |
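
The repeat incident rate above is described only as "same incident class recurring", so any formula is an interpretation. One possible reading, sketched below as an assumption, counts every incident whose class has already appeared earlier in the reporting period as a repeat and divides by the total number of incidents. The incident classes and the `repeat_incident_rate` helper are hypothetical.

```python
from collections import Counter


def repeat_incident_rate(incident_classes):
    """Fraction of incidents whose class already occurred in the period.

    Assumed definition: (incidents - distinct classes) / incidents.
    The first incident of each class is not counted as a repeat.
    """
    total = len(incident_classes)
    if total == 0:
        return 0.0
    distinct = len(set(incident_classes))
    return (total - distinct) / total


# Hypothetical export from an incident management system.
incidents = ["db-failover", "cert-expiry", "db-failover", "oom", "db-failover"]

rate = repeat_incident_rate(incidents)
recurring = {cls: n for cls, n in Counter(incidents).items() if n > 1}
print(f"Repeat incident rate: {rate:.0%}")
print("Recurring classes:", recurring)
```

Whatever definition is chosen, recording it alongside the report keeps the metric comparable from one audit period to the next.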

Audit Checklist

A quick checklist for auditors and self-assessing teams:

Automation & IaC

  • Pipeline definitions are in version control and do not use inline secrets

  • Terraform remote state is configured, and no local state file is in use

  • Branch protection prevents direct commits to main/master
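
The Terraform item in this list can be pre-checked with a small script before an audit. The sketch below is a heuristic, not a Terraform parser: it assumes the backend is declared in a `.tf` file as `backend "<type>" {`, treats the `local` backend type as a failure, and ignores the `.terraform` working directory (which holds cached modules and backend metadata). The function name `check_terraform_state` and the demo files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Heuristic: matches lines such as `backend "s3" {` in Terraform code.
BACKEND_RE = re.compile(r'backend\s+"(\w+)"\s*\{')


def _outside_working_dir(path, root):
    """True if `path` is not inside a .terraform working directory."""
    return ".terraform" not in path.relative_to(root).parts


def check_terraform_state(root):
    """Report declared backends and stray local state files under `root`."""
    root = Path(root)
    backends = set()
    for tf_file in root.rglob("*.tf"):
        if _outside_working_dir(tf_file, root):
            backends.update(BACKEND_RE.findall(tf_file.read_text(encoding="utf-8")))
    local_state = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("terraform.tfstate")
        if _outside_working_dir(p, root)
    )
    passed = bool(backends) and "local" not in backends and not local_state
    return {
        "backends": sorted(backends),
        "local_state_files": local_state,
        "passed": passed,
    }


# Demo against a throwaway directory with illustrative content.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "main.tf").write_text(
        'terraform {\n  backend "s3" {\n    bucket = "example-state"\n  }\n}\n',
        encoding="utf-8",
    )
    demo_result = check_terraform_state(repo)
    print(demo_result)
```

Because the check is regex-based, a backend declaration inside a comment would also match; an auditor would still want to confirm the result against the actual backend block.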

Observability & Alerting

  • Log groups have retention policies (at least 30 days)

  • Alerts reference symptom-based metrics (error rate, latency)

  • All paging alerts carry a runbook URL in their description or runbook_url annotation
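
Two of these checks lend themselves to automation once the alert rules and log group settings have been exported. The sketch below assumes the rules are already parsed into Prometheus-style dictionaries, that paging alerts are marked with a `severity: page` label (a common convention, but an assumption here), and that log group retention is available as a mapping from name to days, with `None` meaning "never expires". All names and sample values are hypothetical.

```python
def paging_alerts_missing_runbook(rules, paging_severities=("page",)):
    """Return names of paging alerts without a usable runbook_url annotation."""
    missing = []
    for rule in rules:
        severity = rule.get("labels", {}).get("severity")
        if severity not in paging_severities:
            continue  # Non-paging alerts are out of scope for this check.
        url = rule.get("annotations", {}).get("runbook_url", "")
        if not url.strip().startswith(("http://", "https://")):
            missing.append(rule.get("alert", "<unnamed>"))
    return missing


def log_groups_below_retention(retention_by_group, min_days=30):
    """Return log groups with no retention policy or fewer than min_days."""
    return sorted(
        name
        for name, days in retention_by_group.items()
        if days is None or days < min_days
    )


# Hypothetical exports used only for illustration.
rules = [
    {
        "alert": "HighErrorRate",
        "labels": {"severity": "page"},
        "annotations": {"runbook_url": "https://wiki.example.com/runbooks/high-error-rate"},
    },
    {
        "alert": "HighLatency",
        "labels": {"severity": "page"},
        "annotations": {"summary": "p99 latency above SLO"},
    },
    {"alert": "DiskFilling", "labels": {"severity": "ticket"}, "annotations": {}},
]
log_groups = {"/app/checkout": 90, "/app/search": 7, "/app/auth": None}

print("Paging alerts without runbook:", paging_alerts_missing_runbook(rules))
print("Log groups failing retention:", log_groups_below_retention(log_groups))
```

The 30-day threshold comes from the checklist; the paging label is passed as a parameter so the check can be adapted to whatever labelling scheme the alerting system uses.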

Processes

  • Postmortem archive has at least 3 entries from the last 6 months

  • Action items from postmortems have owners and due dates

  • Operational Debt Register is current (last change < 90 days)
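
The three process checks above reduce to date arithmetic and field presence, so they can be evaluated from simple exports. The sketch below is one way to do that under stated assumptions: "6 months" is approximated as 183 days, action items are dictionaries with `id`, `owner`, and `due` keys, and the register's last change date is supplied by the caller (for example from Git history). The function name and all sample data are hypothetical.

```python
from datetime import date, timedelta


def check_process_evidence(postmortem_dates, action_items, register_last_change, today):
    """Evaluate the three process checklist items and return the findings."""
    window_start = today - timedelta(days=183)  # Assumed length of "6 months".
    recent = [d for d in postmortem_dates if window_start <= d <= today]
    incomplete = [
        item["id"]
        for item in action_items
        if not item.get("owner") or not item.get("due")
    ]
    register_age_days = (today - register_last_change).days
    return {
        "recent_postmortems": len(recent),
        "postmortems_ok": len(recent) >= 3,
        "incomplete_action_items": incomplete,
        "action_items_ok": not incomplete,
        "register_age_days": register_age_days,
        "register_ok": register_age_days < 90,
    }


# Hypothetical evidence used only for illustration.
findings = check_process_evidence(
    postmortem_dates=[
        date(2024, 11, 1),
        date(2025, 2, 10),
        date(2025, 4, 2),
        date(2025, 6, 15),
    ],
    action_items=[
        {"id": "AI-1", "owner": "alice", "due": date(2025, 7, 15)},
        {"id": "AI-2", "owner": "bob", "due": None},
    ],
    register_last_change=date(2025, 5, 1),
    today=date(2025, 6, 30),
)
for key, value in findings.items():
    print(f"{key}: {value}")
```

Passing `today` explicitly keeps the result reproducible, which matters when the output is stored as audit evidence for a specific review date.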

Change Management

  • Change management policy is documented

  • Production deployments have approval gates

  • CloudTrail / Azure Activity Log is active for all regions