Performance Principles
The 7 performance principles of WAF++ are guidelines for technical decisions. They describe the why behind the controls and enable informed judgments in situations that no control directly addresses.
PP1 – Measure First
Tagline: Without measurement, optimization is guesswork.
Explanation
Performance optimization without measurement is one of the most common forms of resource waste in software development. Developers optimize code paths that were never measured and miss the actual bottlenecks. The first principle of WAF++ is: measure before optimizing.
Measuring means:
- Collecting baselines: Document P50, P95, P99 latency, throughput, and error rate before every optimization
- Identifying bottlenecks: Which layer is the actual constraint? CPU? Network? Database? Cache?
- Defining success: What result counts as an improvement? Without a target, no progress can be measured
- Continuous measurement: Collect metrics permanently, not only during incidents
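The baseline step above can be sketched in a few lines. The field names and the simulated samples are illustrative assumptions, not part of WAF++ itself; only the metrics (P50/P95/P99, throughput, error rate) come from the principle.

```python
# Minimal baseline sketch: derive P50/P95/P99 latency, throughput, and
# error rate from raw request samples collected over one window.
from statistics import quantiles

def baseline(latencies_ms, errors, window_s):
    """Summarize one measurement window into the PP1 baseline metrics."""
    # quantiles(n=100) yields the 1st..99th percentile cut points
    pct = quantiles(sorted(latencies_ms), n=100)
    return {
        "p50_ms": pct[49],
        "p95_ms": pct[94],
        "p99_ms": pct[98],
        "throughput_rps": len(latencies_ms) / window_s,
        "error_rate": errors / len(latencies_ms),
    }

# Example: 1,000 simulated request latencies over a 60 s window
samples = [10 + (i % 100) for i in range(1000)]
print(baseline(samples, errors=7, window_s=60))
```

Recording this dictionary before an optimization gives the "before" side of the comparison that every later claim of improvement must be measured against.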
Concrete implications
- Before every sizing decision: collect 2 weeks of utilization data
- Before every caching implementation: measure cache miss rate and query frequency
- Before every index optimization: run EXPLAIN ANALYZE on the actual queries
- Before every auto-scaling tuning: measure the load profile and derive scaling thresholds
Related Controls
- WAF-PERF-050 – Performance Monitoring & SLO Definition
- WAF-PERF-040 – Database Performance Baseline & Index Strategy
PP2 – Right Technology for the Job
Tagline: The best technology for a use case is not the most powerful – it’s the most suitable.
Explanation
Every technology decision has performance implications. A relational database for a key-value store is just as wrong as a NoSQL document store for complex transactions. The principle: choose the technology that is sufficiently powerful and optimally suited for the specific use case – not the theoretically fastest one.
Concrete implications
- For variable load: evaluate serverless before classic compute (WAF-PERF-080)
- For caching: Redis for structured data, CDN for content, in-process for read-heavy lookups
- For storage: gp3 for general-purpose workloads, io2 for high-load databases (WAF-PERF-090)
- For global distribution: evaluate CDN before multi-region deployment (WAF-PERF-070)
Related Controls
- WAF-PERF-010 – Compute Instance Type & Sizing Validated
- WAF-PERF-080 – Serverless & Managed Services for Variable Load
PP3 – Scale Horizontally
Tagline: More identical instances beat a single larger instance.
Explanation
Vertical scaling (larger instance) has limits: eventually there is no larger instance. It also creates single points of failure, has downtime during resizing, and provides no redundancy. Horizontal scaling (more instances behind a load balancer) is elastic, has no hard upper limit, and provides natural redundancy.
Concrete implications
- Stateless architecture is a prerequisite for horizontal scaling
- Session state MUST be externalized (Redis, DynamoDB) – not kept in-process
- Database scaling: read replicas for read scaling, sharding for write scaling
- Stateful services (databases, message queues): use provider-level auto-scaling
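The externalized-session rule above can be sketched as code. In production the store would be Redis or DynamoDB; the in-memory dict here is a deliberate stand-in so the shape of a stateless handler is visible, and all names are illustrative assumptions.

```python
# Stateless handler: session state lives in an external store, so any
# instance behind the load balancer can serve any request.
class SessionStore:
    """Interface every instance talks to; the dict mimics Redis/DynamoDB."""
    def __init__(self):
        self._data = {}                          # stand-in for the external store
    def get(self, session_id):
        return self._data.get(session_id, {})
    def put(self, session_id, session):
        self._data[session_id] = session

store = SessionStore()   # shared and external – NOT per-process state

def handle_request(session_id, item):
    """Read state externally, mutate, write it back; keep nothing in-process."""
    session = store.get(session_id)
    session.setdefault("cart", []).append(item)
    store.put(session_id, session)
    return session

handle_request("s1", "book")
handle_request("s1", "pen")   # could run on a completely different instance
```

Because the handler holds no state between calls, adding or removing instances never loses a session – which is exactly what makes horizontal scaling safe.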
Related Controls
- WAF-PERF-020 – Auto-Scaling Configured & Tested
- WAF-PERF-080 – Serverless & Managed Services for Variable Load
PP4 – Cache Strategically
Tagline: Don’t cache everything – but cache everything that repeats.
Explanation
Caching is one of the most effective performance optimizations when applied correctly. Applied incorrectly, it creates hard-to-diagnose stale data problems. The principle: cache consciously and with documentation – with clear rules for candidates, TTLs, and invalidation.
Caching candidates:
- Always cache: Static assets, immutable configuration, external API responses with tolerance
- Cache with short TTL: Aggregated metrics, product catalogs, price lists
- Do not cache: User-specific transaction data, real-time information, security decisions
Concrete implications
- Caching strategy MUST be documented with data type, layer, TTL, and invalidation logic
- Cache hit rate MUST be measured; target: >= 80% application cache, >= 95% CDN
- Cache invalidation on data mutations is mandatory – no "we just won’t cache it"
- Thundering herd protection for cache expiry events on high-frequency keys
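A minimal sketch of the last two requirements: a TTL cache where expiry of a hot key lets exactly one caller recompute the value (per-key lock, "single flight") while invalidation on mutation stays explicit. TTLs, key names, and the loader are illustrative assumptions.

```python
# TTL cache with thundering-herd protection: on a miss, only one caller
# runs the loader for a given key; concurrent callers wait and reuse it.
import threading, time

class TTLCache:
    def __init__(self, ttl_s):
        self.ttl_s = ttl_s
        self._entries = {}                       # key -> (value, expires_at)
        self._locks = {}                         # key -> per-key lock
        self._guard = threading.Lock()

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_load(self, key, loader):
        entry = self._entries.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                      # cache hit
        with self._lock_for(key):                # single flight per key
            entry = self._entries.get(key)       # re-check after waiting
            if entry and entry[1] > time.monotonic():
                return entry[0]                  # another caller already loaded
            value = loader()                     # only one caller reaches this
            self._entries[key] = (value, time.monotonic() + self.ttl_s)
            return value

    def invalidate(self, key):                   # mandatory on data mutation
        self._entries.pop(key, None)

calls = []
cache = TTLCache(ttl_s=60)
cache.get_or_load("price:42", lambda: calls.append(1) or "9.99")
cache.get_or_load("price:42", lambda: calls.append(1) or "9.99")
print(len(calls))  # the loader ran exactly once
```

The double-check inside the lock is the essential detail: callers that waited must re-read the entry before loading, otherwise every waiter recomputes the value and the herd hits the backend anyway.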
Related Controls
- WAF-PERF-030 – Caching Strategy Defined & Implemented
- WAF-PERF-070 – Network Latency & Topology Optimization
PP5 – Test Under Load
Tagline: An untested scaling path is not a scaling path.
Explanation
Every architecture behaves differently under load than at idle. Connection pools saturate, auto-scaling triggers are delayed, database query plans change with data volume. The principle: no production-critical system goes to production without load test validation.
Load tests reveal:
- Auto-scaling configuration errors (thresholds too high, cooldowns too long)
- Connection pool exhaustion under concurrent load
- Database performance degradation from query plan changes under load
- Memory leaks and garbage collection pauses under extended load
Concrete implications
- Load tests are a deployment gate: production deployment without a passed load test is a violation
- Acceptance criteria MUST be defined before the load test: P95 < X ms, error rate < Y%
- Baseline for regression comparison MUST be updated after every successful test
- Stress test (2x, 5x expected load) at least quarterly for critical services
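The deployment-gate rule above can be sketched directly: the acceptance criteria are fixed before the run, and the pipeline blocks when any of them fail. The thresholds and the result fields are illustrative assumptions, not normative values.

```python
# Load-test deployment gate: criteria first, then the run, then a
# pass/block decision – never the other way around.
CRITERIA = {"p95_ms": 250.0, "error_rate": 0.01}   # defined BEFORE the test

def gate(result):
    """Return the list of violations; an empty list means the gate passed."""
    violations = []
    for metric, limit in CRITERIA.items():
        if result[metric] >= limit:
            violations.append(f"{metric}={result[metric]} (limit {limit})")
    return violations

# Example run result, in the shape a load tool might report it
run = {"p95_ms": 231.0, "error_rate": 0.004}
failures = gate(run)
if failures:
    raise SystemExit("deployment blocked: " + "; ".join(failures))
print("gate passed; update the regression baseline")
```

Because the criteria live in code next to the gate, they cannot be quietly loosened after a failing run – any change is visible in review.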
Related Controls
- WAF-PERF-060 – Load & Stress Testing in CI/CD Pipeline
- WAF-PERF-020 – Auto-Scaling Configured & Tested
PP6 – Automate Scaling
Tagline: Manual capacity management is a 24/7 operational problem.
Explanation
Manual scaling requires human intervention at any time of day or night, is error-prone, slow, and creates response time gaps during unexpected traffic spikes. Automated scaling with validated configurations is not only more efficient – it is more reliable than any person in an on-call rotation.
Concrete implications
- All stateless production workloads MUST have auto-scaling configured
- Scaling metrics MUST be based on application behavior (latency, request rate, queue depth)
- Scaling thresholds MUST be derived from load test data – not intuition
- Scale-in MUST be more conservative than scale-out (cooldown: scale-out shorter than scale-in)
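The asymmetric scale-out/scale-in rule can be sketched as a single evaluation tick. All threshold and cooldown numbers here are assumptions standing in for values derived from load tests.

```python
# Asymmetric auto-scaling decision: add capacity quickly on a latency
# breach, remove it cautiously after a much longer cooldown.
SCALE_OUT_P95_MS = 250       # breach threshold, from load test data
SCALE_IN_P95_MS = 100        # comfortably idle threshold
SCALE_OUT_COOLDOWN_S = 60    # react fast to rising load
SCALE_IN_COOLDOWN_S = 600    # shrink conservatively

def decide(p95_ms, seconds_since_last_action):
    """Return 'out', 'in', or 'hold' for one evaluation tick."""
    if p95_ms > SCALE_OUT_P95_MS and seconds_since_last_action >= SCALE_OUT_COOLDOWN_S:
        return "out"
    if p95_ms < SCALE_IN_P95_MS and seconds_since_last_action >= SCALE_IN_COOLDOWN_S:
        return "in"
    return "hold"

print(decide(300, 90))   # latency breached, cooldown elapsed -> out
print(decide(80, 90))    # idle, but scale-in cooldown not elapsed -> hold
```

Note the metric: the decision keys on P95 latency, i.e. application behavior, not on CPU utilization – matching the second implication above.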
Related Controls
- WAF-PERF-020 – Auto-Scaling Configured & Tested
- WAF-PERF-080 – Serverless & Managed Services for Variable Load
PP7 – Performance as Architecture Concern (Document Performance Debt)
Tagline: Known performance limitations are debt – and debt grows with interest.
Explanation
Performance debt arises when architectural decisions or resource constraints lead to known performance limitations that are consciously accepted. This debt is rational – but only if it is documented and regularly reviewed.
Undocumented performance debt:
- Is forgotten and accumulates unnoticed
- Creates recurring incidents because root causes are not documented
- Prevents informed decisions about performance investments
- Is lost when employees leave the organization
Concrete implications
- All known performance limitations MUST be documented in the register
- Quarterly review MUST take place: which debts were resolved, which were prioritized?
- Every performance incident MUST create a register entry
- New features MUST be evaluated for performance debt creation (ADR performance section)
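A register entry and the quarterly review can be sketched minimally. The fields and the two example entries are illustrative assumptions; WAF-PERF-100 defines the actual register contents.

```python
# Performance debt register sketch: structured entries plus the
# open/resolved split a quarterly review starts from.
from dataclasses import dataclass
from datetime import date

@dataclass
class DebtEntry:
    id: str
    description: str
    accepted_on: date
    source: str              # e.g. "incident", "ADR", "load test"
    resolved: bool = False

register = [
    DebtEntry("PD-001", "N+1 queries on order list; accepted until sharding",
              date(2024, 3, 1), source="incident"),
    DebtEntry("PD-002", "No CDN for EU users", date(2024, 5, 10),
              source="ADR", resolved=True),
]

def quarterly_review(entries):
    """Split the register into open and resolved debt for the review."""
    open_debt = [e for e in entries if not e.resolved]
    resolved = [e for e in entries if e.resolved]
    return open_debt, resolved

open_debt, resolved = quarterly_review(register)
print([e.id for e in open_debt])   # still open: must be resolved or re-prioritized
```

The `source` field is what makes the "every incident creates an entry" rule auditable: a review can check that each incident in the period maps to a register entry.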
Related Controls
- WAF-PERF-100 – Performance Debt Register & Quarterly Review
- WAF-PERF-050 – Performance Monitoring & SLO Definition