WAF++

Performance Principles

The 7 performance principles of WAF++ are guidelines for technical decisions. They describe the why behind the controls and enable informed judgments in situations that no control directly addresses.

PP1 – Measure First

Tagline: Without measurement, optimization is guesswork.

Explanation

Performance optimization without measurement is one of the most common forms of resource waste in software development. Developers optimize code paths that were never measured and miss the actual bottlenecks. The first principle of WAF++ is: measure before optimizing.

Measuring means:

  • Collecting baselines: Document P50, P95, P99 latency, throughput, and error rate before every optimization

  • Identifying bottlenecks: Which layer is the actual constraint? CPU? Network? Database? Cache?

  • Defining success: What result counts as an improvement? Without a target, no progress can be measured

  • Continuous measurement: Collect metrics permanently, not only during incidents

Concrete implications

  • Before every sizing decision: collect 2 weeks of utilization data

  • Before every caching implementation: measure cache miss rate and query frequency

  • Before every index optimization: run EXPLAIN ANALYZE on the actual queries

  • Before every auto-scaling tuning: measure the load profile and derive scaling thresholds

Related controls:

  • WAF-PERF-050 – Performance Monitoring & SLO Definition

  • WAF-PERF-040 – Database Performance Baseline & Index Strategy
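The baseline metrics named above (P50, P95, P99) can be computed directly from raw per-request latency samples. A minimal sketch in Python, using only the standard library (`latency_baseline` is an illustrative helper name, not part of WAF++):

```python
import statistics

def latency_baseline(samples_ms: list[float]) -> dict[str, float]:
    """Compute a P50/P95/P99 latency baseline from raw per-request samples."""
    # quantiles(n=100) returns the 99 cut points P1..P99 (0-indexed list)
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

# Example: 1000 synthetic samples, mostly fast with a slow tail
samples = [10.0] * 950 + [200.0] * 50
baseline = latency_baseline(samples)
```

In practice these samples would come from the monitoring system over the two-week window mentioned above, not from a single test run.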


PP2 – Right Technology for the Job

Tagline: The best technology for a use case is not the most powerful – it’s the most suitable.

Explanation

Every technology decision has performance implications. A relational database for a key-value store is just as wrong as a NoSQL document store for complex transactions. The principle: choose the technology that is sufficiently powerful and optimally suited for the specific use case – not the theoretically fastest one.

Concrete implications

  • For variable load: evaluate serverless before classic compute (WAF-PERF-080)

  • For caching: Redis for structured data, CDN for content, in-process for read-heavy lookups

  • For storage: gp3 for general compute, io2 for high-load databases (WAF-PERF-090)

  • For global distribution: evaluate CDN before multi-region deployment (WAF-PERF-070)

Related controls:

  • WAF-PERF-010 – Compute Instance Type & Sizing Validated

  • WAF-PERF-080 – Serverless & Managed Services for Variable Load


PP3 – Scale Horizontally

Tagline: Many identical instances beat a single larger one.

Explanation

Vertical scaling (larger instance) has limits: eventually there is no larger instance. It also creates single points of failure, has downtime during resizing, and provides no redundancy. Horizontal scaling (more instances behind a load balancer) is elastic, has no hard upper limit, and provides natural redundancy.

Concrete implications

  • Stateless architecture is a prerequisite for horizontal scaling

  • Session state MUST be externalized (Redis, DynamoDB) – not kept in-process

  • Database scaling: read replicas for read scaling, sharding for write scaling

  • Stateful services (databases, message queues): use provider-level auto-scaling


PP4 – Cache Strategically

Tagline: Don’t cache everything – but cache everything that repeats.

Explanation

Caching is one of the most effective performance optimizations when applied correctly. Applied incorrectly, it creates hard-to-diagnose stale data problems. The principle: cache consciously and with documentation – with clear rules for candidates, TTLs, and invalidation.

Caching candidates:

  • Always cache: Static assets, immutable configuration, external API responses where staleness is tolerable

  • Cache with short TTL: Aggregated metrics, product catalogs, price lists

  • Do not cache: User-specific transaction data, real-time information, security decisions

Concrete implications

  • Caching strategy MUST be documented with data type, layer, TTL, and invalidation logic

  • Cache hit rate MUST be measured; target: >= 80% application cache, >= 95% CDN

  • Cache invalidation on data mutations is mandatory – no "we just won’t cache it"

  • Thundering herd protection for cache expiry events on high-frequency keys
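The rules above (explicit TTL, mandatory invalidation on mutation, measured hit rate) can be illustrated with a minimal TTL cache. This is a sketch, not a production cache; `TTLCache` and its method names are illustrative:

```python
import time

class TTLCache:
    """Minimal TTL cache: every entry carries a TTL, mutations
    invalidate explicitly, and the hit rate is measurable."""
    def __init__(self) -> None:
        self._store: dict[str, tuple[float, object]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None or entry[0] < time.monotonic():
            self.misses += 1
            return None
        self.hits += 1
        return entry[1]

    def put(self, key: str, value: object, ttl_s: float) -> None:
        self._store[key] = (time.monotonic() + ttl_s, value)

    def invalidate(self, key: str) -> None:
        # Called on every data mutation -- serving stale data is not an option
        self._store.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A real deployment would add thundering-herd protection (e.g. per-key locking or request coalescing) around the miss path for high-frequency keys.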


PP5 – Test Under Load

Tagline: An untested scaling path is not a scaling path.

Explanation

Every architecture behaves differently under load than at idle. Connection pools saturate, auto-scaling triggers are delayed, database query plans change with data volume. The principle: no production-critical system goes to production without load test validation.

Load tests reveal:

  • Auto-scaling configuration errors (thresholds too high, cooldowns too long)

  • Connection pool exhaustion under concurrent load

  • Database performance degradation from query plan changes under load

  • Memory leaks and garbage collection pauses under extended load

Concrete implications

  • Load tests are a deployment gate: production deployment without a passed load test is a violation

  • Acceptance criteria MUST be defined before the load test: P95 < X ms, error rate < Y%

  • Baseline for regression comparison MUST be updated after every successful test

  • Stress test (2x, 5x expected load) at least quarterly for critical services
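The deployment-gate rule can be expressed as a simple pass/fail check against acceptance criteria fixed before the test ran. A sketch (function and parameter names are illustrative):

```python
import statistics

def load_test_gate(latencies_ms: list[float], errors: int, requests: int,
                   p95_limit_ms: float, max_error_rate: float) -> bool:
    """Deployment gate: pass only if P95 latency and error rate both
    meet the acceptance criteria defined *before* the load test."""
    p95 = statistics.quantiles(latencies_ms, n=100)[94]  # P95 cut point
    error_rate = errors / requests
    return p95 <= p95_limit_ms and error_rate <= max_error_rate
```

In a CI pipeline this check would run against the load-test report and fail the deployment job when it returns false; the same numbers feed the regression baseline mentioned above.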


PP6 – Automate Scaling

Tagline: Manual capacity management is a 24/7 operational problem.

Explanation

Manual scaling requires human intervention at any time of day or night, is error-prone, slow, and creates response time gaps during unexpected traffic spikes. Automated scaling with validated configurations is not only more efficient – it is more reliable than any person in an on-call rotation.

Concrete implications

  • All stateless production workloads MUST have auto-scaling configured

  • Scaling metrics MUST be based on application behavior (latency, request rate, queue depth)

  • Scaling thresholds MUST be derived from load test data – not intuition

  • Scale-in MUST be more conservative than scale-out (cooldown: scale-out shorter than scale-in)
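The asymmetric cooldown rule (scale out fast, scale in conservatively) can be sketched as a small decision function. The thresholds and cooldowns here are placeholders, not recommendations; real values MUST come from load test data:

```python
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    """Placeholder numbers for illustration -- derive real values
    from load test data, never from intuition."""
    scale_out_above: float = 0.70    # e.g. utilization of the scaling metric
    scale_in_below: float = 0.30
    scale_out_cooldown_s: int = 60   # react quickly to rising load
    scale_in_cooldown_s: int = 300   # shed capacity conservatively

def scaling_decision(policy: ScalingPolicy, utilization: float,
                     seconds_since_last_action: int) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' for one evaluation cycle."""
    if (utilization > policy.scale_out_above
            and seconds_since_last_action >= policy.scale_out_cooldown_s):
        return "scale_out"
    if (utilization < policy.scale_in_below
            and seconds_since_last_action >= policy.scale_in_cooldown_s):
        return "scale_in"
    return "hold"
```

Note how identical elapsed time after an action permits scale-out but still blocks scale-in: the longer scale-in cooldown prevents flapping when load briefly dips.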


PP7 – Performance as Architecture Concern (Document Performance Debt)

Tagline: Known performance limitations are debt – and debt grows with interest.

Explanation

Performance debt arises when architectural decisions or resource constraints lead to known performance limitations that are consciously accepted. This debt is rational – but only if it is documented and regularly reviewed.

Undocumented performance debt:

  • Is forgotten and accumulates unnoticed

  • Creates recurring incidents because root causes are not documented

  • Prevents informed decisions about performance investments

  • Is lost when employees leave the organization

Concrete implications

  • All known performance limitations MUST be documented in the register

  • Quarterly review MUST take place: which debts were resolved, which were prioritized?

  • Every performance incident MUST create a register entry

  • New features MUST be evaluated for performance debt creation (ADR performance section)

Related controls:

  • WAF-PERF-100 – Performance Debt Register & Quarterly Review

  • WAF-PERF-050 – Performance Monitoring & SLO Definition
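A performance debt register entry and the quarterly-review query can be sketched in a few lines. The field names here are illustrative, not mandated by WAF++:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DebtEntry:
    """One entry in a performance debt register (fields are illustrative)."""
    title: str
    impact: str        # e.g. "P95 +150ms on /search under load"
    accepted_on: date  # when the limitation was consciously accepted
    review_by: date    # next quarterly review at the latest
    resolved: bool = False

def due_for_review(register: list[DebtEntry], today: date) -> list[DebtEntry]:
    """Return open entries whose review date has passed -- the input
    for the quarterly review meeting."""
    return [e for e in register if not e.resolved and e.review_by <= today]
```

Wiring `due_for_review` into a scheduled job that opens tickets keeps the debt visible even when the engineers who accepted it have left the organization.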