WAF-PERF-020 – Auto-Scaling Configured & Tested

Pillar: Performance Efficiency | Severity: High | Category: Auto-Scaling | Automatable: High

Description

All stateless production workloads with variable or unpredictable traffic MUST have auto-scaling configured. Auto-scaling policies MUST be based on meaningful metrics (request latency, request rate, queue depth). Auto-scaling configurations MUST be tested under realistic load before going into production.

No deployment to production without a validated scaling path.

Rationale

Static capacity forces a trade-off: over-provision for peaks (expensive) or under-provision and degrade under unexpected load (risky). Auto-scaling resolves this trade-off, but only if it is correctly configured. Wrong thresholds, missing cooldowns, or missing instance warmup configuration cause scaling to fail at the critical moment.

Any auto-scaling configuration that has never been tested under load is effectively nonexistent.

Threat Context

Risk	Description
Capacity Bottleneck During Load Spike	Non-scaling service degrades or fails during traffic peaks.
Scaling Oscillation	Incorrectly configured cooldowns lead to constant scale-out/scale-in (cost + instability).
Scaling Too Late	Threshold too high or scaling metric wrong → scaling triggers only after SLO is already violated.
Cold-Start Latency	Instance warmup missing → new instances are immediately hit with full traffic before they are ready.

Requirement

All stateless production workloads MUST have auto-scaling configured (min >= 1, max >= 2)
Scaling metrics MUST be based on application behavior, not just CPU
Auto-scaling MUST be validated through load testing (evidence: test report)
Scale-in cooldown MUST be >= scale-out cooldown (conservative scale-in)
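
The numeric requirements above can be checked before a plan is ever applied. The following is a minimal sketch using Terraform variable validation. The variable name `scaling` and its field names are hypothetical and not part of this control, and a Terraform version that supports `validation` blocks is assumed.

```hcl
# Hypothetical input variable; the field names are illustrative only.
variable "scaling" {
  description = "Auto-scaling limits and cooldowns for a stateless workload."
  type = object({
    min_size               = number
    max_size               = number
    scale_out_cooldown_sec = number
    scale_in_cooldown_sec  = number
  })

  # Requirement: min >= 1, max >= 2
  validation {
    condition     = var.scaling.min_size >= 1 && var.scaling.max_size >= 2
    error_message = "WAF-PERF-020 requires min_size >= 1 and max_size >= 2."
  }

  # Requirement: scale-in cooldown >= scale-out cooldown
  validation {
    condition     = var.scaling.scale_in_cooldown_sec >= var.scaling.scale_out_cooldown_sec
    error_message = "WAF-PERF-020 requires the scale-in cooldown to be at least the scale-out cooldown."
  }
}
```

With values such as min_size = 2, max_size = 10, scale_out_cooldown_sec = 60, and scale_in_cooldown_sec = 300 both validations pass; swapping the two cooldown values makes `terraform plan` fail with the second message.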

Implementation Guidance

Choose scaling metric: ALB request count (HTTP APIs), queue depth (workers), custom metrics (special workloads)
Derive thresholds from load test: find the requests/s value at which P95 latency reaches half of the SLO (SLO/2) and use it as the scaling threshold
Configure min/max: min >= 2 for redundancy; max = 3–5x the capacity needed for expected normal load
Configure cooldowns: Scale-out 60s, scale-in 300s (conservative)
Configure instance warmup: 60–120s, so new instances are not immediately overloaded
Run load test: Gradual load profile to 2x peak; validate auto-scaling trigger
Configure monitoring: Alert when desired capacity >= 80% max capacity
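
The warmup and monitoring steps above might look as follows on AWS. This is a sketch under assumptions, not a complete configuration: the resource names, the `asg_max_size` local, the 120-second values, and the alarm evaluation settings are illustrative, and the launch template, subnets, and alarm actions are omitted.

```hcl
locals {
  asg_max_size = 10 # assumed upper bound; derive the real value from load tests
}

resource "aws_autoscaling_group" "api" {
  name              = "api" # illustrative name
  min_size          = 2
  max_size          = local.asg_max_size
  health_check_type = "ELB"

  # Seconds to wait after launch before health checks can mark an instance unhealthy.
  health_check_grace_period = 120

  # Seconds before a new instance's metrics count toward the group's
  # aggregated scaling metrics (60–120s per the guidance above).
  default_instance_warmup = 120

  # Publish the group metric that the alarm below evaluates.
  enabled_metrics = ["GroupDesiredCapacity"]

  # launch_template, vpc_zone_identifier, and target_group_arns omitted for brevity.
}

# Alert when desired capacity reaches 80% of max capacity.
resource "aws_cloudwatch_metric_alarm" "asg_near_max" {
  alarm_name          = "api-asg-desired-capacity-near-max" # illustrative name
  alarm_description   = "Desired capacity is at or above 80% of max_size."
  namespace           = "AWS/AutoScaling"
  metric_name         = "GroupDesiredCapacity"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 3
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = local.asg_max_size * 0.8

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.api.name
  }

  # alarm_actions would point at a notification target; left out here.
}
```

Deriving the alarm threshold from the same local as max_size keeps the 80% alert correct when the upper bound is raised after a later load test.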

Maturity Levels

Level	Name	Criteria
1	Static Capacity	No auto-scaling; manual adjustment during load spikes; typically over-provisioned without review.
2	Configured, Not Validated	ASG/VMSS configured; default CPU threshold; never tested under load.
3	Validated with Load Test	Correct metrics; load test validation; documented limits; health check type configured.
4	Predictive & Event-Driven	Predictive scaling; queue-based scaling; scale-out duration measured within SLO.
5	Autonomous Capacity Management	Fully automated; ML-based policies; SLO breach prediction before occurrence.
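
Queue-based scaling, named under level 4, can be approximated with a target tracking policy on a queue metric. The following is a hedged sketch for an SQS-backed worker group. The group reference `aws_autoscaling_group.worker`, the queue name `jobs`, and the target of 100 visible messages are assumptions for illustration; a real target would come from load testing.

```hcl
resource "aws_autoscaling_policy" "worker_queue_depth" {
  name                   = "worker-queue-depth"              # illustrative name
  autoscaling_group_name = aws_autoscaling_group.worker.name # assumed to be defined elsewhere
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    customized_metric_specification {
      namespace   = "AWS/SQS"
      metric_name = "ApproximateNumberOfMessagesVisible"
      statistic   = "Average"

      metric_dimension {
        name  = "QueueName"
        value = "jobs" # hypothetical queue name
      }
    }

    # Target tracking adds or removes capacity to keep the metric near this value.
    target_value = 100
  }
}
```

Tracking the raw backlog ignores how many workers are already running. A backlog-per-instance metric is a common refinement, but it requires publishing a custom metric.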

Terraform Checks

waf-perf-020.tf.aws.autoscaling-group-policy

Checks: Each aws_autoscaling_group must set min_size >= 1, max_size >= 2, and a health_check_type.

Compliant:

resource "aws_autoscaling_group" "api" {
  min_size          = 2
  max_size          = 10
  desired_capacity  = 2
  health_check_type = "ELB"
  # launch_template, subnets, and target group attachment omitted for brevity
}

resource "aws_autoscaling_policy" "scale" {
  name                   = "api-target-tracking" # illustrative name
  autoscaling_group_name = aws_autoscaling_group.api.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ALBRequestCountPerTarget"
      # This metric type also needs a resource_label identifying the
      # load balancer and target group; omitted here.
    }
    target_value = 1000.0
  }
}

Non-Compliant:

resource "aws_autoscaling_group" "api" {
  min_size = 1
  max_size = 1
  # max_size == min_size == 1: the group can never scale
  # WAF-PERF-020 violation
}

Remediation: Set min_size >= 2 for production redundancy and max_size >= 3 for scaling headroom (both stricter than the minimum the check enforces). Add a scaling policy (TargetTrackingScaling or StepScaling). Set health_check_type to "ELB".

Evidence

Type	Required	Description
IaC	✅ Required	Auto-scaling configuration with min/max and scaling policy.
Process	✅ Required	Load test results demonstrating that scaling triggers while latency is still within the SLO.
Config	Optional	CloudWatch/Azure Monitor/GCP alerts for scaling events configured.
Governance	Optional	Runbook with documented scaling limits and known bottlenecks.

Related Controls

Regulatory Mapping

Framework	Controls
ISO/IEC 25010:2011	8.3.2 – Performance efficiency; 8.3.2.1 – Time behaviour; 8.3.2.2 – Resource utilisation; 8.3.2.3 – Capacity
AWS Well-Architected Framework	Performance Efficiency Pillar – Select the right resource types and sizes
Azure Well-Architected Framework	Performance Efficiency – Choose the right resources
Google Cloud Architecture Framework	Performance optimization – Right-size your instances
TOGAF 10	ADM Phase B – Business architecture; ADM Phase C – Application architecture
DORA	DORA 2024 – Technical practices; DORA 2024 – Performance monitoring
ISO/IEC 29119	4.4.3 – Test design techniques; 4.5.3 – Test execution
ISO/IEC 12207	8.2.2.3 – Design and development of software
ITIL 4	SVS – Service value system; DP – Design principle
BSI C5:2020	OPS-01 – Operational monitoring; OPS-02 – Operational control
CIS Controls v8	CIS 8 – Continuous Vulnerability Management
NIST SP 800-53	RA-1 – Security assessment policy; RA-2 – Security assessment controls
NIST CSF 2.0	DE.CM – Continuous monitoring; DE.AE – Anomaly detection
FedRAMP	RA-2, RA-5 (Moderate/High baseline)
SOC 2 Type II	CC6.1 – Logical access security software; CC7.1 – Infrastructure and software monitoring
TISAX	Information security – Performance monitoring
ANSSI SecNumCloud	Domain – Performance monitoring
BIO	BIO – Performance objectives (Prestatiedoelstellingen)
ENS High	op.exp.2 – Security configuration (Configuración de seguridad)
UK NCSC CAF	B4 – System security; B5 – System performance
CMMC 2.0	RA.L2-3.8.1 – Automated monitoring
IRAP	ISM – Performance monitoring
CCCS PBMM	RA-2 – Security assessment controls; RA-5 – Security assessments
MAS TRM	Ch.5 – Technology risk governance
ISMAP	Performance monitoring and validation
FISC	Technical measures – Performance monitoring
