WAF-REL-020 – Health Checks & Readiness Probes Configured

Description

All production services MUST expose health check endpoints and configure readiness/liveness probes. Load balancers MUST use health checks with explicitly configured paths, intervals and thresholds – no cloud provider defaults.

No deployment without working health checks. This is a non-negotiable prerequisite for automatic failover and zero-downtime deployments.

Rationale

Without health checks, faulty instances continue to receive traffic. Without a readiness probe, Kubernetes treats a pod as ready as soon as its containers have started, so traffic reaches pods that are still initializing or that are running but can no longer serve requests. Load balancers without explicit health check configuration fall back to defaults that are often too tolerant (30s interval, no path validation).

Threat Context

| Risk | Description |
| --- | --- |
| Traffic to Faulty Instances | Without a health check, the load balancer keeps routing requests to instances that return errors. |
| Deadlock Undetected | Without a liveness probe, a deadlocked process runs indefinitely and blocks resources. |
| Premature Traffic | Without a readiness probe, an uninitialized pod receives traffic and produces errors. |
| Default Timeout Too Tolerant | Cloud provider defaults are often a 30s interval and 3 failures, which is too long for fast recovery. |
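
The cost of tolerant defaults can be estimated with simple arithmetic. The sketch below uses a simplified model that is an assumption of this example, not a provider specification: the instance fails right after passing a probe, every later probe runs to its full timeout, and the instance is removed after the configured number of consecutive failures. The function name and the 5s timeout applied to the default case are illustrative.

```python
def worst_case_detection_seconds(interval: int, timeout: int, failure_threshold: int) -> int:
    """Upper bound on how long a failed instance keeps receiving traffic.

    Simplified model (an assumption): failure happens immediately after a
    successful probe, each following probe runs to its full timeout, and the
    instance is marked unhealthy after `failure_threshold` consecutive failures.
    """
    if min(interval, timeout, failure_threshold) <= 0:
        raise ValueError("interval, timeout and failure_threshold must be positive")
    return interval * failure_threshold + timeout


# Tolerant default cited above: 30s interval, 3 failures (5s timeout assumed).
tolerant = worst_case_detection_seconds(interval=30, timeout=5, failure_threshold=3)
# Explicit configuration: 15s interval, 5s timeout, 3 failures.
explicit = worst_case_detection_seconds(interval=15, timeout=5, failure_threshold=3)

print(f"tolerant defaults: up to {tolerant}s of traffic to a failed instance")
print(f"explicit config:   up to {explicit}s")
```

Under this model, halving the interval roughly halves the window in which a failed instance still receives requests.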

Requirement

All services MUST:

  • Expose /health/live endpoint (liveness: process is alive)

  • Expose /health/ready endpoint (readiness: traffic-capable, dependencies OK)

  • Kubernetes: configure readinessProbe and livenessProbe with measured initialDelaySeconds

  • Load balancer health checks: explicit path, interval, timeout, healthy/unhealthy threshold

  • No cloud provider defaults for health check configuration
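
A minimal sketch of the two endpoints, using only the Python standard library. It is one possible shape, not a prescribed implementation: `make_handler`, `start_health_server`, `probe` and the `checks` mapping are hypothetical names, and returning 503 when a dependency is down is an assumption chosen so that a `200` matcher treats the instance as unhealthy.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable, Dict

# Hypothetical dependency checks: name -> callable returning True when healthy.
DependencyChecks = Dict[str, Callable[[], bool]]


def make_handler(checks: DependencyChecks):
    class HealthHandler(BaseHTTPRequestHandler):
        def _send(self, status: int, body: dict) -> None:
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def do_GET(self) -> None:
            if self.path == "/health/live":
                # Liveness: the process can answer at all; no dependency checks.
                self._send(200, {"status": "alive"})
            elif self.path == "/health/ready":
                # Readiness: every dependency check must pass.
                results = {}
                for name, check in checks.items():
                    try:
                        results[name] = bool(check())
                    except Exception:
                        results[name] = False
                ready = all(results.values())
                self._send(200 if ready else 503,
                           {"status": "ready" if ready else "not_ready",
                            "dependencies": results})
            else:
                self._send(404, {"error": "not found"})

        def log_message(self, format, *args) -> None:
            pass  # keep probe traffic out of the default stderr log

    return HealthHandler


def start_health_server(checks: DependencyChecks, host: str = "127.0.0.1",
                        port: int = 0) -> ThreadingHTTPServer:
    """Serve the health endpoints on a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer((host, port), make_handler(checks))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(host: str, port: int, path: str, timeout: float = 5.0) -> int:
    """Return the HTTP status a load balancer or kubelet probe would see."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

Keeping the liveness handler free of dependency checks means a database outage takes the instance out of rotation through readiness, without triggering restarts through liveness.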

Implementation Guidance

  1. Implement endpoints: /health/live (process only), /health/ready (check deps)

  2. Set initialDelaySeconds: Measure the service startup time and add a buffer

  3. Configure intervals: interval=15s, timeout=5s, failureThreshold=3

  4. ALB health check: path=/health/ready, interval=15, matcher=200

  5. Liveness ONLY for process liveness: Do not check external dependencies in liveness probe

  6. Test: Simulate health check failure in staging and observe behavior
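
Steps 2, 3 and 5 can be sketched as a small helper that turns measured startup times into probe settings. The helper name `build_probes`, the sample timings and the 1.5x buffer factor are assumptions for illustration; the source only says to add a buffer. The emitted keys follow the Kubernetes probe field names, where the interval is called `periodSeconds`.

```python
import json
import math
from typing import Dict, List


def build_probes(startup_samples_seconds: List[float], port: int,
                 buffer_factor: float = 1.5) -> Dict[str, dict]:
    """Derive readiness and liveness probe settings from measured startup times."""
    if not startup_samples_seconds:
        raise ValueError("at least one measured startup time is required")
    # Slowest observed startup plus a buffer, rounded up to whole seconds.
    initial_delay = math.ceil(max(startup_samples_seconds) * buffer_factor)

    def probe(path: str) -> dict:
        return {
            "httpGet": {"path": path, "port": port},
            "initialDelaySeconds": initial_delay,
            "periodSeconds": 15,    # interval=15s
            "timeoutSeconds": 5,    # timeout=5s
            "failureThreshold": 3,  # failureThreshold=3
        }

    return {
        # Readiness checks dependencies; liveness checks the process only.
        "readinessProbe": probe("/health/ready"),
        "livenessProbe": probe("/health/live"),
    }


if __name__ == "__main__":
    measured = [8.2, 9.1, 7.6]  # hypothetical startup times in seconds
    print(json.dumps(build_probes(measured, port=8080), indent=2))
```

The printed JSON fragment can be merged into a container spec; the point is that initialDelaySeconds comes from a measurement rather than a guess.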

Maturity Levels

| Level | Name | Criteria |
| --- | --- | --- |
| 1 | No Health Checks | No probes; LB uses TCP ping as health check. |
| 2 | Basic LB Health Check | ALB health check on "/" configured; no Kubernetes probes. |
| 3 | ReadinessProbe + LivenessProbe | Both probes with measured delays; LB checks /health/ready; failures generate alerts. |
| 4 | Deep Health Checks | Readiness checks real dependencies; StartupProbe for slow services. |
| 5 | Synthetic Monitoring | External validation of health endpoints; health check latency as SLI. |
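
For level 5, one way to treat health check latency as an SLI is to count the share of external probes that return 200 within a latency budget. This is a sketch under assumptions: `timed_probe`, `latency_sli` and the idea of a single latency threshold are illustrative choices, not part of the control.

```python
import http.client
import time
from typing import List, Tuple

ProbeResult = Tuple[int, float]  # (HTTP status, latency in milliseconds)


def timed_probe(host: str, port: int, path: str = "/health/ready",
                timeout: float = 5.0) -> ProbeResult:
    """Probe a health endpoint from outside and measure how long it took."""
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        status = 0  # refused, timed out or malformed response: a failed probe
    finally:
        conn.close()
    return status, (time.monotonic() - start) * 1000.0


def latency_sli(results: List[ProbeResult], threshold_ms: float) -> float:
    """Fraction of probes that returned 200 within `threshold_ms`."""
    if not results:
        raise ValueError("no probe results")
    good = sum(1 for status, latency_ms in results
               if status == 200 and latency_ms <= threshold_ms)
    return good / len(results)
```

Running `timed_probe` on a schedule from outside the cluster and feeding the results to `latency_sli` gives a single ratio that can be compared against an SLO target.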

Terraform Checks

waf-rel-020.tf.aws.alb-target-group-health-check

Checks: ALB Target Group has explicit health_check block with path, interval and thresholds.

Compliant:

resource "aws_lb_target_group" "api" {
  name     = "payment-api-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # All health check settings are explicit; nothing relies on provider defaults.
  health_check {
    enabled             = true
    path                = "/health/ready" # readiness endpoint, not "/"
    interval            = 15              # seconds between checks
    timeout             = 5               # seconds before a check counts as failed
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }
}

Non-Compliant:

resource "aws_lb_target_group" "api" {
  name     = "payment-api-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id
  # No health_check block:
  # cloud defaults are used.
  # WAF-REL-020 violation.
}

Remediation: Add health_check block with explicit path, interval, timeout, healthy_threshold and unhealthy_threshold.
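
The logic of such a check can be illustrated in a few lines. This is not the WAF++ checker itself; it is a hypothetical sketch in which a plain dict stands in for an already-parsed health_check block (or `None` when the block is absent), and `REQUIRED_FIELDS` simply mirrors the fields named in the remediation.

```python
from typing import List, Optional

# Fields the remediation requires to be set explicitly.
REQUIRED_FIELDS = ("path", "interval", "timeout",
                   "healthy_threshold", "unhealthy_threshold")


def missing_health_check_fields(health_check: Optional[dict]) -> List[str]:
    """Return the required fields that are not set explicitly.

    `health_check` is a hypothetical parsed representation of the block;
    None means the target group has no health_check block at all.
    """
    if health_check is None:
        return list(REQUIRED_FIELDS)
    return [field for field in REQUIRED_FIELDS if health_check.get(field) is None]


compliant = {
    "enabled": True, "path": "/health/ready", "interval": 15, "timeout": 5,
    "healthy_threshold": 2, "unhealthy_threshold": 3, "matcher": "200",
}

print(missing_health_check_fields(compliant))  # no missing fields
print(missing_health_check_fields(None))       # every required field is missing
```

A target group passes only when the returned list is empty, which also catches partially configured blocks that set a path but leave the thresholds to defaults.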

Evidence

| Type | Required | Description |
| --- | --- | --- |
| IaC | ✅ Required | Terraform or Kubernetes manifests with readiness and liveness probe configuration. |
| Config | ✅ Required | Load balancer health check configuration with explicit path and thresholds. |
| Process | Optional | Test results: health check failure simulated and documented in staging. |