WAF++

Glossary – Performance Efficiency

A

Auto-Scaling

Mechanism that automatically increases or decreases the number of compute resources based on defined metrics (CPU, request rate, queue depth).

Availability Zone (AZ)

Physically isolated data centers within a cloud region. For latency optimization, services that communicate frequently should be deployed in the same AZ.

B

Baseline

Measured performance reference of a system under defined load conditions. Foundation for regression testing and capacity planning.

Bulkhead Pattern

Isolation of resource pools (thread pools, connection pools) for different service categories to prevent cascading failures.
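A minimal Python sketch of the pattern (pool names and sizes are illustrative): each category gets its own dedicated thread pool, so a slow dependency can exhaust only its own workers.

```python
from concurrent.futures import ThreadPoolExecutor

# One dedicated pool per downstream dependency (bulkhead). A hanging
# "reports" call can block at most 2 workers; "checkout" is unaffected.
pools = {
    "checkout": ThreadPoolExecutor(max_workers=8),
    "reports": ThreadPoolExecutor(max_workers=2),
}

def call_dependency(category, fn, *args):
    # Work is confined to the bulkhead of its category.
    return pools[category].submit(fn, *args)

result = call_dependency("checkout", lambda x: x * 2, 21).result()
```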

Burst Balance

AWS-specific concept for gp2 EBS volumes: credits that accumulate at low I/O load and are consumed during load spikes. When exhausted, IOPS drop to baseline.

C

Cache Hit Rate

Percentage of requests that can be served from the cache, without querying the origin source (database, API). Target: >= 80% for application caches.
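As a formula: hit rate = hits / (hits + misses). A quick check with hypothetical counter values:

```python
# Hypothetical counters, e.g. read from cache metrics.
hits, misses = 850, 150

hit_rate = hits / (hits + misses)  # fraction of requests served from cache
```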

Cache Stampede / Thundering Herd

Phenomenon in which many parallel requests simultaneously attempt to regenerate an expired cache entry, causing massive load on the origin source.

Circuit Breaker

Software pattern that temporarily blocks further requests to a slow or failed downstream system to prevent cascading failures.
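A minimal state machine for the pattern, sketched in Python (thresholds are illustrative): after repeated failures the breaker opens and rejects calls immediately; after a timeout a single probe is let through.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None = circuit closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: request rejected")
            self.opened_at = None  # half-open: allow one probe request
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # open the circuit
            raise
        self.failures = 0  # success closes the circuit again
        return result
```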

Cold Start

Initialization delay for serverless functions or containers that have been idle for a long period. The first request after a longer idle phase is significantly slower than subsequent requests.

Connection Pool

A pre-maintained set of database connections reused by multiple threads/requests to avoid the overhead of establishing new connections.
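A queue-backed sketch in Python (the factory is a stand-in for a real driver's connect call):

```python
import queue

class ConnectionPool:
    def __init__(self, factory, size):
        # Connections are created once up front and then reused.
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self):
        return self._pool.get()  # blocks until a connection is free

    def release(self, conn):
        self._pool.put(conn)     # hand the connection back for reuse
```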

D

Distributed Cache

Cache layer outside the application process, typically Redis or Memcached, which can be shared by multiple instances.

E

Error Budget

SRE concept: the tolerable proportion of SLO violations within a defined time window. A service with a 99.9% availability SLO has roughly 8.76 hours/year of error budget (0.1% of 8,760 hours).
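The error-budget arithmetic can be checked in a few lines:

```python
hours_per_year = 365 * 24            # 8,760 hours
slo = 0.999                          # 99.9% availability target

error_budget_hours = hours_per_year * (1 - slo)  # ~8.76 hours per year
```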

EBS gp3

Current generation of AWS General Purpose SSD volumes. Provides 3,000 IOPS and 125 MB/s baseline without burst mechanics, at 20% lower price than gp2.

F

Full Table Scan

Database operation in which all rows of a table must be read because no index exists for the query condition. Leads to high I/O and CPU load.

H

Horizontal Scaling

Increasing capacity by adding more identical instances behind a load balancer. Contrasts with vertical scaling (larger instance).

HPA (Horizontal Pod Autoscaler)

Kubernetes mechanism that automatically adjusts the number of pods in a deployment based on CPU utilization or custom metrics.

I

IOPS (Input/Output Operations Per Second)

Measurement for the speed of storage systems. Relevant for database performance and data-intensive workloads.

Index Strategy

Documented plan of which database columns/fields are indexed to speed up frequent queries without creating unnecessary write overhead.

L

Latency

The time a single request requires from receipt to complete response. Typically measured in percentiles: P50 (median), P95, P99, P99.9.

Load Balancer

Component that distributes incoming requests across multiple backend instances to spread load evenly and avoid single points of failure.

Load Testing

Systematic verification of system behavior under defined, realistic load. Used to validate SLOs and auto-scaling configurations.

P

P50/P95/P99/P99.9 (latency percentiles)

Statistical measures for latency distributions: P95 = 95% of all requests are faster than this value. P99 = 99% of all requests are faster. Tail latency (P99, P99.9) is critical for user experience.
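A nearest-rank percentile over a latency sample illustrates why the tail matters (the values are made up):

```python
import math

def percentile(samples, p):
    # Nearest-rank: smallest value such that at least p% of samples are <= it.
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * p / 100))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds; two slow outliers dominate the tail.
latencies_ms = [12, 15, 14, 13, 200, 16, 14, 13, 15, 500]

p50 = percentile(latencies_ms, 50)  # median looks healthy
p99 = percentile(latencies_ms, 99)  # tail exposes the outliers
```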

Performance Debt

Consciously accepted or unconsciously created performance limitations in architecture and implementation that should be documented, prioritized, and resolved.

Provisioned Concurrency

AWS Lambda feature that pre-initializes function instances and keeps them warm to eliminate cold start latency. Billed even during inactivity.

R

Read Replica

Read-only copy of a database that can take over read requests to offload the primary database server.

Reserved Concurrency

AWS Lambda feature that reserves a fixed portion of the account concurrency limit for a function, guaranteeing that function a minimum capacity while also capping its maximum concurrency.

S

Service Level Agreement (SLA)

Contractually agreed performance guarantee between a service provider and customer. Basis: SLOs + escalation/compensation rules.

Service Level Indicator (SLI)

Measurable quantity that quantifies the actually experienced service quality. Examples: P99 latency, success rate, availability.

Service Level Objective (SLO)

Internal target for an SLI. Example: P99 latency < 500ms, measured over 30 days. SLOs are the foundation for error budget management.

Slow Query Log

Database feature that logs SQL queries that exceed a defined execution time. Fundamental tool for database performance analysis.

SLO Burn Rate

Rate at which the error budget is consumed. A burn rate > 1 means the budget is being consumed faster than allowed.
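An illustrative calculation (the observed error rate is hypothetical):

```python
slo = 0.999
window_hours = 30 * 24                 # 30-day SLO window (720 h)
error_budget = 1 - slo                 # tolerable failure fraction (0.1%)

observed_error_rate = 0.002            # hypothetical measured failure fraction
burn_rate = observed_error_rate / error_budget   # 2.0: burning twice as fast as allowed
hours_to_exhaustion = window_hours / burn_rate   # budget gone after ~360 h instead of 720 h
```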

Stress Testing

Load test run significantly above the expected maximum (typically 2x–5x) to identify capacity limits, failure modes, and system behavior at the limit.

T

Throughput

Number of processed requests or amount of data per unit of time. Typical unit: Requests per Second (RPS/TPS) or MB/s.

TTL (Time-to-Live)

Lifetime of a cache entry. After expiry, the entry is removed from the cache and reloaded on the next request.
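A minimal in-process sketch of TTL-based expiry (a distributed cache such as Redis handles this server-side via EXPIRE):

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: evict; caller reloads from origin
            return None
        return value
```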

V

Vertical Scaling

Increasing capacity by upgrading to a larger instance. Has a hard upper limit; typically requires downtime.

VPC Endpoint

AWS feature that allows cloud service APIs (S3, DynamoDB, SSM, etc.) to be reached via private AWS backbone connections, without passing through the internet.

VPC Peering

Direct network connection between two VPCs that routes traffic over the AWS internal network instead of the internet.

W

Write-Through Cache

Caching strategy in which write operations go synchronously to both the cache and the data source to ensure cache consistency.
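A minimal sketch of the strategy (the dicts stand in for a real cache and database):

```python
store = {}  # stand-in for the database (source of truth)
cache = {}  # stand-in for the cache layer

def write_through(key, value):
    store[key] = value  # synchronous write to the data source ...
    cache[key] = value  # ... and to the cache in the same operation

def read(key):
    if key in cache:
        return cache[key]       # cache hit
    value = store.get(key)
    if value is not None:
        cache[key] = value      # populate cache on miss
    return value
```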