Best Practice: Retention Strategy
Context
"Storing things barely costs anything": this assumption is the most common trigger of unbounded retention cost debt. Storage is indeed cheap per GB, but the stored volume, and with it the monthly bill, keeps growing for as long as nothing is deleted, while compute costs can be controlled through rightsizing and shutdown.
Without lifecycle policies, S3 buckets, log groups, and snapshots accumulate over years and become an invisible but significant cost burden.
Related Controls
- WAF-COST-040 – Storage & Retention Lifecycle Defined
- WAF-COST-070 – Observability & Logging Cost Tiers
Target State
- No S3 bucket, no log group, no Azure Storage, no GCS bucket without a lifecycle policy
- Log tiering: Hot (7–30d), Warm (30–90d), Cold (90–365d), Archive (>365d)
- No DEBUG-level logging in production without explicit justification
- Retention strategy documented and versioned
Log Retention Tiering
The 4-Tier Model
| Tier | Retention | Type | Costs |
|---|---|---|---|
| Hot | 0–30 days | Operational Logs (Errors, Warnings, App Logs) | Highest costs/GB – minimal volume |
| Warm | 30–90 days | Security Logs, Audit Trails, Access Logs | Medium costs – compliance-relevant data |
| Cold | 90–365 days | Regulatory Logs, Financial Logs, GDPR-relevant Logs | Cheap – rarely queried |
| Archive | > 365 days | Legal Hold, Long-Term Compliance | Very cheap – Glacier/Cool/Coldline |
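The tier boundaries from the table can be kept in one place and applied to several log groups at once. The following is a minimal sketch under stated assumptions, not a definitive implementation: `var.log_groups` and `local.log_tier_retention_days` are hypothetical names, and `var.environment` is assumed to be declared as in the other examples in this document. The Archive tier is left out because the table maps it to object storage classes (Glacier/Cool/Coldline) rather than to a log group.

```hcl
# Hypothetical input: log group name suffix => tier.
variable "log_groups" {
  description = "Map of log group name suffix to retention tier (hot, warm, cold)."
  type        = map(string)
  default = {
    application = "hot"
    security    = "warm"
    regulatory  = "cold"
  }

  validation {
    condition = alltrue([
      for tier in values(var.log_groups) : contains(["hot", "warm", "cold"], tier)
    ])
    error_message = "Each log group tier must be one of: hot, warm, cold."
  }
}

locals {
  # Upper bound of each tier from the 4-tier table, in days.
  log_tier_retention_days = {
    hot  = 30
    warm = 90
    cold = 365
  }
}

resource "aws_cloudwatch_log_group" "tiered" {
  for_each = var.log_groups

  name              = "/app/${var.environment}/${each.key}"
  retention_in_days = local.log_tier_retention_days[each.value]

  tags = {
    log-tier = each.value
  }
}
```

Keeping the mapping in a single `locals` block means a change to a tier boundary is made once and shows up in the plan for every affected log group.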
Configure CloudWatch Logs Retention
# Compliant: retention explicitly set (30 days = hot tier)
resource "aws_cloudwatch_log_group" "application" {
  name              = "/app/${var.environment}/application"
  retention_in_days = 30 # Hot Tier: Operational Logs
  kms_key_id        = aws_kms_key.logging.arn

  tags = merge(module.mandatory_tags.tags, {
    log-tier = "hot"
    log-type = "operational"
  })
}

resource "aws_cloudwatch_log_group" "audit" {
  name              = "/app/${var.environment}/audit"
  retention_in_days = 365 # Cold/Archive: audit logs (regulatory)
  kms_key_id        = aws_kms_key.logging.arn

  tags = merge(module.mandatory_tags.tags, {
    log-tier = "cold"
    log-type = "audit"
  })
}
# Non-Compliant: no retention set (= unlimited)
resource "aws_cloudwatch_log_group" "application" {
  name = "/app/production/application"
  # retention_in_days not set: defaults to 0, so logs never expire
  # Violates WAF-COST-040 and WAF-COST-070
}
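One way to keep the unlimited default from slipping through is to make retention a required input and validate it. This is a sketch only: the variable name `log_retention_days` and the resource name `guarded` are hypothetical, `var.environment` is assumed to be declared as in the examples above, and the allowed values are simply the tier boundaries from this document.

```hcl
# Hypothetical module input; the name is illustrative.
variable "log_retention_days" {
  description = "Retention for the log group in days. Must match a tier boundary."
  type        = number
  # No default on purpose: callers have to choose a retention explicitly.

  validation {
    condition     = contains([30, 90, 365], var.log_retention_days)
    error_message = "log_retention_days must be 30 (hot), 90 (warm), or 365 (cold); 0 (never expire) is not allowed."
  }
}

resource "aws_cloudwatch_log_group" "guarded" {
  name              = "/app/${var.environment}/guarded"
  retention_in_days = var.log_retention_days
}
```

With this in place, `terraform plan` fails with the error message above when a caller passes `0` or any value outside the agreed tiers.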
S3 Lifecycle Policies
Standard Lifecycle for Data Buckets
resource "aws_s3_bucket_lifecycle_configuration" "data_lifecycle" {
  bucket = aws_s3_bucket.application_data.id

  rule {
    id     = "transition-to-ia"
    status = "Enabled"

    filter {
      prefix = "data/"
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA" # After 30 days → Infrequent Access
    }
    transition {
      days          = 90
      storage_class = "GLACIER_IR" # After 90 days → Glacier Instant Retrieval
    }
    transition {
      days          = 365
      storage_class = "DEEP_ARCHIVE" # After 1 year → Glacier Deep Archive
    }
  }

  rule {
    id     = "delete-temp-files"
    status = "Enabled"

    filter {
      prefix = "tmp/"
    }

    expiration {
      days = 7 # Delete temporary files after 7 days
    }
  }

  rule {
    id     = "delete-old-versions"
    status = "Enabled"

    filter {} # Empty filter: rule applies to all objects in the bucket

    # Note: GLACIER_IR has a 90-day minimum storage duration, so expiring
    # these versions at day 30 incurs an early-deletion charge.
    noncurrent_version_transition {
      noncurrent_days = 7
      storage_class   = "GLACIER_IR"
    }
    noncurrent_version_expiration {
      noncurrent_days = 30 # Delete old versions after 30 days
    }
  }

  rule {
    id     = "abort-incomplete-uploads"
    status = "Enabled"

    filter {} # Empty filter: rule applies to all objects in the bucket

    abort_incomplete_multipart_upload {
      days_after_initiation = 3 # Clean up incomplete uploads after 3 days
    }
  }
}
# Non-Compliant: no lifecycle defined
resource "aws_s3_bucket" "data" {
  bucket = "acme-application-data"
  # No lifecycle_configuration – WAF-COST-040 Violation
}
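For buckets that have no tailored lifecycle yet, a baseline rule can at least stop incomplete multipart uploads from accumulating. The sketch below assumes a hypothetical variable `baseline_bucket_ids` holding the names of such buckets; the 7-day value is an illustrative choice, not a recommendation from this document.

```hcl
# Hypothetical input: buckets that do not have their own lifecycle configuration yet.
variable "baseline_bucket_ids" {
  description = "Names of S3 buckets that should receive the baseline lifecycle."
  type        = set(string)
  default     = []
}

resource "aws_s3_bucket_lifecycle_configuration" "baseline" {
  for_each = var.baseline_bucket_ids

  bucket = each.value

  rule {
    id     = "baseline-abort-incomplete-uploads"
    status = "Enabled"

    filter {} # Empty filter: rule applies to all objects in the bucket

    abort_incomplete_multipart_upload {
      days_after_initiation = 7 # Illustrative value
    }
  }
}
```

An S3 bucket supports only one lifecycle configuration, so a bucket should appear either in this baseline set or in a dedicated `aws_s3_bucket_lifecycle_configuration` resource, never in both.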
Lifecycle for Log Buckets
resource "aws_s3_bucket_lifecycle_configuration" "logs_lifecycle" {
  bucket = aws_s3_bucket.application_logs.id

  rule {
    id     = "log-tiering"
    status = "Enabled"

    filter {} # Empty filter: rule applies to all objects in the bucket

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 90
      storage_class = "GLACIER_IR"
    }
    expiration {
      days = 365 # Delete operational logs after 1 year (adjust per compliance requirement)
    }
  }

  rule {
    id     = "abort-incomplete"
    status = "Enabled"

    filter {} # Empty filter: rule applies to all objects in the bucket

    abort_incomplete_multipart_upload {
      days_after_initiation = 1
    }
  }
}
Azure Storage Lifecycle
resource "azurerm_storage_management_policy" "lifecycle" {
  storage_account_id = azurerm_storage_account.main.id

  rule {
    name    = "tiering-rule"
    enabled = true

    filters {
      blob_types   = ["blockBlob"]
      prefix_match = ["data/"] # Prefix must start with the container name
    }

    actions {
      base_blob {
        tier_to_cool_after_days_since_modification_greater_than    = 30
        tier_to_archive_after_days_since_modification_greater_than = 90
        delete_after_days_since_modification_greater_than          = 365
      }
      snapshot {
        delete_after_days_since_creation_greater_than = 30
      }
    }
  }
}
GCP Cloud Storage Lifecycle
resource "google_storage_bucket" "data" {
  name          = "acme-data-${var.environment}"
  location      = var.gcp_region
  force_destroy = false

  lifecycle_rule {
    condition {
      age = 30
    }
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
  }

  lifecycle_rule {
    condition {
      age = 90
    }
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
  }

  lifecycle_rule {
    condition {
      age = 365
    }
    action {
      type          = "SetStorageClass"
      storage_class = "ARCHIVE"
    }
  }

  lifecycle_rule {
    condition {
      age        = 30
      with_state = "ARCHIVED" # Noncurrent object versions (requires versioning), not the ARCHIVE storage class
    }
    action {
      type = "Delete"
    }
  }
}
Snapshot Management
# AWS: Automated EBS snapshot management with AWS Backup
resource "aws_backup_plan" "main" {
  name = "main-backup-plan"

  rule {
    rule_name         = "daily-backup"
    target_vault_name = aws_backup_vault.main.name
    schedule          = "cron(0 2 * * ? *)" # Daily at 02:00 UTC

    lifecycle {
      cold_storage_after = 30 # After 30 days → Cold Storage
      # AWS Backup requires delete_after to be at least 90 days greater than
      # cold_storage_after, so 90 would be rejected here.
      delete_after = 120 # After 120 days → Delete
    }
  }

  rule {
    rule_name         = "weekly-backup"
    target_vault_name = aws_backup_vault.main.name
    schedule          = "cron(0 2 ? * SUN *)" # Sundays at 02:00 UTC

    lifecycle {
      cold_storage_after = 60
      delete_after       = 365
    }
  }
}
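AWS Backup only accepts a lifecycle in which `delete_after` is at least 90 days greater than `cold_storage_after`, because backups moved to cold storage must stay there for a minimum of 90 days. A variable validation can catch a violation at plan time instead of at apply time. The sketch below uses a hypothetical variable `backup_lifecycle`; the default values are illustrative.

```hcl
# Hypothetical input describing one backup rule's lifecycle.
variable "backup_lifecycle" {
  description = "Days until a backup moves to cold storage and days until it is deleted."
  type = object({
    cold_storage_after = number
    delete_after       = number
  })
  default = {
    cold_storage_after = 30
    delete_after       = 120
  }

  validation {
    condition     = var.backup_lifecycle.delete_after >= var.backup_lifecycle.cold_storage_after + 90
    error_message = "delete_after must be at least 90 days greater than cold_storage_after."
  }
}
```

Inside a backup rule, the values would then be referenced as `var.backup_lifecycle.cold_storage_after` and `var.backup_lifecycle.delete_after`.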
Retention Strategy Document
# docs/retention-strategy.yml
version: "1.0"
effective_date: "2025-01-01"
log_retention:
  operational_logs:
    description: "Application logs (INFO, WARN, ERROR)"
    hot_tier_days: 30
    archive_after_days: null # No archive – delete after hot tier
    regulatory_basis: "Operational requirement only"
  security_audit_logs:
    description: "CloudTrail, Security Group Changes, IAM Events"
    hot_tier_days: 90
    cold_tier_days: 365
    archive_after_days: 2555 # 7 years – BSI C5 requirement
    regulatory_basis: "BSI C5, ISO 27001"
  application_access_logs:
    description: "HTTP Access Logs, API Gateway Logs"
    hot_tier_days: 30
    cold_tier_days: 365
    archive_after_days: null
    regulatory_basis: "Internal policy"
storage_retention:
  customer_data:
    description: "Customer data (personal data)"
    deletion_policy: "On account deletion + 30 days"
    regulatory_basis: "GDPR Art. 17"
  backups:
    daily: 7
    weekly: 4
    monthly: 12
    regulatory_basis: "Business continuity requirement"
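The strategy document can also serve as the single source of truth for the Terraform code, so that documented and deployed retention cannot drift apart. The following sketch assumes the file is stored as nested YAML at `docs/retention-strategy.yml` relative to the root module; the resource name `operational` and the output name are hypothetical, and `var.environment` is assumed to be declared as in the other examples.

```hcl
locals {
  # Assumption: the file path is relative to the root module.
  retention = yamldecode(file("${path.root}/docs/retention-strategy.yml"))
}

resource "aws_cloudwatch_log_group" "operational" {
  name = "/app/${var.environment}/operational"

  # Reads hot_tier_days (30 in the document above) from the strategy file.
  retention_in_days = local.retention.log_retention.operational_logs.hot_tier_days
}

# Exposes the document version so a plan shows which strategy revision is applied.
output "retention_strategy_version" {
  value = local.retention.version
}
```

A change to the retention strategy then goes through the same review and versioning as any other infrastructure change.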
Common Anti-Patterns
- retention_in_days = 0 in CloudWatch: means unlimited, not "no log"
- Forgotten buckets without lifecycle: especially in older accounts
- Snapshots without expiry: grow unnoticed to TB scale
- Compliance as a blanket justification: "we must keep everything" – usually not true in that generality