
Best Practice: Building and Securing a CI/CD Pipeline

Context

A CI/CD pipeline is the foundation of every Operational Excellence initiative. Without automated deployments, every other Operational Excellence measure takes more effort, is more error-prone, and is harder to scale.

This best practice describes how to build a production-ready pipeline, from the pipeline definition through approval gates and branch protection to artifact versioning.

Target State

A production-ready CI/CD pipeline:

  • Is fully defined as code (YAML, HCL) and stored in version control

  • Runs on every pull request (tests, linting, security scans)

  • Deploys automatically to staging after merge to main

  • Requires manual approval for production deployments

  • Uses versioned, immutable artifacts

  • Has a deployment freeze mechanism for critical periods

Technical Implementation

Step 1: Define the Pipeline Structure

# .github/workflows/ci-cd.yml (GitHub Actions)
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  AWS_REGION: eu-central-1
  ECR_REPOSITORY: payment-service

jobs:
  # ===== STAGE 1: Lint & Test =====
  test:
    name: Lint, Test & Security Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Lint
        run: make lint

      - name: Unit Tests
        run: make test-unit

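      - name: Security Scan (Trivy)
        # Pin to a release tag or commit SHA instead of a moving branch
        uses: aquasecurity/trivy-action@0.28.0
        with:
          scan-type: 'fs'
          scanners: 'vuln,secret'  # replaces the deprecated security-checks input
          exit-code: '1'
          severity: 'HIGH,CRITICAL'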

  # ===== STAGE 2: Build & Publish =====
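  build:
    name: Build & Push Container Image
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main'
    permissions:
      id-token: write  # needed to assume the AWS role via OIDC (assumes AWS_ROLE_ARN trusts GitHub's OIDC provider)
      contents: read   # needed by actions/checkout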
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to ECR
        id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2

      - name: Docker Meta (tag with Git SHA)
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ steps.ecr-login.outputs.registry }}/${{ env.ECR_REPOSITORY }}
          tags: |
            type=sha,prefix=sha-,format=short

      - name: Build and Push
        id: build
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}

  # ===== STAGE 3: Deploy to Staging =====
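  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: build
    environment: staging
    permissions:
      id-token: write  # needed to assume the AWS role via OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4  # the smoke test below needs the Makefile

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy to ECS Staging
        # --force-new-deployment restarts tasks from the service's current task
        # definition. It rolls out the image built above only if that task
        # definition has been updated to reference the new tag.
        run: |
          aws ecs update-service \
            --cluster payment-staging \
            --service payment-service \
            --force-new-deployment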

      - name: Wait for Stable Deployment
        run: |
          aws ecs wait services-stable \
            --cluster payment-staging \
            --services payment-service

      - name: Smoke Test
        run: make smoke-test ENVIRONMENT=staging

  # ===== STAGE 4: Deploy to Production (Manual Gate) =====
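  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: [build, deploy-staging]  # build must be a direct dependency to read its outputs
    environment: production  # environment protection rules = approval gate
    permissions:
      id-token: write  # needed to assume the AWS role via OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy to ECS Production (Canary)
        # aws deploy create-deployment has no --image/--container-* flags, so this
        # step uses aws ecs deploy instead. Assumptions: task-definition.json and
        # appspec.yaml live in the repository, appspec.yaml names the container
        # (payment-service) and port (8080), the cluster is called
        # payment-production, and the deployment group uses a canary configuration.
        run: |
          jq --arg image "${{ needs.build.outputs.image-tag }}" \
            '.containerDefinitions[0].image = $image' \
            task-definition.json > task-definition.rendered.json
          aws ecs deploy \
            --cluster payment-production \
            --service payment-service \
            --task-definition task-definition.rendered.json \
            --codedeploy-appspec appspec.yaml \
            --codedeploy-application payment-service \
            --codedeploy-deployment-group production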
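
The build job tags every image with the short Git SHA, so one tag always points at one commit. As a rough illustration of that naming scheme, the Python sketch below builds the tag the way type=sha,prefix=sha-,format=short does by default (a seven-character abbreviation) and also shows the digest-pinned form of a reference. The function names and the registry host in the usage note are hypothetical and exist only for this example.

```python
import re

SHA_RE = re.compile(r"^[0-9a-f]{40}$")
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def sha_tag(commit_sha: str, prefix: str = "sha-", length: int = 7) -> str:
    """Return a tag such as 'sha-0123456' for a full 40-character commit SHA."""
    if not SHA_RE.match(commit_sha):
        raise ValueError(f"not a full commit SHA: {commit_sha!r}")
    return f"{prefix}{commit_sha[:length]}"


def image_by_tag(registry: str, repository: str, commit_sha: str) -> str:
    """Build a reference of the form registry/repository:sha-<short sha>."""
    return f"{registry}/{repository}:{sha_tag(commit_sha)}"


def image_by_digest(registry: str, repository: str, digest: str) -> str:
    """Build a digest-pinned reference, which cannot be re-pointed at all."""
    if not DIGEST_RE.match(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{registry}/{repository}@{digest}"
```

For example, image_by_tag("registry.example.com", "payment-service", "0123456789abcdef0123456789abcdef01234567") yields registry.example.com/payment-service:sha-0123456. A tag can still be overwritten unless the registry enforces tag immutability, which is why the digest form is the stricter choice.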

Step 2: Configure Branch Protection

// GitHub Branch Protection (via Terraform)
resource "github_branch_protection" "main" {
  repository_id = github_repository.app.node_id
  pattern       = "main"

  required_status_checks {
    strict   = true
    contexts = ["Lint, Test & Security Scan"]
  }

  required_pull_request_reviews {
    required_approving_review_count = 1
    require_code_owner_reviews      = true
    dismiss_stale_reviews           = true
  }

  enforce_admins      = true
  allows_force_pushes = false
  allows_deletions    = false
}
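
To make the effect of these settings concrete, the following Python sketch models the merge decision they imply. It is only an approximation of what GitHub enforces, not its actual logic; the PullRequest class and merge_blockers function are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass

# Values mirror the Terraform resource above.
REQUIRED_CHECKS = {"Lint, Test & Security Scan"}
REQUIRED_APPROVALS = 1


@dataclass
class PullRequest:
    passed_checks: set[str]
    approvals: int              # approvals still valid after the latest push
    code_owner_approved: bool
    up_to_date_with_base: bool  # relevant because strict = true


def merge_blockers(pr: PullRequest) -> list[str]:
    """Return the reasons a pull request may not be merged (empty if none)."""
    blockers = []
    missing = sorted(REQUIRED_CHECKS - pr.passed_checks)
    if missing:
        blockers.append(f"required checks not green: {', '.join(missing)}")
    if not pr.up_to_date_with_base:
        blockers.append("branch is behind main (strict status checks)")
    if pr.approvals < REQUIRED_APPROVALS:
        blockers.append("not enough approving reviews")
    if not pr.code_owner_approved:
        blockers.append("no approval from a code owner")
    return blockers
```

Because dismiss_stale_reviews is enabled, a new push resets earlier approvals; in this model that corresponds to approvals dropping back to 0.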

Step 3: Define CODEOWNERS

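# CODEOWNERS: each line is <path pattern> <owners>
# The last matching pattern wins, so specific paths go below general ones.
# Teams must be written as @org/team-name; the short handles below are placeholders.
# When code owner review is required, approval from one listed owner is enough.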

# Terraform Infrastructure
/infrastructure/  @platform-team @security-team

# CI/CD Pipelines
/.github/workflows/ @platform-team

# Application Code
/src/ @payment-team

# Security-sensitive configuration
/infrastructure/security/ @security-team
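
Rule order matters in this file because the last matching pattern takes precedence. The Python sketch below illustrates that resolution for the four rules above. It is deliberately simplified to anchored directory prefixes, whereas real CODEOWNERS files also support glob patterns; owners_for is a hypothetical helper, not part of any GitHub tooling.

```python
# Rules in file order: (directory pattern, owners)
RULES = [
    ("/infrastructure/", ["@platform-team", "@security-team"]),
    ("/.github/workflows/", ["@platform-team"]),
    ("/src/", ["@payment-team"]),
    ("/infrastructure/security/", ["@security-team"]),
]


def owners_for(path: str, rules=RULES) -> list[str]:
    """Return the owners of the last rule whose directory prefix matches path."""
    normalized = "/" + path.lstrip("/")
    owners: list[str] = []
    for pattern, rule_owners in rules:
        if normalized.startswith(pattern):
            owners = rule_owners  # a later match overrides an earlier one
    return owners
```

Under these assumptions, infrastructure/vpc/main.tf resolves to both teams, while infrastructure/security/iam.tf resolves to @security-team only, because that rule appears later in the file.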

Step 4: Configure a Deployment Freeze

# GitHub Actions: Deployment Freeze via Environment Protection
# In GitHub Settings > Environments > production:
# - Required reviewers: @tech-leads
# - Deployment branches: main only
# - Wait timer: 0 minutes (reviewer requirement only)

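# Alternative: programmatic freeze check in the pipeline
# (GNU date syntax; -u pins the window to UTC regardless of the runner's time zone)
- name: Check Deployment Freeze
  run: |
    FREEZE_START=$(date -u -d "2025-12-20" +%s)
    FREEZE_END=$(date -u -d "2026-01-05" +%s)  # 00:00 UTC on 5 January
    NOW=$(date +%s)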
    if [ "$NOW" -ge "$FREEZE_START" ] && [ "$NOW" -le "$FREEZE_END" ]; then
      echo "DEPLOYMENT FREEZE ACTIVE until 2026-01-05"
      echo "Emergency deployments require CTO approval: ops-emergency@company.com"
      exit 1
    fi
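
The same window logic is easier to unit-test outside the shell. The Python sketch below mirrors the check above, assuming the freeze window is meant in UTC and that both boundaries are inclusive, as in the -ge/-le comparison. freeze_active is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Same dates as the shell step; midnight UTC at both ends.
FREEZE_START = datetime(2025, 12, 20, tzinfo=timezone.utc)
FREEZE_END = datetime(2026, 1, 5, tzinfo=timezone.utc)


def freeze_active(
    now: datetime,
    start: datetime = FREEZE_START,
    end: datetime = FREEZE_END,
) -> bool:
    """Return True if 'now' falls inside the freeze window (inclusive)."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return start <= now <= end
```

A pipeline step could call freeze_active(datetime.now(timezone.utc)) and exit with a non-zero status when it returns True. Note that the window closes at 00:00 UTC on 5 January, so deployments later that day are allowed again.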

Common Anti-Patterns

| Anti-Pattern | Problem |
|---|---|
| latest tag in production | Not reproducible; a new build can change which image runs without an explicit deployment |
| Secrets in pipeline code | Values hardcoded in the workflow (for example in an env: block) are visible to anyone with repository access and are not masked in logs |
| No branch protection for main | Engineers push directly; no review; no tests |
| Pipeline without security scan | Vulnerabilities are only discovered in production |
| Manual deployment steps after the build | "Semi-automated" is not automated; the first manual step negates all automation gains |
| Pipeline YAML not in repository | "Deployment scripts" live on a server; no review, no version history |

Metrics

Measure these metrics to evaluate pipeline maturity:

  • Deployment Frequency: How often is code deployed to production? (target: daily or more often)

  • Lead Time for Changes: How long does a change take from commit to production? (target: < 1 hour for hotfix, < 1 day for feature)

  • Pipeline throughput time: Total pipeline duration (target: < 15 minutes)

  • Pipeline success rate: % of pipelines that are green (target: > 90%)
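
As a rough sketch of how the first two metrics and the success rate could be derived from deployment records, consider the Python example below. The Deployment record and the function names are assumptions made for illustration; in practice the timestamps would come from the CI system or deployment logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    commit_time: datetime  # when the change was committed
    deploy_time: datetime  # when it reached production


def deployment_frequency(deployments: list[Deployment], days: int) -> float:
    """Average number of production deployments per day over the period."""
    if days <= 0:
        raise ValueError("days must be positive")
    return len(deployments) / days


def median_lead_time(deployments: list[Deployment]) -> timedelta:
    """Median time from commit to production."""
    return median([d.deploy_time - d.commit_time for d in deployments])


def success_rate(pipeline_results: list[bool]) -> float:
    """Fraction of pipeline runs that finished green."""
    if not pipeline_results:
        raise ValueError("no pipeline runs recorded")
    return sum(pipeline_results) / len(pipeline_results)
```

The median is used for lead time here so that a single slow change does not dominate the figure; whether median or mean fits better depends on how the team wants to report it.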

Maturity Levels

| Level | Characteristics |
|---|---|
| Level 1 | Deployments via SSH/console. No pipeline definition. No automated tests. |
| Level 2 | CI pipeline exists. Tests run automatically. Deployment scripts exist but are executed manually. |
| Level 3 | Full CI/CD. Branch protection. Approval gate for production. Artifacts versioned. |
| Level 4 | Deployment metrics measured. Canary/Blue-Green deployments. Automatic rollback. |
| Level 5 | DORA Elite: multiple times daily. Change Failure Rate < 5%. Continuous Deployment possible. |