btheo.com > press start to play
CI/CD · 6 MIN READ

My CI/CD Pipeline: GitHub Actions Zero-Downtime

WARNING · DRAGON AHEAD

Your deploy takes 15 minutes. You hold your breath. Tests fail. You revert. Friday night.

A real CI/CD pipeline is tight: lint in parallel, test in parallel, build once, push once, deploy with zero downtime. When it fails, it fails fast. When it succeeds, you’re proud.

Here’s the pipeline I actually use. It has held up at every company I’ve worked for.

The Pipeline Stages

1. Lint & Format (parallel)
2. Test (parallel, matrix strategy)
3. Build Docker Image (once, cached layers)
4. Push to Registry
5. Deploy (pull image → health check → swap → stop old container)
6. Notify on Success/Failure

Total time: 8-12 minutes. A single lint or test failure stops everything, and each failure message points at one concrete issue.
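The same fail-fast ordering is easy to mirror locally before you push. A minimal sketch; the `pnpm` and `docker` invocations in the comment are placeholders for whatever your project actually runs:

```shell
#!/bin/bash
# run_stage: run one pipeline stage, announce it, and fail fast
# with a single clear message, mirroring the CI behavior above.
run_stage() {
  local name=$1; shift
  echo "▶ $name"
  if "$@"; then
    echo "✔ $name passed"
  else
    echo "✗ $name failed, stopping" >&2
    return 1
  fi
}

# Example run with a harmless command:
run_stage "Echo check" echo "hello"

# In a real pre-push hook you would chain your own stages, e.g.:
#   run_stage "Lint" pnpm lint &&
#   run_stage "Test" pnpm test &&
#   run_stage "Build" docker build -t myapp .
```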

The GitHub Actions Workflow

Create .github/workflows/deploy.yml:

name: CI/CD

on:
  push:
    branches:
      - main
      - develop
  pull_request:
    branches:
      - main

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
  NODE_VERSION: "20"

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 9
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: "pnpm"
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Lint
        run: pnpm lint
      - name: Format check
        run: pnpm format:check

  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: ["18", "20"]
    steps:
      - uses: actions/checkout@v4
      - name: Setup pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 9
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: "pnpm"
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Run tests
        run: pnpm test --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/coverage-final.json

  build:
    needs: [lint, test]
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    outputs:
      # "version" is the primary tag (branch, semver, or sha): a single
      # value. "tags" is a multi-line list, unsafe to splice into a
      # shell command on the deploy side.
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name == 'push' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
          DEPLOY_HOST: ${{ secrets.DEPLOY_HOST }}
          IMAGE_TAG: ${{ needs.build.outputs.image-tag }}
        run: |
          mkdir -p ~/.ssh
          echo "$DEPLOY_KEY" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H "$DEPLOY_HOST" >> ~/.ssh/known_hosts
          ssh -i ~/.ssh/deploy_key deploy@"$DEPLOY_HOST" \
            "cd /app && ./deploy.sh $IMAGE_TAG"
      - name: Notify on failure
        if: failure()
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            // This job only runs on push, so there is no PR to comment
            // on; leave a commit comment instead.
            github.rest.repos.createCommitComment({
              commit_sha: context.sha,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: 'Deploy failed. Check logs: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}'
            })

The Dockerfile: Multi-Stage Build

Layer caching is the secret. Install deps once; when only source code changes, Docker reuses the cached install layers.

# Shared base with pnpm installed
FROM node:20-alpine AS base
RUN npm install -g pnpm
WORKDIR /app

# Production dependencies only; cached unless the lockfile changes
FROM base AS dependencies
COPY pnpm-lock.yaml package.json ./
RUN pnpm install --frozen-lockfile --prod

# Full install (including dev deps) for the build
FROM base AS builder
COPY pnpm-lock.yaml package.json ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm build

# Final image: runtime deps + built output only
FROM base AS runtime
COPY --from=dependencies /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json .
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => { if (r.statusCode !== 200) throw new Error(r.statusCode) })"
CMD ["node", "dist/server.js"]

Why this works:

  • ✔ Dependencies layer cached across builds
  • ✔ Build layer with dev deps doesn’t ship
  • ✔ Final image ~200MB (node_modules + dist only)
  • ✔ Health check catches startup failures

Zero-Downtime Deploy Script

On your server, ./deploy.sh:

#!/bin/bash
set -e

IMAGE_TAG=$1
REGISTRY=ghcr.io/yourorg/yourapp
CONTAINER_NAME=myapp
NEW_PORT=3001
HEALTH_URL=http://localhost:$NEW_PORT/health

# Pull new image
docker pull "$REGISTRY:$IMAGE_TAG"
echo "✔ Image pulled"

# Start new container on a different host port
docker run -d \
  --name "${CONTAINER_NAME}_new" \
  -p $NEW_PORT:3000 \
  -e DATABASE_URL="$DATABASE_URL" \
  -e REDIS_URL="$REDIS_URL" \
  "$REGISTRY:$IMAGE_TAG"
echo "✔ New container started on port $NEW_PORT"

# Wait for health check
MAX_ATTEMPTS=30
ATTEMPT=0
while [ $ATTEMPT -lt $MAX_ATTEMPTS ]; do
  if curl -f "$HEALTH_URL" >/dev/null 2>&1; then
    echo "✔ Health check passed"
    break
  fi
  ATTEMPT=$((ATTEMPT + 1))
  sleep 1
  if [ $ATTEMPT -eq $MAX_ATTEMPTS ]; then
    echo "✗ Health check failed. Cleaning up."
    docker stop "${CONTAINER_NAME}_new"
    docker rm "${CONTAINER_NAME}_new"
    exit 1
  fi
done

# Swap: point your load balancer / reverse proxy at the new port,
# then retire the old container
docker stop "$CONTAINER_NAME" || true
docker rm "$CONTAINER_NAME" || true
docker rename "${CONTAINER_NAME}_new" "$CONTAINER_NAME"
echo "✔ Deploy complete"
docker ps --filter "name=$CONTAINER_NAME" --format "{{.Names}} {{.Status}}"

Health check is mandatory. Never flip traffic to a container you haven’t verified.
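That retry loop generalizes into a small helper worth keeping around. A sketch in plain bash; the probe command is whatever makes sense for your service (the curl line in the comment assumes the port and path used above):

```shell
#!/bin/bash
# wait_for MAX DELAY CMD...: retry CMD up to MAX times, DELAY seconds
# apart; returns 0 as soon as CMD succeeds, 1 if it never does.
wait_for() {
  local max=$1 delay=$2; shift 2
  local i=0
  while [ "$i" -lt "$max" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "✗ gave up after $max attempts" >&2
  return 1
}

# Usage in a deploy script:
#   wait_for 30 1 curl -fsS http://localhost:3001/health >/dev/null
```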

Handling Secrets

Environment variables hard-coded in a workflow are not secrets. GitHub Actions secrets are encrypted.

- name: Deploy
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
    REDIS_URL: ${{ secrets.REDIS_URL }}
  run: ./deploy.sh

GitHub encrypts secrets at rest and masks them in logs. Good enough.

For sensitive operations (database migrations), require manual approval:

deploy:
  needs: build
  environment: production
  steps:
    - name: Deploy
      run: ./deploy.sh

Navigate to “Environments” in your repo settings and add required reviewers. The deploy job pauses and waits for approval.

Rollback Strategy

Keep the previous image tag. If something melts:

docker pull $REGISTRY:previous-stable
docker run -d --name myapp -p 3000:3000 $REGISTRY:previous-stable

Tag your images with semantic versioning: v1.2.3, v1.2.2-rc1, etc. Always know which version is live.

- name: Tag image
  run: |
    VERSION=$(git describe --tags --always)
    docker tag $IMAGE:latest $IMAGE:$VERSION
    docker push $IMAGE:$VERSION
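Knowing which of two tags is older also lets a rollback script pick its target automatically. A minimal bash comparison, assuming plain vMAJOR.MINOR.PATCH tags (pre-release suffixes like -rc1 are not handled here):

```shell
#!/bin/bash
# semver_lt A B: exit 0 if version A is strictly older than B.
# Accepts tags with or without a leading "v".
semver_lt() {
  local a=${1#v} b=${2#v}
  local a1 a2 a3 b1 b2 b3
  IFS=. read -r a1 a2 a3 <<EOF
$a
EOF
  IFS=. read -r b1 b2 b3 <<EOF
$b
EOF
  if [ "$a1" -ne "$b1" ]; then [ "$a1" -lt "$b1" ]; return; fi
  if [ "$a2" -ne "$b2" ]; then [ "$a2" -lt "$b2" ]; return; fi
  [ "$a3" -lt "$b3" ]
}

# e.g. the second-newest tag is a candidate for previous-stable:
#   git tag --sort=-v:refname | head -2 | tail -1
```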

Matrix Strategy: Test Multiple Node Versions

Your app runs on Node 18 and 20. Test both:

test:
  strategy:
    matrix:
      node-version: ["18", "20"]

Creates two parallel test jobs. Catches version-specific bugs early.

Parallel Execution

Lint and test run simultaneously. Only build and deploy wait for both to pass. Total time saved: the length of the shorter of the two jobs.

jobs:
  lint:
    runs-on: ubuntu-latest
    steps: [...]
  test:
    runs-on: ubuntu-latest
    steps: [...]
  build:
    needs: [lint, test] # Wait for BOTH
    steps: [...]
  deploy:
    needs: build # Wait for build only
    steps: [...]

Fast Feedback

The pipeline is tight but verbose. Each step logs what it’s doing. When it fails:

✗ Test failed: src/auth.test.ts line 42
Expected "admin" to equal "user"
Run: pnpm test -- --grep "auth" to debug locally

No guessing. No 10-minute rebuild loops. You know the exact issue in seconds.

The Real Metrics

  • ✔ Lint: 2 minutes
  • ✔ Test: 4 minutes (parallel matrix)
  • ✔ Build: 3 minutes (cached layers)
  • ✔ Push: 1 minute
  • ✔ Deploy: 2 minutes (health check)

Total: 10 minutes from push to live.

A single failure stops everything immediately. You fix it, push again, 10 minutes later you’re live. No manual deploys. No hoping it works. No Friday night fear.

That’s production engineering.
