16. Chapter 16: Deployment and DevOps#

16.1. Learning Objectives#

By the end of this chapter, you will understand:

  • Modern deployment strategies for Web GIS applications

  • Containerization with Docker for geospatial services

  • Orchestration with Kubernetes for scalable deployments

  • CI/CD pipelines for automated testing and deployment

  • Infrastructure as Code for reproducible environments

  • Monitoring and observability for production systems

  • Scaling strategies for high-traffic geospatial applications

16.2. Modern Deployment Architecture#

Web GIS applications require sophisticated deployment strategies that address the unique challenges of geospatial data processing, real-time updates, and global accessibility. Modern deployment architectures leverage containerization, orchestration, and automation to ensure reliable, scalable, and maintainable systems.

16.2.1. Understanding Deployment Complexity#

Geospatial Service Dependencies: Web GIS applications typically require multiple specialized services including PostGIS databases, tile servers, spatial processing engines, and real-time data ingestion pipelines. Each component has specific resource requirements, scaling characteristics, and operational dependencies that must be carefully orchestrated in production environments.

Geographic Distribution Requirements: Unlike traditional web applications, Web GIS systems often need to serve users globally while maintaining low latency for map interactions. This requires strategic placement of services, data replication across regions, and intelligent routing to ensure optimal performance regardless of user location.

Data Pipeline Complexity: Geospatial applications frequently involve complex data processing pipelines that transform raw geographic data into optimized formats for web delivery. These pipelines may include batch processing jobs, real-time streaming processors, and on-demand tile generation services that must be reliably coordinated in production.
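
A minimal sketch of such a pipeline, assuming a Redis-backed BullMQ queue, shows the usual shape: producers enqueue work and a separate worker process renders tiles, so the two sides can scale independently (generateTile is a hypothetical placeholder for the actual rendering step):

// tile-pipeline.ts — illustrative sketch only (assumes BullMQ and a reachable Redis instance)
import { Queue, Worker } from 'bullmq';

const connection = { host: process.env.REDIS_HOST ?? 'redis', port: 6379 };

// Producer side: an API or ingestion service enqueues tile-generation jobs
export const tileQueue = new Queue('tile-generation', { connection });

export async function requestTile(z: number, x: number, y: number): Promise<void> {
  await tileQueue.add(
    'generate',
    { z, x, y },
    { attempts: 3, backoff: { type: 'exponential', delay: 1000 } }
  );
}

// Consumer side: a separate worker process renders tiles and can be scaled on its own
const worker = new Worker(
  'tile-generation',
  async (job) => {
    const { z, x, y } = job.data;
    return generateTile(z, x, y); // placeholder for e.g. a PostGIS ST_AsMVT query
  },
  { connection, concurrency: 4 }
);

worker.on('failed', (job, err) => {
  console.error(`Tile job ${job?.id} failed:`, err.message);
});

// Hypothetical rendering step, stubbed out here
async function generateTile(z: number, x: number, y: number): Promise<Buffer> {
  return Buffer.alloc(0);
}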

Security and Compliance: Location data often contains sensitive information requiring careful handling according to various privacy regulations. Deployment architectures must implement proper data isolation, encryption, access controls, and audit logging while maintaining system performance and usability.
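
As a small illustration of the audit-logging piece, the sketch below records every request that touches location data; the auditLog sink and the user property on the request are assumptions standing in for a real audit store and authentication layer:

// audit.ts — illustrative sketch; auditLog.write() stands in for a tamper-evident audit sink
import { Request, Response, NextFunction } from 'express';

interface AuditEntry {
  userId: string | null;
  method: string;
  path: string;
  bbox?: string;          // spatial extent requested, if any
  status: number;
  timestamp: string;
}

const auditLog = {
  async write(entry: AuditEntry): Promise<void> {
    // In production this would append to a dedicated audit table or SIEM, not stdout
    console.log(JSON.stringify(entry));
  },
};

export function auditLocationAccess(req: Request, res: Response, next: NextFunction): void {
  res.on('finish', () => {
    void auditLog.write({
      userId: (req as any).user?.id ?? null, // populated by an upstream auth middleware (assumed)
      method: req.method,
      path: req.originalUrl,
      bbox: typeof req.query.bbox === 'string' ? req.query.bbox : undefined,
      status: res.statusCode,
      timestamp: new Date().toISOString(),
    });
  });
  next();
}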

16.2.2. Cloud-Native Architecture Patterns#

Microservices for Spatial Functions: Breaking down Web GIS applications into focused microservices enables independent scaling, deployment, and maintenance of different functional areas. Spatial query services, tile generation, user management, and real-time processing can be developed, deployed, and scaled independently based on their specific requirements.
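
As a minimal sketch of one such service, assuming Express, node-postgres, and a features table with a geom column, a spatial query service can own nothing but its query logic and a health endpoint:

// spatial-query-service.ts — minimal standalone microservice sketch (Express + node-postgres)
import express from 'express';
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const app = express();

// Health endpoint used by the container health checks shown later in this chapter
app.get('/health', async (_req, res) => {
  try {
    await pool.query('SELECT 1');
    res.json({ status: 'healthy' });
  } catch {
    res.status(503).json({ status: 'unhealthy' });
  }
});

// Single responsibility: return features intersecting a bounding box
app.get('/features', async (req, res) => {
  const [minX, minY, maxX, maxY] = String(req.query.bbox ?? '').split(',').map(Number);
  const { rows } = await pool.query(
    `SELECT id, ST_AsGeoJSON(geom) AS geometry
       FROM features
      WHERE geom && ST_MakeEnvelope($1, $2, $3, $4, 4326)`,
    [minX, minY, maxX, maxY]
  );
  res.json(rows);
});

app.listen(Number(process.env.PORT ?? 3000));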

Event-Driven Architecture: Geospatial applications benefit significantly from event-driven patterns that enable real-time updates, asynchronous processing, and loose coupling between components. Location updates, data changes, and user interactions can trigger events that flow through the system, enabling responsive and scalable applications.
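
A minimal sketch of the pattern, assuming Redis pub/sub (via ioredis) as the event bus; channel and field names here are illustrative:

// location-events.ts — event-driven sketch using Redis pub/sub (ioredis assumed)
import Redis from 'ioredis';

const publisher = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');
const subscriber = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');

interface LocationUpdate {
  deviceId: string;
  lon: number;
  lat: number;
  timestamp: string;
}

// Producer: publish an event whenever a device reports a new position
export async function publishLocationUpdate(update: LocationUpdate): Promise<void> {
  await publisher.publish('location-updates', JSON.stringify(update));
}

// Consumers (tile invalidation, geofence evaluation, ...) subscribe independently,
// so the producer never needs to know who reacts to the event.
subscriber.subscribe('location-updates', (err) => {
  if (err) console.error('Subscription failed:', err);
});

subscriber.on('message', (_channel, message) => {
  const update: LocationUpdate = JSON.parse(message);
  console.log(`Device ${update.deviceId} moved to ${update.lon}, ${update.lat}`);
});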

API Gateway Patterns: Centralized API gateways provide essential capabilities for Web GIS deployments including authentication, rate limiting, request routing, and protocol translation. They enable unified access to distributed spatial services while providing operational visibility and control.
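
A lightweight gateway along these lines can be sketched with Express plus off-the-shelf middleware; the internal service hostnames below are assumptions that mirror the Compose services defined later in this chapter, and the authentication check is a placeholder:

// gateway.ts — illustrative API gateway sketch (Express, express-rate-limit, http-proxy-middleware)
import express from 'express';
import rateLimit from 'express-rate-limit';
import { createProxyMiddleware } from 'http-proxy-middleware';

const app = express();

// Rate limiting applied before any routing
app.use(rateLimit({ windowMs: 60_000, max: 600 }));

// Placeholder authentication check; a real gateway would validate a JWT or session here
app.use((req, res, next) => {
  if (!req.headers.authorization) {
    res.status(401).json({ error: 'Unauthorized' });
    return;
  }
  next();
});

// Route API and tile traffic to the appropriate internal services
// (depending on the proxy middleware version, pathRewrite may be needed to keep or strip the prefix)
app.use('/api', createProxyMiddleware({ target: 'http://api:3000', changeOrigin: true }));
app.use('/tiles', createProxyMiddleware({ target: 'http://tile-server:8080', changeOrigin: true }));

app.listen(8000);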

16.3. Containerization with Docker#

16.3.1. Docker Configuration for GIS Services#

# Dockerfile for Node.js Web GIS Backend
FROM node:18-alpine AS base

# Install system dependencies for spatial operations
RUN apk add --no-cache \
    gdal-dev \
    geos-dev \
    proj-dev \
    sqlite-dev \
    postgresql-client \
    curl \
    && rm -rf /var/cache/apk/*

WORKDIR /app

# Copy package files for dependency installation
COPY package*.json ./
COPY yarn.lock ./

# Install dependencies
FROM base AS dependencies
RUN yarn install --frozen-lockfile --production=false

# Build stage
FROM dependencies AS build
COPY . .
RUN yarn build
RUN yarn install --frozen-lockfile --production=true

# Production stage
FROM base AS production
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/package.json ./package.json

# Create non-root user for security
RUN addgroup -g 1001 -S nodejs \
    && adduser -S nodejs -u 1001 -G nodejs
USER nodejs

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1

EXPOSE 3000

CMD ["node", "dist/server.js"]

# Dockerfile for PostGIS Database
FROM postgis/postgis:15-3.3-alpine

# Install additional extensions
RUN apk add --no-cache \
    postgresql-contrib \
    && rm -rf /var/cache/apk/*

# Copy initialization scripts
COPY ./docker/postgis/init/ /docker-entrypoint-initdb.d/

# Environment variables
ENV POSTGRES_DB=webgis
ENV POSTGRES_USER=webgis
ENV POSTGRES_PASSWORD=secure_password
ENV POSTGIS_ENABLE_OUTDB_RASTERS=1
ENV POSTGIS_GDAL_ENABLED_DRIVERS="ENABLE_ALL"

# Custom postgresql.conf for spatial workloads
COPY ./docker/postgis/postgresql.conf /etc/postgresql/postgresql.conf

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD pg_isready -U $POSTGRES_USER -d $POSTGRES_DB || exit 1

EXPOSE 5432

# Dockerfile for Redis Cache
FROM redis:7-alpine

# Copy custom Redis configuration
COPY ./docker/redis/redis.conf /etc/redis/redis.conf

# Create directory for Redis data
RUN mkdir -p /data && chown redis:redis /data

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD redis-cli ping || exit 1

EXPOSE 6379

CMD ["redis-server", "/etc/redis/redis.conf"]

# Dockerfile for Frontend Application
FROM node:18-alpine AS build

WORKDIR /app

# Install all dependencies (devDependencies are needed for the build step)
COPY package*.json ./
RUN npm ci

# Copy source code and build
COPY . .
RUN npm run build

# Production stage with nginx
FROM nginx:alpine

# Copy built application
COPY --from=build /app/dist /usr/share/nginx/html

# Copy nginx configuration
COPY ./docker/nginx/nginx.conf /etc/nginx/nginx.conf
COPY ./docker/nginx/default.conf /etc/nginx/conf.d/default.conf

# The nginx:alpine base image already provides an nginx user; ensure it owns the built assets
RUN chown -R nginx:nginx /usr/share/nginx/html

# Health check (the Alpine base image ships wget, not curl)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget -q --spider http://localhost/health || exit 1

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

16.3.2. Docker Compose for Development#

# docker-compose.yml
version: '3.8'

services:
  # Database
  postgis:
    build:
      context: .
      dockerfile: docker/postgis/Dockerfile
    container_name: webgis-postgis
    environment:
      POSTGRES_DB: webgis
      POSTGRES_USER: webgis
      POSTGRES_PASSWORD: secure_password
    volumes:
      - postgis_data:/var/lib/postgresql/data
      - ./docker/postgis/init:/docker-entrypoint-initdb.d
    ports:
      - "5432:5432"
    networks:
      - webgis-network
    restart: unless-stopped

  # Cache
  redis:
    build:
      context: .
      dockerfile: docker/redis/Dockerfile
    container_name: webgis-redis
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    networks:
      - webgis-network
    restart: unless-stopped

  # Backend API
  api:
    build:
      context: .
      dockerfile: docker/api/Dockerfile
    container_name: webgis-api
    environment:
      NODE_ENV: development
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      REDIS_URL: redis://redis:6379
      JWT_SECRET: your-jwt-secret
      PORT: 3000
    volumes:
      - ./src:/app/src
      - ./uploads:/app/uploads
    ports:
      - "3000:3000"
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Frontend
  frontend:
    build:
      context: .
      dockerfile: docker/frontend/Dockerfile
    container_name: webgis-frontend
    environment:
      REACT_APP_API_URL: http://localhost:3000/api
      REACT_APP_MAPBOX_TOKEN: your-mapbox-token
    ports:
      - "80:80"
    depends_on:
      - api
    networks:
      - webgis-network
    restart: unless-stopped

  # Tile Server
  tile-server:
    build:
      context: .
      dockerfile: docker/tiles/Dockerfile
    container_name: webgis-tiles
    environment:
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      CACHE_URL: redis://redis:6379
      TILE_SIZE: 512
      MAX_ZOOM: 18
    volumes:
      - tile_cache:/app/cache
    ports:
      - "8080:8080"
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Background Workers
  worker:
    build:
      context: .
      dockerfile: docker/worker/Dockerfile
    container_name: webgis-worker
    environment:
      NODE_ENV: development
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      REDIS_URL: redis://redis:6379
      WORKER_CONCURRENCY: 4
    volumes:
      - ./data:/app/data
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: webgis-prometheus
    volumes:
      - ./docker/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - webgis-network
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: webgis-grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./docker/grafana/dashboards:/etc/grafana/provisioning/dashboards
      - ./docker/grafana/datasources:/etc/grafana/provisioning/datasources
    ports:
      - "3001:3000"
    depends_on:
      - prometheus
    networks:
      - webgis-network
    restart: unless-stopped

volumes:
  postgis_data:
  redis_data:
  tile_cache:
  prometheus_data:
  grafana_data:

networks:
  webgis-network:
    driver: bridge

16.3.3. Production Docker Configurations#

# docker-compose.prod.yml
version: '3.8'

services:
  # Production PostgreSQL with optimizations
  postgis:
    build:
      context: .
      dockerfile: docker/postgis/Dockerfile.prod
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgis_data:/var/lib/postgresql/data
      - ./docker/postgis/production.conf:/etc/postgresql/postgresql.conf
    deploy:
      replicas: 1
      resources:
        limits:
          memory: 4G
          cpus: '2'
        reservations:
          memory: 2G
          cpus: '1'
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
    networks:
      - webgis-backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

  # Production Redis with clustering
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes --cluster-enabled yes
    volumes:
      - redis_data:/data
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 1G
          cpus: '0.5'
      restart_policy:
        condition: on-failure
    networks:
      - webgis-backend

  # Load-balanced API services
  api:
    build:
      context: .
      dockerfile: docker/api/Dockerfile.prod
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
      REDIS_URL: ${REDIS_URL}
      JWT_SECRET: ${JWT_SECRET}
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 1G
          cpus: '1'
      restart_policy:
        condition: on-failure
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
    networks:
      - webgis-backend
      - webgis-frontend
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Production frontend with CDN
  frontend:
    build:
      context: .
      dockerfile: docker/frontend/Dockerfile.prod
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
      restart_policy:
        condition: on-failure
    networks:
      - webgis-frontend

  # Load balancer
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./docker/nginx/production.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/ssl/certs
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
    networks:
      - webgis-frontend
    depends_on:
      - api
      - frontend

volumes:
  postgis_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=${NFS_SERVER},rw
      device: ":${NFS_PATH}/postgis"
  redis_data:
    driver: local

networks:
  webgis-frontend:
    driver: overlay
    attachable: true
  webgis-backend:
    driver: overlay
    internal: true

16.4. Kubernetes Deployment#

16.4.1. Kubernetes Manifests#

# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: webgis
  labels:
    name: webgis
    environment: production
---
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: webgis-config
  namespace: webgis
data:
  DATABASE_HOST: "postgis-service"
  DATABASE_PORT: "5432"
  DATABASE_NAME: "webgis"
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  TILE_SIZE: "512"
  MAX_ZOOM: "18"
---
# k8s/secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: webgis-secrets
  namespace: webgis
type: Opaque
data:
  DATABASE_PASSWORD: <base64-encoded-password>
  JWT_SECRET: <base64-encoded-jwt-secret>
  REDIS_PASSWORD: <base64-encoded-redis-password>
  MAPBOX_TOKEN: <base64-encoded-mapbox-token>
---
# k8s/postgis-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgis
  namespace: webgis
spec:
  serviceName: postgis-service
  replicas: 1
  selector:
    matchLabels:
      app: postgis
  template:
    metadata:
      labels:
        app: postgis
    spec:
      containers:
      - name: postgis
        image: webgis/postgis:latest
        env:
        - name: POSTGRES_DB
          valueFrom:
            configMapKeyRef:
              name: webgis-config
              key: DATABASE_NAME
        - name: POSTGRES_USER
          value: "webgis"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: webgis-secrets
              key: DATABASE_PASSWORD
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgis-storage
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "4Gi"
            cpu: "2"
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - webgis
            - -d
            - webgis
          initialDelaySeconds: 60
          periodSeconds: 30
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - webgis
            - -d
            - webgis
          initialDelaySeconds: 30
          periodSeconds: 10
  volumeClaimTemplates:
  - metadata:
      name: postgis-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "fast-ssd"
      resources:
        requests:
          storage: 100Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgis-service
  namespace: webgis
spec:
  selector:
    app: postgis
  ports:
  - port: 5432
    targetPort: 5432
  type: ClusterIP
---
# k8s/redis-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: webgis
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command: ["redis-server"]
        args: ["--appendonly", "yes", "--cluster-enabled", "yes"]
        ports:
        - containerPort: 6379
        - containerPort: 16379
        volumeMounts:
        - name: redis-storage
          mountPath: /data
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: redis-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: webgis
spec:
  selector:
    app: redis
  ports:
  - port: 6379
    targetPort: 6379
  type: ClusterIP
---
# k8s/api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webgis-api
  namespace: webgis
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: webgis-api
  template:
    metadata:
      labels:
        app: webgis-api
    spec:
      containers:
      - name: api
        image: webgis/api:latest
        envFrom:
        - configMapRef:
            name: webgis-config
        env:
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: webgis-secrets
              key: DATABASE_PASSWORD
        - name: JWT_SECRET
          valueFrom:
            secretKeyRef:
              name: webgis-secrets
              key: JWT_SECRET
        ports:
        - containerPort: 3000
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
---
apiVersion: v1
kind: Service
metadata:
  name: webgis-api-service
  namespace: webgis
spec:
  selector:
    app: webgis-api
  ports:
  - port: 3000
    targetPort: 3000
  type: ClusterIP
---
# k8s/frontend-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webgis-frontend
  namespace: webgis
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webgis-frontend
  template:
    metadata:
      labels:
        app: webgis-frontend
    spec:
      containers:
      - name: frontend
        image: webgis/frontend:latest
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "250m"
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: webgis-frontend-service
  namespace: webgis
spec:
  selector:
    app: webgis-frontend
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
---
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webgis-ingress
  namespace: webgis
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/limit-connections: "10"
spec:
  tls:
  - hosts:
    - webgis.example.com
    secretName: webgis-tls
  rules:
  - host: webgis.example.com
    http:
      paths:
      - path: /api(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: webgis-api-service
            port:
              number: 3000
      # The catch-all regex keeps $2 equal to the full request path, so the
      # rewrite-target annotation leaves frontend URLs unchanged
      - path: /()(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: webgis-frontend-service
            port:
              number: 80
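
The liveness and readiness probes above assume the API exposes separate /health and /ready endpoints, and the preStop hook assumes the process drains in-flight requests when Kubernetes sends SIGTERM. A minimal sketch of the application side of that contract, assuming Express and node-postgres (endpoint paths chosen to match the manifests):

// probes.ts — sketch of /health, /ready and graceful shutdown matching the probes above
import express from 'express';
import { Pool } from 'pg';

const app = express();
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

let shuttingDown = false;

// Liveness: the process is up and able to answer requests
app.get('/health', (_req, res) => res.json({ status: 'ok' }));

// Readiness: dependencies are reachable and the pod is not draining
app.get('/ready', async (_req, res) => {
  if (shuttingDown) return res.status(503).json({ status: 'draining' });
  try {
    await pool.query('SELECT 1');
    res.json({ status: 'ready' });
  } catch {
    res.status(503).json({ status: 'not ready' });
  }
});

const server = app.listen(Number(process.env.PORT ?? 3000));

// On SIGTERM (sent after the preStop sleep), stop accepting new connections
// and let in-flight requests finish before the process exits.
process.on('SIGTERM', () => {
  shuttingDown = true;
  server.close(() => process.exit(0));
});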

16.4.2. Horizontal Pod Autoscaler#

# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webgis-api-hpa
  namespace: webgis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webgis-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
      - type: Pods
        value: 4
        periodSeconds: 60
      selectPolicy: Max
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webgis-frontend-hpa
  namespace: webgis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webgis-frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70

16.5. CI/CD Pipelines#

16.5.1. GitHub Actions Workflow#

# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgis/postgis:15-3.3
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: webgis_test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      
      redis:
        image: redis:7-alpine
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '18'
        cache: 'npm'

    - name: Install dependencies
      run: npm ci

    - name: Run linting
      run: npm run lint

    - name: Run type checking
      run: npm run type-check

    - name: Run unit tests
      run: npm run test:unit
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
        REDIS_URL: redis://localhost:6379

    - name: Run integration tests
      run: npm run test:integration
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
        REDIS_URL: redis://localhost:6379

    - name: Install Playwright browsers
      run: npx playwright install --with-deps

    - name: Run E2E tests
      run: npm run test:e2e
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
        REDIS_URL: redis://localhost:6379

    - name: Upload test results
      uses: actions/upload-artifact@v3
      if: always()
      with:
        name: test-results
        path: |
          coverage/
          test-results/
          playwright-report/

  security:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Run Trivy vulnerability scanner
      uses: aquasecurity/trivy-action@master
      with:
        scan-type: 'fs'
        format: 'sarif'
        output: 'trivy-results.sarif'

    - name: Upload Trivy scan results
      uses: github/codeql-action/upload-sarif@v2
      with:
        sarif_file: 'trivy-results.sarif'

    - name: Run npm audit
      run: npm audit --audit-level moderate

  build:
    needs: [test, security]
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    strategy:
      matrix:
        component: [api, frontend, tiles, worker]

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Log in to Container Registry
      uses: docker/login-action@v2
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v4
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=sha,prefix={{branch}}-
          type=raw,value=latest,enable={{is_default_branch}}

    - name: Build and push Docker image
      uses: docker/build-push-action@v4
      with:
        context: .
        file: ./docker/${{ matrix.component }}/Dockerfile
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max

  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.webgis.example.com

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Configure kubectl
      uses: azure/k8s-set-context@v3
      with:
        method: kubeconfig
        kubeconfig: ${{ secrets.KUBE_CONFIG_STAGING }}

    - name: Deploy to staging
      run: |
        # Update image tags in manifests
        sed -i "s|webgis/api:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/api:develop|g" k8s/api-deployment.yaml
        sed -i "s|webgis/frontend:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/frontend:develop|g" k8s/frontend-deployment.yaml
        
        # Apply manifests
        kubectl apply -f k8s/ -n webgis-staging
        
        # Wait for rollout
        kubectl rollout status deployment/webgis-api -n webgis-staging --timeout=600s
        kubectl rollout status deployment/webgis-frontend -n webgis-staging --timeout=600s

    - name: Run smoke tests
      run: |
        # Wait for service to be ready
        sleep 30
        
        # Run basic health checks
        curl -f https://staging.webgis.example.com/health || exit 1
        curl -f https://staging.webgis.example.com/api/health || exit 1

  deploy-production:
    needs: [build, deploy-staging]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://webgis.example.com

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Configure kubectl
      uses: azure/k8s-set-context@v3
      with:
        method: kubeconfig
        kubeconfig: ${{ secrets.KUBE_CONFIG_PRODUCTION }}

    - name: Blue-Green Deployment
      run: |
        # Create new deployment with green suffix
        sed -i "s|webgis-api|webgis-api-green|g" k8s/api-deployment.yaml
        sed -i "s|webgis-frontend|webgis-frontend-green|g" k8s/frontend-deployment.yaml
        sed -i "s|webgis/api:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/api:main|g" k8s/api-deployment.yaml
        sed -i "s|webgis/frontend:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/frontend:main|g" k8s/frontend-deployment.yaml
        
        # Deploy green version
        kubectl apply -f k8s/api-deployment.yaml -n webgis
        kubectl apply -f k8s/frontend-deployment.yaml -n webgis
        
        # Wait for green deployment
        kubectl rollout status deployment/webgis-api-green -n webgis --timeout=600s
        kubectl rollout status deployment/webgis-frontend-green -n webgis --timeout=600s

    - name: Health Check Green Deployment
      run: |
        # Test green deployment
        GREEN_API_POD=$(kubectl get pods -l app=webgis-api-green -n webgis -o jsonpath='{.items[0].metadata.name}')
        kubectl port-forward $GREEN_API_POD 8080:3000 -n webgis &
        sleep 10
        
        curl -f http://localhost:8080/health || exit 1
        kill %1

    - name: Switch Traffic to Green
      run: |
        # Update service selectors to point to green
        kubectl patch service webgis-api-service -n webgis -p '{"spec":{"selector":{"app":"webgis-api-green"}}}'
        kubectl patch service webgis-frontend-service -n webgis -p '{"spec":{"selector":{"app":"webgis-frontend-green"}}}'

    - name: Cleanup Blue Deployment
      run: |
        # Remove the old blue deployment once traffic is fully on green
        kubectl delete deployment webgis-api -n webgis --ignore-not-found=true
        kubectl delete deployment webgis-frontend -n webgis --ignore-not-found=true
        
        # Kubernetes resource names are immutable, so the green deployment remains the
        # active release; the next deployment alternates back to the other color.

    - name: Post-deployment Tests
      run: |
        # Run comprehensive post-deployment tests
        sleep 60
        curl -f https://webgis.example.com/health || exit 1
        curl -f https://webgis.example.com/api/health || exit 1
        
        # Run critical user journey tests
        npm run test:smoke -- --environment=production

  notify:
    needs: [deploy-production]
    runs-on: ubuntu-latest
    if: always()
    steps:
    - name: Notify Slack
      uses: 8398a7/action-slack@v3
      with:
        status: ${{ job.status }}
        channel: '#deployments'
        fields: repo,message,commit,author,action,eventName,ref,workflow
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

16.5.2. GitLab CI/CD Pipeline#

# .gitlab-ci.yml
variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"
  POSTGRES_DB: webgis_test
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: postgres
  REDIS_URL: redis://redis:6379
  DATABASE_URL: postgresql://postgres:postgres@postgres:5432/webgis_test

stages:
  - test
  - security
  - build
  - deploy-staging
  - deploy-production

# Test stage
test:unit:
  stage: test
  image: node:18-alpine
  services:
    - name: postgis/postgis:15-3.3
      alias: postgres
      variables:
        POSTGRES_DB: $POSTGRES_DB
        POSTGRES_USER: $POSTGRES_USER
        POSTGRES_PASSWORD: $POSTGRES_PASSWORD
    - name: redis:7-alpine
      alias: redis
  before_script:
    - apk add --no-cache curl
    - npm ci
  script:
    - npm run lint
    - npm run type-check
    - npm run test:unit
    - npm run test:integration
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
      junit: junit.xml
    paths:
      - coverage/
    expire_in: 1 week

test:e2e:
  stage: test
  image: mcr.microsoft.com/playwright:v1.40.0-focal
  services:
    - name: postgis/postgis:15-3.3
      alias: postgres
      variables:
        POSTGRES_DB: $POSTGRES_DB
        POSTGRES_USER: $POSTGRES_USER
        POSTGRES_PASSWORD: $POSTGRES_PASSWORD
    - name: redis:7-alpine
      alias: redis
  before_script:
    - npm ci
    - npx playwright install
  script:
    - npm run test:e2e
  artifacts:
    when: always
    paths:
      - playwright-report/
      - test-results/
    expire_in: 1 week

# Security stage
security:scan:
  stage: security
  image: aquasec/trivy:latest
  script:
    - trivy fs --exit-code 1 --severity HIGH,CRITICAL .
  allow_failure: true

security:dependencies:
  stage: security
  image: node:18-alpine
  script:
    - npm audit --audit-level moderate
  allow_failure: true

# Build stage
.build_template: &build_template
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - docker build -f docker/$COMPONENT/Dockerfile -t $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA
    - |
      if [ "$CI_COMMIT_BRANCH" = "main" ]; then
        docker tag $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE/$COMPONENT:latest
        docker push $CI_REGISTRY_IMAGE/$COMPONENT:latest
      fi

build:api:
  <<: *build_template
  variables:
    COMPONENT: api

build:frontend:
  <<: *build_template
  variables:
    COMPONENT: frontend

build:tiles:
  <<: *build_template
  variables:
    COMPONENT: tiles

build:worker:
  <<: *build_template
  variables:
    COMPONENT: worker

# Staging deployment
deploy:staging:
  stage: deploy-staging
  image: bitnami/kubectl:latest
  environment:
    name: staging
    url: https://staging.webgis.example.com
  before_script:
    - echo $KUBE_CONFIG_STAGING | base64 -d > kubeconfig
    - export KUBECONFIG=kubeconfig
  script:
    - sed -i "s|webgis/api:latest|$CI_REGISTRY_IMAGE/api:$CI_COMMIT_SHA|g" k8s/api-deployment.yaml
    - sed -i "s|webgis/frontend:latest|$CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA|g" k8s/frontend-deployment.yaml
    - kubectl apply -f k8s/ -n webgis-staging
    - kubectl rollout status deployment/webgis-api -n webgis-staging --timeout=600s
    - kubectl rollout status deployment/webgis-frontend -n webgis-staging --timeout=600s
  only:
    - develop

# Production deployment
deploy:production:
  stage: deploy-production
  image: bitnami/kubectl:latest
  environment:
    name: production
    url: https://webgis.example.com
  before_script:
    - echo $KUBE_CONFIG_PRODUCTION | base64 -d > kubeconfig
    - export KUBECONFIG=kubeconfig
  script:
    - sed -i "s|webgis/api:latest|$CI_REGISTRY_IMAGE/api:$CI_COMMIT_SHA|g" k8s/api-deployment.yaml
    - sed -i "s|webgis/frontend:latest|$CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA|g" k8s/frontend-deployment.yaml
    - kubectl apply -f k8s/ -n webgis
    - kubectl rollout status deployment/webgis-api -n webgis --timeout=600s
    - kubectl rollout status deployment/webgis-frontend -n webgis --timeout=600s
  when: manual
  only:
    - main

16.6. Infrastructure as Code#

16.6.1. Terraform Configuration#

# infrastructure/main.tf
terraform {
  required_version = ">= 1.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.23"
    }
  }
  
  backend "s3" {
    bucket = "webgis-terraform-state"
    key    = "infrastructure/terraform.tfstate"
    region = "us-west-2"
  }
}

provider "aws" {
  region = var.aws_region
}

# VPC and Networking
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  
  name = "${var.project_name}-vpc"
  cidr = "10.0.0.0/16"
  
  azs             = ["${var.aws_region}a", "${var.aws_region}b", "${var.aws_region}c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  
  enable_nat_gateway = true
  enable_vpn_gateway = false
  enable_dns_hostnames = true
  enable_dns_support = true
  
  tags = {
    Project = var.project_name
    Environment = var.environment
  }
}

# EKS Cluster
module "eks" {
  source = "terraform-aws-modules/eks/aws"
  
  cluster_name    = "${var.project_name}-cluster"
  cluster_version = "1.28"
  
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
  
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  
  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
    aws-ebs-csi-driver = {
      most_recent = true
    }
  }
  
  eks_managed_node_groups = {
    general = {
      desired_size = 3
      max_size     = 10
      min_size     = 3
      
      instance_types = ["t3.large"]
      capacity_type  = "ON_DEMAND"
      
      k8s_labels = {
        Environment = var.environment
        NodeGroup   = "general"
      }
    }
    
    spatial = {
      desired_size = 2
      max_size     = 8
      min_size     = 2
      
      instance_types = ["m5.xlarge"]
      capacity_type  = "SPOT"
      
      k8s_labels = {
        Environment = var.environment
        NodeGroup   = "spatial"
        Workload    = "spatial-processing"
      }
      
      taints = [
        {
          key    = "spatial-workload"
          value  = "true"
          effect = "NO_SCHEDULE"
        }
      ]
    }
  }
  
  tags = {
    Project = var.project_name
    Environment = var.environment
  }
}

# RDS PostgreSQL with PostGIS
resource "aws_db_subnet_group" "main" {
  name       = "${var.project_name}-db-subnet-group"
  subnet_ids = module.vpc.private_subnets
  
  tags = {
    Name = "${var.project_name} DB subnet group"
  }
}

resource "aws_security_group" "rds" {
  name_prefix = "${var.project_name}-rds"
  vpc_id      = module.vpc.vpc_id
  
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [module.eks.node_security_group_id]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  tags = {
    Name = "${var.project_name}-rds-sg"
  }
}

resource "aws_db_instance" "main" {
  identifier = "${var.project_name}-db"
  
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.r5.xlarge"
  
  allocated_storage     = 100
  max_allocated_storage = 1000
  storage_type         = "gp3"
  storage_encrypted    = true
  
  db_name  = "webgis"
  username = "webgis"
  password = var.db_password
  
  vpc_security_group_ids = [aws_security_group.rds.id]
  db_subnet_group_name   = aws_db_subnet_group.main.name
  
  backup_retention_period = 7
  backup_window          = "03:00-04:00"
  maintenance_window     = "sun:04:00-sun:05:00"
  
  skip_final_snapshot = false
  final_snapshot_identifier = "${var.project_name}-db-final-snapshot"
  
  performance_insights_enabled = true
  monitoring_interval         = 60
  monitoring_role_arn        = aws_iam_role.rds_monitoring.arn
  
  tags = {
    Name = "${var.project_name}-database"
    Environment = var.environment
  }
}

# ElastiCache Redis
resource "aws_elasticache_subnet_group" "main" {
  name       = "${var.project_name}-cache-subnet"
  subnet_ids = module.vpc.private_subnets
}

resource "aws_security_group" "redis" {
  name_prefix = "${var.project_name}-redis"
  vpc_id      = module.vpc.vpc_id
  
  ingress {
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [module.eks.node_security_group_id]
  }
  
  tags = {
    Name = "${var.project_name}-redis-sg"
  }
}

resource "aws_elasticache_replication_group" "main" {
  replication_group_id         = "${var.project_name}-redis"
  description                  = "Redis cluster for ${var.project_name}"
  
  port                = 6379
  parameter_group_name = "default.redis7"
  node_type           = "cache.r5.large"
  
  num_cache_clusters = 3
  
  subnet_group_name  = aws_elasticache_subnet_group.main.name
  security_group_ids = [aws_security_group.redis.id]
  
  at_rest_encryption_enabled = true
  transit_encryption_enabled = true
  
  tags = {
    Name = "${var.project_name}-redis"
    Environment = var.environment
  }
}

# S3 Buckets
resource "aws_s3_bucket" "assets" {
  bucket = "${var.project_name}-assets-${random_id.bucket_suffix.hex}"
  
  tags = {
    Name = "${var.project_name} Assets"
    Environment = var.environment
  }
}

resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "assets" {
  bucket = aws_s3_bucket.assets.id
  
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket" "backups" {
  bucket = "${var.project_name}-backups-${random_id.bucket_suffix.hex}"
  
  tags = {
    Name = "${var.project_name} Backups"
    Environment = var.environment
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "backups" {
  bucket = aws_s3_bucket.backups.id
  
  rule {
    id     = "backup_lifecycle"
    status = "Enabled"
    
    expiration {
      days = 90
    }
    
    noncurrent_version_expiration {
      noncurrent_days = 30
    }
  }
}

# CloudFront Distribution
resource "aws_cloudfront_distribution" "main" {
  origin {
    domain_name = aws_lb.main.dns_name
    origin_id   = "ALB-${var.project_name}"
    
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }
  
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  
  default_cache_behavior {
    allowed_methods        = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "ALB-${var.project_name}"
    compress               = true
    viewer_protocol_policy = "redirect-to-https"
    
    forwarded_values {
      query_string = true
      cookies {
        forward = "none"
      }
    }
    
    min_ttl     = 0
    default_ttl = 3600
    max_ttl     = 86400
  }
  
  # Cache behavior for API endpoints
  ordered_cache_behavior {
    path_pattern           = "/api/*"
    allowed_methods        = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "ALB-${var.project_name}"
    compress               = true
    viewer_protocol_policy = "redirect-to-https"
    
    forwarded_values {
      query_string = true
      headers      = ["Authorization", "CloudFront-Forwarded-Proto"]
      cookies {
        forward = "all"
      }
    }
    
    min_ttl     = 0
    default_ttl = 0
    max_ttl     = 0
  }
  
  # Cache behavior for tiles
  ordered_cache_behavior {
    path_pattern           = "/tiles/*"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "ALB-${var.project_name}"
    compress               = true
    viewer_protocol_policy = "redirect-to-https"
    
    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
    
    min_ttl     = 86400
    default_ttl = 86400
    max_ttl     = 31536000
  }
  
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
  
  viewer_certificate {
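    # aws_acm_certificate.main is assumed to be defined elsewhere (for CloudFront the
    # certificate must be issued in us-east-1), together with its DNS validation records.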
    acm_certificate_arn      = aws_acm_certificate.main.arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }
  
  tags = {
    Name = "${var.project_name}-cloudfront"
    Environment = var.environment
  }
}

# Application Load Balancer
resource "aws_lb" "main" {
  name               = "${var.project_name}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = module.vpc.public_subnets
  
  enable_deletion_protection = false
  
  tags = {
    Name = "${var.project_name}-alb"
    Environment = var.environment
  }
}

resource "aws_security_group" "alb" {
  name_prefix = "${var.project_name}-alb"
  vpc_id      = module.vpc.vpc_id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  tags = {
    Name = "${var.project_name}-alb-sg"
  }
}

# IAM Roles
resource "aws_iam_role" "rds_monitoring" {
  name = "${var.project_name}-rds-monitoring-role"
  
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "monitoring.rds.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "rds_monitoring" {
  role       = aws_iam_role.rds_monitoring.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonRDSEnhancedMonitoringRole"
}

# Random ID for unique resource names
resource "random_id" "bucket_suffix" {
  byte_length = 4
}

# Variables
variable "project_name" {
  description = "Name of the project"
  type        = string
  default     = "webgis"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-west-2"
}

variable "db_password" {
  description = "Database password"
  type        = string
  sensitive   = true
}

# Outputs
output "cluster_endpoint" {
  description = "EKS cluster endpoint"
  value       = module.eks.cluster_endpoint
}

output "cluster_name" {
  description = "EKS cluster name"
  value       = module.eks.cluster_name
}

output "database_endpoint" {
  description = "RDS instance endpoint"
  value       = aws_db_instance.main.endpoint
}

output "redis_endpoint" {
  description = "Redis cluster endpoint"
  value       = aws_elasticache_replication_group.main.primary_endpoint_address
}

output "cloudfront_domain" {
  description = "CloudFront distribution domain"
  value       = aws_cloudfront_distribution.main.domain_name
}

16.7. Monitoring and Observability#

16.7.1. Prometheus and Grafana Setup#

# monitoring/prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    rule_files:
      - "/etc/prometheus/alert_rules.yml"

    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              - alertmanager:9093

    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'webgis-api'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - webgis
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_name]
            action: keep
            regex: webgis-api-service
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            action: keep
            regex: metrics

      - job_name: 'webgis-frontend'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - webgis
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_name]
            action: keep
            regex: webgis-frontend-service

      - job_name: 'postgres-exporter'
        static_configs:
          - targets: ['postgres-exporter:9187']

      - job_name: 'redis-exporter'
        static_configs:
          - targets: ['redis-exporter:9121']

      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - source_labels: [__address__]
            regex: '(.*):10250'
            target_label: __address__
            replacement: '${1}:9100'

  alert_rules.yml: |
    groups:
      - name: webgis.rules
        rules:
          - alert: HighErrorRate
            expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "High error rate detected"
              description: "Error rate is {{ $value }} errors per second"

          - alert: HighLatency
            expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 0.5
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "High latency detected"
              description: "95th percentile latency is {{ $value }} seconds"

          - alert: DatabaseConnectionsHigh
            expr: pg_stat_database_numbackends / pg_settings_max_connections > 0.8
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Database connections high"
              description: "Database connections are at {{ $value }}% of maximum"

          - alert: RedisMemoryHigh
            expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "Redis memory usage high"
              description: "Redis memory usage is at {{ $value }}%"

          - alert: PodCrashLooping
            expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Pod is crash looping"
              description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is crash looping"

# monitoring/grafana-dashboards.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
  namespace: monitoring
data:
  webgis-overview.json: |
    {
      "dashboard": {
        "id": null,
        "title": "WebGIS Application Overview",
        "tags": ["webgis"],
        "style": "dark",
        "timezone": "browser",
        "panels": [
          {
            "id": 1,
            "title": "Request Rate",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(http_requests_total{job='webgis-api'}[5m])",
                "legendFormat": "{{method}} {{path}}"
              }
            ],
            "yAxes": [
              {
                "label": "Requests/sec"
              }
            ]
          },
          {
            "id": 2,
            "title": "Response Time",
            "type": "graph",
            "targets": [
              {
                "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job='webgis-api'}[5m]))",
                "legendFormat": "95th percentile"
              },
              {
                "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket{job='webgis-api'}[5m]))",
                "legendFormat": "50th percentile"
              }
            ]
          },
          {
            "id": 3,
            "title": "Error Rate",
            "type": "singlestat",
            "targets": [
              {
                "expr": "rate(http_requests_total{job='webgis-api',status=~'5..'}[5m]) / rate(http_requests_total{job='webgis-api'}[5m])",
                "legendFormat": "Error Rate"
              }
            ],
            "valueName": "current",
            "format": "percentunit"
          },
          {
            "id": 4,
            "title": "Database Connections",
            "type": "graph",
            "targets": [
              {
                "expr": "pg_stat_database_numbackends",
                "legendFormat": "Active Connections"
              },
              {
                "expr": "pg_settings_max_connections",
                "legendFormat": "Max Connections"
              }
            ]
          },
          {
            "id": 5,
            "title": "Redis Memory Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "redis_memory_used_bytes",
                "legendFormat": "Used Memory"
              },
              {
                "expr": "redis_memory_max_bytes",
                "legendFormat": "Max Memory"
              }
            ]
          },
          {
            "id": 6,
            "title": "Pod CPU Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(container_cpu_usage_seconds_total{namespace='webgis',container!='POD'}[5m])",
                "legendFormat": "{{pod}} - {{container}}"
              }
            ]
          },
          {
            "id": 7,
            "title": "Pod Memory Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "container_memory_usage_bytes{namespace='webgis',container!='POD'}",
                "legendFormat": "{{pod}} - {{container}}"
              }
            ]
          },
          {
            "id": 8,
            "title": "Spatial Query Performance",
            "type": "graph",
            "targets": [
              {
                "expr": "histogram_quantile(0.95, rate(spatial_query_duration_seconds_bucket[5m]))",
                "legendFormat": "95th percentile"
              }
            ]
          }
        ],
        "time": {
          "from": "now-1h",
          "to": "now"
        },
        "refresh": "30s"
      }
    }

16.7.2. Application Monitoring Implementation#

// src/monitoring/metrics.ts
import { register, Counter, Histogram, Gauge } from 'prom-client';
import { Request, Response, NextFunction } from 'express';

// HTTP Metrics
export const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'path', 'status'],
});

export const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'path', 'status'],
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
});

// Database Metrics
export const databaseQueryDuration = new Histogram({
  name: 'database_query_duration_seconds',
  help: 'Duration of database queries in seconds',
  labelNames: ['query_type', 'table'],
  buckets: [0.01, 0.05, 0.1, 0.3, 0.5, 1, 3, 5],
});

export const databaseConnectionsActive = new Gauge({
  name: 'database_connections_active',
  help: 'Number of active database connections',
});

// Spatial Metrics
export const spatialQueryDuration = new Histogram({
  name: 'spatial_query_duration_seconds',
  help: 'Duration of spatial queries in seconds',
  labelNames: ['operation', 'geometry_type'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
});

export const tilesGenerated = new Counter({
  name: 'tiles_generated_total',
  help: 'Total number of tiles generated',
  labelNames: ['zoom_level', 'layer'],
});

export const tileCacheHits = new Counter({
  name: 'tile_cache_hits_total',
  help: 'Total number of tile cache hits',
  labelNames: ['zoom_level', 'cache_type'],
});

// Application Metrics
export const activeUsers = new Gauge({
  name: 'active_users',
  help: 'Number of active users',
});

export const featureOperations = new Counter({
  name: 'feature_operations_total',
  help: 'Total number of feature operations',
  labelNames: ['operation', 'feature_type'],
});

export const mapViewsTotal = new Counter({
  name: 'map_views_total',
  help: 'Total number of map views',
  labelNames: ['zoom_level', 'region'],
});

// Middleware for HTTP metrics
export const metricsMiddleware = (req: Request, res: Response, next: NextFunction) => {
  const start = Date.now();
  
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    const path = req.route?.path || req.path;
    const method = req.method;
    const status = res.statusCode.toString();
    
    httpRequestsTotal.inc({ method, path, status });
    httpRequestDuration.observe({ method, path, status }, duration);
  });
  
  next();
};

// Database monitoring
export class DatabaseMonitor {
  private db: any;
  
  constructor(database: any) {
    this.db = database;
    this.startConnectionMonitoring();
  }
  
  async monitorQuery<T>(
    queryType: string,
    table: string,
    queryFunction: () => Promise<T>
  ): Promise<T> {
    const start = Date.now();
    
    try {
      return await queryFunction();
    } finally {
      // Record the duration whether the query succeeded or failed
      const duration = (Date.now() - start) / 1000;
      databaseQueryDuration.observe({ query_type: queryType, table }, duration);
    }
  }
  
  private startConnectionMonitoring(): void {
    setInterval(async () => {
      try {
        const result = await this.db.query(
          'SELECT count(*) FROM pg_stat_activity WHERE state = $1',
          ['active']
        );
        databaseConnectionsActive.set(parseInt(result.rows[0].count));
      } catch (error) {
        console.error('Failed to monitor database connections:', error);
      }
    }, 30000); // Every 30 seconds
  }
}

// Spatial operations monitoring
export class SpatialMonitor {
  static monitorSpatialQuery<T>(
    operation: string,
    geometryType: string,
    queryFunction: () => Promise<T>
  ): Promise<T> {
    const start = Date.now();
    
    return queryFunction().finally(() => {
      // Record the duration for both successful and failed queries
      const duration = (Date.now() - start) / 1000;
      spatialQueryDuration.observe({ operation, geometry_type: geometryType }, duration);
    });
  }
  
  static recordTileGeneration(zoomLevel: number, layer: string): void {
    tilesGenerated.inc({ zoom_level: zoomLevel.toString(), layer });
  }
  
  static recordTileCacheHit(zoomLevel: number, cacheType: string): void {
    tileCacheHits.inc({ zoom_level: zoomLevel.toString(), cache_type: cacheType });
  }
}

// User activity monitoring
export class UserActivityMonitor {
  private activeUsersSessions = new Set<string>();
  
  trackUserSession(sessionId: string): void {
    this.activeUsersSessions.add(sessionId);
    activeUsers.set(this.activeUsersSessions.size);
  }
  
  removeUserSession(sessionId: string): void {
    this.activeUsersSessions.delete(sessionId);
    activeUsers.set(this.activeUsersSessions.size);
  }
  
  recordFeatureOperation(operation: string, featureType: string): void {
    featureOperations.inc({ operation, feature_type: featureType });
  }
  
  recordMapView(zoomLevel: number, region: string): void {
    mapViewsTotal.inc({ zoom_level: zoomLevel.toString(), region });
  }
  
  // Cleanup inactive sessions
  startSessionCleanup(): void {
    setInterval(() => {
      // Implementation would check for inactive sessions and remove them
      // This is a simplified version
      activeUsers.set(this.activeUsersSessions.size);
    }, 60000); // Every minute
  }
}

// Health check endpoint
export const healthCheck = async (req: Request, res: Response) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    services: {
      database: 'unknown',
      redis: 'unknown',
      storage: 'unknown' // not probed below; add an object-storage check if required
    }
  };
  
  try {
    // Check database
    await req.app.locals.db.query('SELECT 1');
    health.services.database = 'healthy';
  } catch {
    health.services.database = 'unhealthy';
    health.status = 'unhealthy';
  }
  
  try {
    // Check Redis
    await req.app.locals.redis.ping();
    health.services.redis = 'healthy';
  } catch {
    health.services.redis = 'unhealthy';
    health.status = 'unhealthy';
  }
  
  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
};

// Metrics endpoint
export const metricsEndpoint = async (req: Request, res: Response) => {
  res.set('Content-Type', register.contentType);
  // register.metrics() returns a Promise in recent prom-client versions, so await it
  res.end(await register.metrics());
};

// Collect Node.js default metrics (process CPU, memory, event loop lag, etc.)
collectDefaultMetrics({ register });
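
To make these metrics visible to Prometheus and to Kubernetes probes, the middleware and endpoints above must be registered on the Express application. The following is one minimal wiring sketch; the file name, port, route paths, and the findFeaturesWithin placeholder are illustrative assumptions rather than part of a reference implementation.

// src/server.ts (illustrative wiring sketch; file name, port, and route paths are assumptions)
import express from 'express';
import {
  metricsMiddleware,
  metricsEndpoint,
  healthCheck,
  SpatialMonitor,
} from './monitoring/metrics';

const app = express();
app.use(express.json());

// Record request counts and latencies for every route
app.use(metricsMiddleware);

// Operational endpoints for Kubernetes probes and Prometheus scraping
app.get('/health', healthCheck);
app.get('/metrics', metricsEndpoint);

// Placeholder for a real PostGIS-backed query (an assumption for this sketch)
async function findFeaturesWithin(bbox?: string): Promise<unknown[]> {
  return [];
}

// Wrapping a spatial query records its duration in spatial_query_duration_seconds
app.get('/api/features/within', async (req, res) => {
  const features = await SpatialMonitor.monitorSpatialQuery(
    'st_within',
    'polygon',
    () => findFeaturesWithin(req.query.bbox as string | undefined)
  );
  res.json(features);
});

app.listen(3000);

With this wiring in place, Prometheus can scrape /metrics on the same port the application serves, and the container liveness probe can target /health.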

16.8. Summary#

Modern deployment and DevOps practices for Web GIS applications require sophisticated orchestration of multiple specialized services, careful attention to geographic distribution, and robust monitoring of both traditional application metrics and geospatial-specific performance indicators.

Containerization with Docker provides the foundation for consistent deployments across environments while addressing the complex dependencies of geospatial services. Kubernetes orchestration enables scalable, resilient deployments with automated scaling based on both traditional metrics and spatial workload characteristics.

CI/CD pipelines automate the testing, building, and deployment process while incorporating geospatial-specific validation including spatial accuracy testing, performance benchmarking, and cross-browser visual validation for map rendering.

Infrastructure as Code with tools like Terraform ensures reproducible environments while properly configuring cloud services optimized for geospatial workloads including spatial databases, caching layers, and global content delivery networks.

Comprehensive monitoring and observability provide visibility into application performance, spatial query efficiency, and user experience metrics specific to mapping applications. This includes tracking tile generation performance, spatial query duration, and geographic distribution of usage patterns.

These deployment and operational practices enable Web GIS applications to scale reliably while maintaining the performance and availability requirements essential for interactive mapping experiences.

16.9. Exercises#

16.9.1. Exercise 16.1: Container Orchestration Setup#

Objective: Build a complete containerized deployment for a Web GIS application stack.

Instructions:

  1. Multi-service containerization:

    • Create optimized Dockerfiles for API, frontend, database, and tile server

    • Implement health checks and proper signal handling (a minimal signal-handling sketch follows this exercise)

    • Configure security scanning and vulnerability management

    • Optimize image sizes and build times

  2. Docker Compose orchestration:

    • Set up complete development environment with Docker Compose

    • Configure service dependencies and networking

    • Implement persistent storage and backup strategies

    • Add monitoring and logging services

  3. Production optimization:

    • Create production-optimized container configurations

    • Implement multi-stage builds and security hardening

    • Configure resource limits and performance tuning

    • Add secrets management and environment configuration

Deliverable: Complete containerized deployment with development and production configurations.
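
The signal-handling requirement in step 1 trips up many containerized Node.js services: Docker and Kubernetes send SIGTERM before killing a container, and a process that ignores it is terminated abruptly, dropping in-flight requests. The sketch below is one minimal, illustrative approach; the port, timeout, and cleanup steps are assumptions to adapt to your own stack.

// src/shutdown.ts (illustrative sketch; port, timeout, and cleanup steps are assumptions)
import express from 'express';

const app = express();
app.get('/health', (_req, res) => {
  res.json({ status: 'ok' });
});

const server = app.listen(3000);

function shutdown(signal: string): void {
  console.log(`${signal} received, draining connections...`);

  // Stop accepting new connections; in-flight requests are allowed to finish
  server.close(() => {
    // Close database pools, Redis clients, and other handles here before exiting
    process.exit(0);
  });

  // Force exit if draining takes longer than the orchestrator's termination grace period
  setTimeout(() => process.exit(1), 10_000).unref();
}

// Docker and Kubernetes send SIGTERM before forcibly killing a container
process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));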

16.9.2. Exercise 16.2: Kubernetes Deployment and Scaling#

Objective: Deploy and configure a Web GIS application on Kubernetes with auto-scaling.

Instructions:

  1. Kubernetes manifests:

    • Create complete Kubernetes deployment manifests

    • Configure StatefulSets for databases and persistent services

    • Implement ConfigMaps and Secrets for configuration management

    • Set up Ingress controllers and load balancing

  2. Auto-scaling configuration:

    • Implement Horizontal Pod Autoscaler with custom metrics

    • Configure cluster auto-scaling for node management

    • Set up resource quotas and limits

    • Add pod disruption budgets for high availability

  3. Service mesh integration:

    • Implement service mesh for inter-service communication

    • Add traffic management and security policies

    • Configure observability and distributed tracing

    • Implement canary deployments and traffic splitting

Deliverable: Production-ready Kubernetes deployment with comprehensive scaling and management capabilities.

16.9.3. Exercise 16.3: CI/CD Pipeline Implementation#

Objective: Build comprehensive CI/CD pipelines for automated testing and deployment.

Instructions:

  1. Automated testing pipeline:

    • Implement multi-stage testing including unit, integration, and E2E tests (a spatial-accuracy unit test sketch follows this exercise)

    • Add security scanning and vulnerability assessment

    • Configure performance testing and benchmarking

    • Implement visual regression testing for map rendering

  2. Build and deployment automation:

    • Create automated Docker image building and scanning

    • Implement artifact management and versioning

    • Configure automated deployment to staging and production

    • Add rollback capabilities and deployment validation

  3. Quality gates and approvals:

    • Implement quality gates based on test coverage and performance

    • Configure manual approval workflows for production deployments

    • Add automated notifications and status reporting

    • Implement deployment monitoring and validation

Deliverable: Complete CI/CD pipeline with automated testing, building, and deployment capabilities.
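
For the spatial side of the unit-testing stage, a simple but effective CI gate is to assert coordinate accuracy to a fixed tolerance. The sketch below uses Jest's toBeCloseTo for a Web Mercator round trip; the projection helpers are stand-ins for whatever transformation your own pipeline performs, so treat the whole file as an illustrative assumption.

// tests/projection.accuracy.test.ts (illustrative; the projection helpers stand in for your real pipeline)
import { describe, it, expect } from '@jest/globals';

type LngLat = [number, number];
const R = 6378137; // Web Mercator sphere radius in metres

// Stand-ins for the reprojection step your data pipeline actually performs
function toWebMercator([lng, lat]: LngLat): [number, number] {
  const x = (lng * Math.PI * R) / 180;
  const y = R * Math.log(Math.tan(Math.PI / 4 + (lat * Math.PI) / 360));
  return [x, y];
}

function toWgs84([x, y]: [number, number]): LngLat {
  const lng = (x * 180) / (Math.PI * R);
  const lat = ((2 * Math.atan(Math.exp(y / R)) - Math.PI / 2) * 180) / Math.PI;
  return [lng, lat];
}

describe('spatial accuracy gate', () => {
  it('round-trips coordinates through EPSG:3857 within 1e-9 degrees', () => {
    const original: LngLat = [13.3777, 52.5162]; // roughly central Berlin
    const [lng, lat] = toWgs84(toWebMercator(original));

    // toBeCloseTo(value, 9) asserts agreement to 9 decimal places, far tighter than 1 cm on the ground
    expect(lng).toBeCloseTo(original[0], 9);
    expect(lat).toBeCloseTo(original[1], 9);
  });
});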

16.9.4. Exercise 16.4: Infrastructure as Code#

Objective: Implement complete infrastructure automation using Terraform.

Instructions:

  1. Cloud infrastructure provisioning:

    • Create Terraform modules for VPC, networking, and security groups

    • Provision managed database services optimized for spatial workloads

    • Set up container orchestration platforms and managed Kubernetes

    • Configure content delivery networks and edge locations

  2. Application infrastructure:

    • Provision monitoring and logging infrastructure

    • Set up backup and disaster recovery systems

    • Configure auto-scaling groups and load balancers

    • Implement security scanning and compliance monitoring

  3. Environment management:

    • Create reusable modules for different environments

    • Implement state management and remote backends

    • Add cost optimization and resource tagging

    • Configure drift detection and automated remediation

Deliverable: Complete Infrastructure as Code implementation with multi-environment support.

16.9.5. Exercise 16.5: Monitoring and Observability#

Objective: Implement comprehensive monitoring for Web GIS applications.

Instructions:

  1. Application metrics:

    • Implement custom metrics for spatial operations and performance

    • Add user experience monitoring for map interactions (a client-reported metrics sketch follows this exercise)

    • Configure database and cache performance monitoring

    • Create business metrics for feature usage and adoption

  2. Infrastructure monitoring:

    • Set up resource utilization monitoring across the stack

    • Implement network performance and latency monitoring

    • Add security monitoring and threat detection

    • Configure capacity planning and scaling alerts

  3. Observability platform:

    • Build comprehensive dashboards for different stakeholders

    • Implement alerting with proper escalation and notification

    • Add distributed tracing for complex request flows

    • Create automated reporting and analytics

Deliverable: Complete monitoring and observability platform with comprehensive coverage.
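
One way to approach the user-experience item in step 1 is to let the map client report interaction timings to a small ingestion endpoint that converts them into Prometheus metrics. The sketch below assumes express.json() is mounted; the metric names, labels, and request payload shape are all illustrative assumptions.

// src/monitoring/ux-metrics.ts (illustrative sketch; metric names, labels, and payload shape are assumptions)
import { Histogram, Counter } from 'prom-client';
import { Request, Response } from 'express';

// Client-side map interaction timings, reported by the frontend to a beacon endpoint
export const mapInteractionDuration = new Histogram({
  name: 'map_interaction_duration_seconds',
  help: 'Client-reported duration of map interactions (pan, zoom, layer toggle)',
  labelNames: ['interaction'],
  buckets: [0.05, 0.1, 0.25, 0.5, 1, 2, 5],
});

export const slowTileLoads = new Counter({
  name: 'slow_tile_loads_total',
  help: 'Tile loads reported by clients as exceeding the UX latency budget',
  labelNames: ['layer'],
});

// POST /internal/ux-metrics  { "interaction": "zoom", "durationMs": 180, "slowTiles": ["roads"] }
// Requires express.json() to be mounted so req.body is parsed
export const ingestUxMetrics = (req: Request, res: Response) => {
  const { interaction, durationMs, slowTiles } = req.body ?? {};

  if (typeof interaction === 'string' && typeof durationMs === 'number') {
    mapInteractionDuration.observe({ interaction }, durationMs / 1000);
  }
  if (Array.isArray(slowTiles)) {
    for (const layer of slowTiles) slowTileLoads.inc({ layer: String(layer) });
  }
  res.status(204).end();
};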

16.9.6. Exercise 16.6: High Availability and Disaster Recovery#

Objective: Implement high availability and disaster recovery for production systems.

Instructions:

  1. High availability design:

    • Implement multi-region deployment for global availability

    • Configure database replication and failover mechanisms

    • Set up load balancing and traffic distribution

    • Add circuit breakers and graceful degradation (a minimal circuit-breaker sketch follows this exercise)

  2. Disaster recovery planning:

    • Create automated backup and restore procedures

    • Implement cross-region data replication

    • Configure disaster recovery testing and validation

    • Define recovery time objectives (RTO) and recovery point objectives (RPO)

  3. Business continuity:

    • Implement service health checks and automatic failover

    • Configure maintenance mode and planned downtime procedures

    • Add incident response and communication plans

    • Create capacity planning for emergency scaling

Deliverable: Complete high availability and disaster recovery implementation with documented procedures.
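
A circuit breaker, as called for in step 1, prevents a failing dependency (for example an external geocoder or tile source) from dragging the whole application down, and gives you a natural place to return degraded but usable results. The class below is a minimal illustrative sketch; the thresholds, timeout, and the commented usage are assumptions.

// src/resilience/circuit-breaker.ts (illustrative sketch; thresholds, timeout, and usage are assumptions)
type BreakerState = 'closed' | 'open' | 'half-open';

export class CircuitBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 5,    // consecutive failures before the breaker opens
    private resetTimeoutMs = 30_000, // how long to stay open before allowing a probe
  ) {}

  async exec<T>(action: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        return fallback(); // degrade gracefully, e.g. serve cached or simplified data
      }
      this.state = 'half-open'; // allow a single probe request through
    }

    try {
      const result = await action();
      this.state = 'closed';
      this.failures = 0;
      return result;
    } catch {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = Date.now();
      }
      return fallback();
    }
  }
}

// Hypothetical usage: protect calls to an external geocoding service
// const breaker = new CircuitBreaker();
// const result = await breaker.exec(() => geocode(address), () => ({ match: null, degraded: true }));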

16.9.7. Exercise 16.7: Performance Optimization and Scaling#

Objective: Optimize application performance and implement advanced scaling strategies.

Instructions:

  1. Performance optimization:

    • Implement caching strategies at multiple layers (a minimal tile-cache sketch follows this exercise)

    • Optimize database queries and spatial operations

    • Configure CDN and edge computing for global performance

    • Add application-level performance monitoring and optimization

  2. Scaling strategies:

    • Implement predictive scaling based on usage patterns

    • Configure geographic load distribution and routing

    • Add queue-based processing for heavy spatial operations

    • Implement database sharding and partitioning strategies

  3. Cost optimization:

    • Configure spot instances and reserved capacity

    • Implement auto-scaling based on cost and performance metrics

    • Add resource right-sizing and optimization recommendations

    • Create cost monitoring and budget alerting

Deliverable: Highly optimized and efficiently scaling Web GIS application with comprehensive cost management.
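
For the caching item in step 1, an in-process tile cache in front of the renderer is often the simplest first layer, sitting below Redis and the CDN. The sketch below is deliberately naive; the TTL, size limit, and eviction policy are assumptions, and a hit could additionally be recorded through SpatialMonitor.recordTileCacheHit from the monitoring module.

// src/cache/tile-cache.ts (illustrative in-memory sketch; TTL, size limit, and eviction policy are assumptions)
interface CacheEntry {
  tile: Buffer;
  expiresAt: number;
}

export class TileCache {
  private entries = new Map<string, CacheEntry>();

  constructor(private ttlMs = 5 * 60_000, private maxEntries = 10_000) {}

  async getOrGenerate(
    layer: string, z: number, x: number, y: number,
    generate: () => Promise<Buffer>,
  ): Promise<Buffer> {
    const key = `${layer}/${z}/${x}/${y}`;
    const cached = this.entries.get(key);

    if (cached && cached.expiresAt > Date.now()) {
      // Cache hit: skip rendering; this is also where a cache-hit metric could be incremented
      return cached.tile;
    }

    const tile = await generate(); // cache miss: render the tile (e.g. from PostGIS)

    // Naive eviction: drop the oldest insertion once the cache is full
    if (this.entries.size >= this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      if (oldestKey !== undefined) this.entries.delete(oldestKey);
    }
    this.entries.set(key, { tile, expiresAt: Date.now() + this.ttlMs });
    return tile;
  }
}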

Reflection Questions:

  • How do deployment requirements for Web GIS applications differ from traditional web applications?

  • What are the key considerations for scaling geospatial services globally?

  • How can monitoring be tailored to provide insights specific to mapping applications?

  • What are the critical components that need special attention in disaster recovery planning?

16.10. Further Reading#