Chapter 16: Deployment and DevOps
16.1. Learning Objectives
By the end of this chapter, you will understand:
- Modern deployment strategies for Web GIS applications
- Containerization with Docker for geospatial services
- Orchestration with Kubernetes for scalable deployments
- CI/CD pipelines for automated testing and deployment
- Infrastructure as Code for reproducible environments
- Monitoring and observability for production systems
- Scaling strategies for high-traffic geospatial applications
16.2. Modern Deployment Architecture
Web GIS applications require sophisticated deployment strategies that address the unique challenges of geospatial data processing, real-time updates, and global accessibility. Modern deployment architectures leverage containerization, orchestration, and automation to ensure reliable, scalable, and maintainable systems.
16.2.1. Understanding Deployment Complexity
Geospatial Service Dependencies: Web GIS applications typically require multiple specialized services including PostGIS databases, tile servers, spatial processing engines, and real-time data ingestion pipelines. Each component has specific resource requirements, scaling characteristics, and operational dependencies that must be carefully orchestrated in production environments.
Geographic Distribution Requirements: Unlike traditional web applications, Web GIS systems often need to serve users globally while maintaining low latency for map interactions. This requires strategic placement of services, data replication across regions, and intelligent routing to ensure optimal performance regardless of user location.
Data Pipeline Complexity: Geospatial applications frequently involve complex data processing pipelines that transform raw geographic data into optimized formats for web delivery. These pipelines may include batch processing jobs, real-time streaming processors, and on-demand tile generation services that must be reliably coordinated in production.
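To make the tile-generation side of such pipelines concrete, the sketch below computes which XYZ tile a longitude/latitude pair falls in under the standard Web Mercator tiling scheme, and shows why tile counts explode with zoom level (a minimal illustration; the function names are ours, not from any particular tile server):

```javascript
// Convert a WGS84 coordinate to XYZ tile indices in the Web Mercator scheme.
function lonLatToTile(lon, lat, zoom) {
  const n = 2 ** zoom; // tiles per axis at this zoom level
  const x = Math.floor(((lon + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x, y, zoom };
}

// Pipeline sizing: tile counts grow 4x per zoom level, which is why
// on-demand generation usually beats pre-rendering at high zooms.
function tileCount(zoom) {
  return 4 ** zoom;
}

console.log(lonLatToTile(13.4, 52.5, 10)); // a point in Berlin at zoom 10
console.log(tileCount(18));
```

Pre-rendering the full pyramid to zoom 18 means tens of billions of tiles, so production pipelines typically pre-render low zooms and generate (and cache) high-zoom tiles on demand.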
Security and Compliance: Location data often contains sensitive information requiring careful handling according to various privacy regulations. Deployment architectures must implement proper data isolation, encryption, access controls, and audit logging while maintaining system performance and usability.
16.2.2. Cloud-Native Architecture Patterns
Microservices for Spatial Functions: Breaking down Web GIS applications into focused microservices enables independent scaling, deployment, and maintenance of different functional areas. Spatial query services, tile generation, user management, and real-time processing can be developed, deployed, and scaled independently based on their specific requirements.
Event-Driven Architecture: Geospatial applications benefit significantly from event-driven patterns that enable real-time updates, asynchronous processing, and loose coupling between components. Location updates, data changes, and user interactions can trigger events that flow through the system, enabling responsive and scalable applications.
API Gateway Patterns: Centralized API gateways provide essential capabilities for Web GIS deployments including authentication, rate limiting, request routing, and protocol translation. They enable unified access to distributed spatial services while providing operational visibility and control.
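One of the gateway's roles, rate limiting, is easy to sketch as a token bucket: tokens refill at a fixed rate and each admitted request spends one, allowing short bursts while capping sustained throughput. This is illustrative code, not a production gateway, and the class name is ours:

```javascript
// Token-bucket rate limiter: `ratePerSec` tokens refill per second;
// a request is admitted only if a whole token is available.
class TokenBucket {
  constructor(capacity, ratePerSec, now = Date.now()) {
    this.capacity = capacity;
    this.ratePerSec = ratePerSec;
    this.tokens = capacity; // start full, allowing an initial burst
    this.last = now;
  }

  allow(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(2, 1, 0); // burst of 2, then 1 request/second
console.log(bucket.allow(0), bucket.allow(0), bucket.allow(0)); // true true false
```

The ingress rate-limit annotations shown later in this chapter configure the same policy declaratively at the edge instead of in application code.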
16.3. Containerization with Docker
16.3.1. Docker Configuration for GIS Services
# Dockerfile for Node.js Web GIS Backend
FROM node:18-alpine AS base
# Install system dependencies for spatial operations
RUN apk add --no-cache \
    gdal-dev \
    geos-dev \
    proj-dev \
    sqlite-dev \
    postgresql-client \
    curl \
 && rm -rf /var/cache/apk/*
WORKDIR /app
# Copy package files for dependency installation
COPY package*.json ./
COPY yarn.lock ./
# Install dependencies
FROM base AS dependencies
RUN yarn install --frozen-lockfile --production=false
# Build stage
FROM dependencies AS build
COPY . .
RUN yarn build
RUN yarn install --frozen-lockfile --production=true
# Production stage
FROM base AS production
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/package.json ./package.json
# Create non-root user for security
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nodejs -u 1001
USER nodejs
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
EXPOSE 3000
CMD ["node", "dist/server.js"]
# Dockerfile for PostGIS Database
FROM postgis/postgis:15-3.3-alpine
# Install additional extensions
RUN apk add --no-cache \
postgresql-contrib \
&& rm -rf /var/cache/apk/*
# Copy initialization scripts
COPY ./docker/postgis/init/ /docker-entrypoint-initdb.d/
# Environment variables (development defaults; inject the real password via secrets in production)
ENV POSTGRES_DB=webgis
ENV POSTGRES_USER=webgis
ENV POSTGRES_PASSWORD=secure_password
ENV POSTGIS_ENABLE_OUTDB_RASTERS=1
ENV POSTGIS_GDAL_ENABLED_DRIVERS="ENABLE_ALL"
# Custom postgresql.conf for spatial workloads
COPY ./docker/postgis/postgresql.conf /etc/postgresql/postgresql.conf
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD pg_isready -U $POSTGRES_USER -d $POSTGRES_DB || exit 1
EXPOSE 5432
# Dockerfile for Redis Cache
FROM redis:7-alpine
# Copy custom Redis configuration
COPY ./docker/redis/redis.conf /etc/redis/redis.conf
# Create directory for Redis data
RUN mkdir -p /data && chown redis:redis /data
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD redis-cli ping || exit 1
EXPOSE 6379
CMD ["redis-server", "/etc/redis/redis.conf"]
# Dockerfile for Frontend Application
FROM node:18-alpine AS build
WORKDIR /app
# Install all dependencies (devDependencies are required for the build step below)
COPY package*.json ./
RUN npm ci
# Copy source code and build
COPY . .
RUN npm run build
# Production stage with nginx
FROM nginx:alpine
# Copy built application
COPY --from=build /app/dist /usr/share/nginx/html
# Copy nginx configuration
COPY ./docker/nginx/nginx.conf /etc/nginx/nginx.conf
COPY ./docker/nginx/default.conf /etc/nginx/conf.d/default.conf
# The nginx user and group already exist in the base image; just fix ownership
RUN chown -R nginx:nginx /usr/share/nginx/html
# Health check (nginx:alpine ships BusyBox wget rather than curl)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget -q --spider http://localhost:80/health || exit 1
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
16.3.2. Docker Compose for Development
# docker-compose.yml
version: '3.8'

services:
  # Database
  postgis:
    build:
      context: .
      dockerfile: docker/postgis/Dockerfile
    container_name: webgis-postgis
    environment:
      POSTGRES_DB: webgis
      POSTGRES_USER: webgis
      POSTGRES_PASSWORD: secure_password
    volumes:
      - postgis_data:/var/lib/postgresql/data
      - ./docker/postgis/init:/docker-entrypoint-initdb.d
    ports:
      - "5432:5432"
    networks:
      - webgis-network
    restart: unless-stopped

  # Cache
  redis:
    build:
      context: .
      dockerfile: docker/redis/Dockerfile
    container_name: webgis-redis
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    networks:
      - webgis-network
    restart: unless-stopped

  # Backend API
  api:
    build:
      context: .
      dockerfile: docker/api/Dockerfile
    container_name: webgis-api
    environment:
      NODE_ENV: development
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      REDIS_URL: redis://redis:6379
      JWT_SECRET: your-jwt-secret
      PORT: 3000
    volumes:
      - ./src:/app/src
      - ./uploads:/app/uploads
    ports:
      - "3000:3000"
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Frontend
  frontend:
    build:
      context: .
      dockerfile: docker/frontend/Dockerfile
    container_name: webgis-frontend
    environment:
      REACT_APP_API_URL: http://localhost:3000/api
      REACT_APP_MAPBOX_TOKEN: your-mapbox-token
    ports:
      - "80:80"
    depends_on:
      - api
    networks:
      - webgis-network
    restart: unless-stopped

  # Tile Server
  tile-server:
    build:
      context: .
      dockerfile: docker/tiles/Dockerfile
    container_name: webgis-tiles
    environment:
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      CACHE_URL: redis://redis:6379
      TILE_SIZE: 512
      MAX_ZOOM: 18
    volumes:
      - tile_cache:/app/cache
    ports:
      - "8080:8080"
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Background Workers
  worker:
    build:
      context: .
      dockerfile: docker/worker/Dockerfile
    container_name: webgis-worker
    environment:
      NODE_ENV: development
      DATABASE_URL: postgresql://webgis:secure_password@postgis:5432/webgis
      REDIS_URL: redis://redis:6379
      WORKER_CONCURRENCY: 4
    volumes:
      - ./data:/app/data
    depends_on:
      - postgis
      - redis
    networks:
      - webgis-network
    restart: unless-stopped

  # Monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: webgis-prometheus
    volumes:
      - ./docker/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - webgis-network
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: webgis-grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./docker/grafana/dashboards:/etc/grafana/provisioning/dashboards
      - ./docker/grafana/datasources:/etc/grafana/provisioning/datasources
    ports:
      - "3001:3000"
    depends_on:
      - prometheus
    networks:
      - webgis-network
    restart: unless-stopped

volumes:
  postgis_data:
  redis_data:
  tile_cache:
  prometheus_data:
  grafana_data:

networks:
  webgis-network:
    driver: bridge
16.3.3. Production Docker Configurations
# docker-compose.prod.yml
version: '3.8'

services:
  # Production PostgreSQL with optimizations
  postgis:
    build:
      context: .
      dockerfile: docker/postgis/Dockerfile.prod
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgis_data:/var/lib/postgresql/data
      - ./docker/postgis/production.conf:/etc/postgresql/postgresql.conf
    deploy:
      replicas: 1
      resources:
        limits:
          memory: 4G
          cpus: '2'
        reservations:
          memory: 2G
          cpus: '1'
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
    networks:
      - webgis-backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

  # Production Redis with AOF persistence. Note that simply starting three
  # replicas with --cluster-enabled does not form a Redis Cluster; clustering
  # requires a separate bootstrap step (e.g. redis-cli --cluster create).
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 1G
          cpus: '0.5'
      restart_policy:
        condition: on-failure
    networks:
      - webgis-backend

  # Load-balanced API services
  api:
    build:
      context: .
      dockerfile: docker/api/Dockerfile.prod
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
      REDIS_URL: ${REDIS_URL}
      JWT_SECRET: ${JWT_SECRET}
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 1G
          cpus: '1'
      restart_policy:
        condition: on-failure
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
    networks:
      - webgis-backend
      - webgis-frontend
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Production frontend with CDN
  frontend:
    build:
      context: .
      dockerfile: docker/frontend/Dockerfile.prod
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
      restart_policy:
        condition: on-failure
    networks:
      - webgis-frontend

  # Load balancer
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./docker/nginx/production.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/ssl/certs
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
    networks:
      - webgis-frontend
    depends_on:
      - api
      - frontend

volumes:
  postgis_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=${NFS_SERVER},rw
      device: ":${NFS_PATH}/postgis"
  redis_data:
    driver: local

networks:
  webgis-frontend:
    driver: overlay
    attachable: true
  webgis-backend:
    driver: overlay
    internal: true
16.4. Kubernetes Deployment
16.4.1. Kubernetes Manifests
# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: webgis
  labels:
    name: webgis
    environment: production
---
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: webgis-config
  namespace: webgis
data:
  DATABASE_HOST: "postgis-service"
  DATABASE_PORT: "5432"
  DATABASE_NAME: "webgis"
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  TILE_SIZE: "512"
  MAX_ZOOM: "18"
---
# k8s/secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: webgis-secrets
  namespace: webgis
type: Opaque
data:
  DATABASE_PASSWORD: <base64-encoded-password>
  JWT_SECRET: <base64-encoded-jwt-secret>
  REDIS_PASSWORD: <base64-encoded-redis-password>
  MAPBOX_TOKEN: <base64-encoded-mapbox-token>
# k8s/postgis-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgis
  namespace: webgis
spec:
  serviceName: postgis-service
  replicas: 1
  selector:
    matchLabels:
      app: postgis
  template:
    metadata:
      labels:
        app: postgis
    spec:
      containers:
        - name: postgis
          image: webgis/postgis:latest
          env:
            - name: POSTGRES_DB
              valueFrom:
                configMapKeyRef:
                  name: webgis-config
                  key: DATABASE_NAME
            - name: POSTGRES_USER
              value: "webgis"
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webgis-secrets
                  key: DATABASE_PASSWORD
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgis-storage
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi"
              cpu: "2"
          livenessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - webgis
                - -d
                - webgis
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - webgis
                - -d
                - webgis
            initialDelaySeconds: 30
            periodSeconds: 10
  volumeClaimTemplates:
    - metadata:
        name: postgis-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "fast-ssd"
        resources:
          requests:
            storage: 100Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgis-service
  namespace: webgis
spec:
  selector:
    app: postgis
  ports:
    - port: 5432
      targetPort: 5432
  type: ClusterIP
# k8s/redis-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: webgis
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          # AOF persistence only; true Redis Cluster mode would require a
          # StatefulSet with stable network identities and a cluster
          # bootstrap step, not independent Deployment replicas.
          command: ["redis-server"]
          args: ["--appendonly", "yes"]
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: redis-storage
              mountPath: /data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            exec:
              command:
                - redis-cli
                - ping
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - redis-cli
                - ping
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: redis-storage
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: webgis
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
  type: ClusterIP
# k8s/api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webgis-api
  namespace: webgis
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: webgis-api
  template:
    metadata:
      labels:
        app: webgis-api
    spec:
      containers:
        - name: api
          image: webgis/api:latest
          envFrom:
            - configMapRef:
                name: webgis-config
          env:
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webgis-secrets
                  key: DATABASE_PASSWORD
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: webgis-secrets
                  key: JWT_SECRET
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]
---
apiVersion: v1
kind: Service
metadata:
  name: webgis-api-service
  namespace: webgis
spec:
  selector:
    app: webgis-api
  ports:
    - port: 3000
      targetPort: 3000
  type: ClusterIP
# k8s/frontend-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webgis-frontend
  namespace: webgis
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webgis-frontend
  template:
    metadata:
      labels:
        app: webgis-frontend
    spec:
      containers:
        - name: frontend
          image: webgis/frontend:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "250m"
          livenessProbe:
            httpGet:
              path: /health
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: webgis-frontend-service
  namespace: webgis
spec:
  selector:
    app: webgis-frontend
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webgis-ingress
  namespace: webgis
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/limit-connections: "10"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - webgis.example.com
      secretName: webgis-tls
  rules:
    - host: webgis.example.com
      http:
        paths:
          # Regex paths require pathType: ImplementationSpecific
          - path: /api(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: webgis-api-service
                port:
                  number: 3000
          # Empty capture groups keep the /$2 rewrite a no-op for the frontend
          - path: /()(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: webgis-frontend-service
                port:
                  number: 80
16.4.2. Horizontal Pod Autoscaler
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webgis-api-hpa
  namespace: webgis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webgis-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    # Requires a custom-metrics adapter (e.g. prometheus-adapter)
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 4
          periodSeconds: 60
      selectPolicy: Max
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webgis-frontend-hpa
  namespace: webgis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webgis-frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
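The autoscaler's core arithmetic is worth internalizing: for each metric it computes desiredReplicas = ceil(currentReplicas × currentValue / targetValue), takes the largest result across metrics, then clamps to the min/max bounds. A small sketch of that rule (illustrative code, not the HPA controller's implementation, which also applies tolerances and stabilization windows):

```javascript
// Replica count the HPA would request for a single metric.
function desiredForMetric(currentReplicas, currentValue, targetValue) {
  return Math.ceil(currentReplicas * (currentValue / targetValue));
}

// Combine several metrics the way autoscaling/v2 does: take the max
// across metrics, then clamp to the configured bounds.
function desiredReplicas(currentReplicas, metrics, minReplicas, maxReplicas) {
  const proposals = metrics.map(({ current, target }) =>
    desiredForMetric(currentReplicas, current, target)
  );
  const wanted = Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, wanted));
}

// Example: 3 API pods at 90% CPU (target 70%) and 60% memory (target 80%)
console.log(desiredReplicas(3, [
  { current: 90, target: 70 }, // CPU alone would push to ceil(3 * 90/70) = 4
  { current: 60, target: 80 }, // memory alone would allow 3
], 3, 20)); // → 4
```

Because the max across metrics wins, any single hot metric can scale the deployment up, while scale-down requires every metric to be comfortably under target.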
16.5. CI/CD Pipelines
16.5.1. GitHub Actions Workflow
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgis/postgis:15-3.3
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: webgis_test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7-alpine
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run linting
        run: npm run lint
      - name: Run type checking
        run: npm run type-check
      - name: Run unit tests
        run: npm run test:unit
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
          REDIS_URL: redis://localhost:6379
      - name: Run integration tests
        run: npm run test:integration
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
          REDIS_URL: redis://localhost:6379
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
      - name: Run E2E tests
        run: npm run test:e2e
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/webgis_test
          REDIS_URL: redis://localhost:6379
      - name: Upload test results
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: test-results
          path: |
            coverage/
            test-results/
            playwright-report/
  security:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'
      - name: Run npm audit
        run: npm audit --audit-level moderate
  build:
    needs: [test, security]
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    strategy:
      matrix:
        component: [api, frontend, tiles, worker]
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Log in to Container Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=raw,value=latest,enable={{is_default_branch}}
      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./docker/${{ matrix.component }}/Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.webgis.example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG_STAGING }}
      - name: Deploy to staging
        run: |
          # Update image tags in manifests
          sed -i "s|webgis/api:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/api:develop|g" k8s/api-deployment.yaml
          sed -i "s|webgis/frontend:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/frontend:develop|g" k8s/frontend-deployment.yaml
          # Apply manifests
          kubectl apply -f k8s/ -n webgis-staging
          # Wait for rollout
          kubectl rollout status deployment/webgis-api -n webgis-staging --timeout=600s
          kubectl rollout status deployment/webgis-frontend -n webgis-staging --timeout=600s
      - name: Run smoke tests
        run: |
          # Wait for service to be ready
          sleep 30
          # Run basic health checks
          curl -f https://staging.webgis.example.com/health || exit 1
          curl -f https://staging.webgis.example.com/api/health || exit 1
  deploy-production:
    needs: [build, deploy-staging]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://webgis.example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG_PRODUCTION }}
      - name: Blue-Green Deployment
        run: |
          # Create a parallel deployment with a green suffix
          sed -i "s|webgis-api|webgis-api-green|g" k8s/api-deployment.yaml
          sed -i "s|webgis-frontend|webgis-frontend-green|g" k8s/frontend-deployment.yaml
          sed -i "s|webgis/api:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/api:main|g" k8s/api-deployment.yaml
          sed -i "s|webgis/frontend:latest|${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/frontend:main|g" k8s/frontend-deployment.yaml
          # Deploy the green version alongside the running blue version
          kubectl apply -f k8s/api-deployment.yaml -n webgis
          kubectl apply -f k8s/frontend-deployment.yaml -n webgis
          # Wait for the green rollout
          kubectl rollout status deployment/webgis-api-green -n webgis --timeout=600s
          kubectl rollout status deployment/webgis-frontend-green -n webgis --timeout=600s
      - name: Health Check Green Deployment
        run: |
          # Test the green deployment before it receives traffic
          GREEN_API_POD=$(kubectl get pods -l app=webgis-api-green -n webgis -o jsonpath='{.items[0].metadata.name}')
          kubectl port-forward $GREEN_API_POD 8080:3000 -n webgis &
          sleep 10
          curl -f http://localhost:8080/health || exit 1
          kill %1
      - name: Switch Traffic to Green
        run: |
          # Point the stable services at the green pods
          kubectl patch service webgis-api-service -n webgis -p '{"spec":{"selector":{"app":"webgis-api-green"}}}'
          kubectl patch service webgis-frontend-service -n webgis -p '{"spec":{"selector":{"app":"webgis-frontend-green"}}}'
      - name: Cleanup Blue Deployment
        run: |
          # Kubernetes objects cannot be renamed in place, so remove the old
          # blue deployments and leave green serving traffic. The next release
          # deploys the opposite color; tools such as Argo Rollouts automate
          # this bookkeeping.
          kubectl delete deployment webgis-api -n webgis --ignore-not-found=true
          kubectl delete deployment webgis-frontend -n webgis --ignore-not-found=true
      - name: Post-deployment Tests
        run: |
          # Run comprehensive post-deployment tests
          sleep 60
          curl -f https://webgis.example.com/health || exit 1
          curl -f https://webgis.example.com/api/health || exit 1
          # Run critical user journey tests
          npm run test:smoke -- --environment=production
  notify:
    needs: [deploy-production]
    runs-on: ubuntu-latest
    if: always()
    steps:
      - name: Notify Slack
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          channel: '#deployments'
          fields: repo,message,commit,author,action,eventName,ref,workflow
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
16.5.2. GitLab CI/CD Pipeline
# .gitlab-ci.yml
variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"
  POSTGRES_DB: webgis_test
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: postgres
  REDIS_URL: redis://redis:6379
  DATABASE_URL: postgresql://postgres:postgres@postgres:5432/webgis_test

stages:
  - test
  - security
  - build
  - deploy-staging
  - deploy-production
# Test stage
test:unit:
  stage: test
  image: node:18-alpine
  services:
    - name: postgis/postgis:15-3.3
      alias: postgres
      variables:
        POSTGRES_DB: $POSTGRES_DB
        POSTGRES_USER: $POSTGRES_USER
        POSTGRES_PASSWORD: $POSTGRES_PASSWORD
    - name: redis:7-alpine
      alias: redis
  before_script:
    - apk add --no-cache curl
    - npm ci
  script:
    - npm run lint
    - npm run type-check
    - npm run test:unit
    - npm run test:integration
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
      junit: junit.xml
    paths:
      - coverage/
    expire_in: 1 week

test:e2e:
  stage: test
  image: mcr.microsoft.com/playwright:v1.40.0-focal
  services:
    - name: postgis/postgis:15-3.3
      alias: postgres
      variables:
        POSTGRES_DB: $POSTGRES_DB
        POSTGRES_USER: $POSTGRES_USER
        POSTGRES_PASSWORD: $POSTGRES_PASSWORD
    - name: redis:7-alpine
      alias: redis
  before_script:
    - npm ci
    - npx playwright install
  script:
    - npm run test:e2e
  artifacts:
    when: always
    paths:
      - playwright-report/
      - test-results/
    expire_in: 1 week
# Security stage
security:scan:
  stage: security
  image: aquasec/trivy:latest
  script:
    - trivy fs --exit-code 1 --severity HIGH,CRITICAL .
  allow_failure: true

security:dependencies:
  stage: security
  image: node:18-alpine
  script:
    - npm audit --audit-level moderate
  allow_failure: true
# Build stage
.build_template: &build_template
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    - docker build -f docker/$COMPONENT/Dockerfile -t $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA
    - |
      if [ "$CI_COMMIT_BRANCH" = "main" ]; then
        docker tag $CI_REGISTRY_IMAGE/$COMPONENT:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE/$COMPONENT:latest
        docker push $CI_REGISTRY_IMAGE/$COMPONENT:latest
      fi

build:api:
  <<: *build_template
  variables:
    COMPONENT: api

build:frontend:
  <<: *build_template
  variables:
    COMPONENT: frontend

build:tiles:
  <<: *build_template
  variables:
    COMPONENT: tiles

build:worker:
  <<: *build_template
  variables:
    COMPONENT: worker
# Staging deployment
deploy:staging:
  stage: deploy-staging
  image: bitnami/kubectl:latest
  environment:
    name: staging
    url: https://staging.webgis.example.com
  before_script:
    - echo $KUBE_CONFIG_STAGING | base64 -d > kubeconfig
    - export KUBECONFIG=kubeconfig
  script:
    - sed -i "s|webgis/api:latest|$CI_REGISTRY_IMAGE/api:$CI_COMMIT_SHA|g" k8s/api-deployment.yaml
    - sed -i "s|webgis/frontend:latest|$CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA|g" k8s/frontend-deployment.yaml
    - kubectl apply -f k8s/ -n webgis-staging
    - kubectl rollout status deployment/webgis-api -n webgis-staging --timeout=600s
    - kubectl rollout status deployment/webgis-frontend -n webgis-staging --timeout=600s
  only:
    - develop

# Production deployment
deploy:production:
  stage: deploy-production
  image: bitnami/kubectl:latest
  environment:
    name: production
    url: https://webgis.example.com
  before_script:
    - echo $KUBE_CONFIG_PRODUCTION | base64 -d > kubeconfig
    - export KUBECONFIG=kubeconfig
  script:
    - sed -i "s|webgis/api:latest|$CI_REGISTRY_IMAGE/api:$CI_COMMIT_SHA|g" k8s/api-deployment.yaml
    - sed -i "s|webgis/frontend:latest|$CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA|g" k8s/frontend-deployment.yaml
    - kubectl apply -f k8s/ -n webgis
    - kubectl rollout status deployment/webgis-api -n webgis --timeout=600s
    - kubectl rollout status deployment/webgis-frontend -n webgis --timeout=600s
  when: manual
  only:
    - main
16.6. Infrastructure as Code
16.6.1. Terraform Configuration
# infrastructure/main.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.23"
    }
  }

  backend "s3" {
    bucket = "webgis-terraform-state"
    key    = "infrastructure/terraform.tfstate"
    region = "us-west-2"
  }
}

provider "aws" {
  region = var.aws_region
}

# VPC and Networking
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "${var.project_name}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["${var.aws_region}a", "${var.aws_region}b", "${var.aws_region}c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway   = true
  enable_vpn_gateway   = false
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Project     = var.project_name
    Environment = var.environment
  }
}

# EKS Cluster
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name    = "${var.project_name}-cluster"
  cluster_version = "1.28"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
    aws-ebs-csi-driver = {
      most_recent = true
    }
  }

  eks_managed_node_groups = {
    general = {
      desired_size = 3
      max_size     = 10
      min_size     = 3

      instance_types = ["t3.large"]
      capacity_type  = "ON_DEMAND"

      k8s_labels = {
        Environment = var.environment
        NodeGroup   = "general"
      }
    }

    spatial = {
      desired_size = 2
      max_size     = 8
      min_size     = 2

      instance_types = ["m5.xlarge"]
      capacity_type  = "SPOT"

      k8s_labels = {
        Environment = var.environment
        NodeGroup   = "spatial"
        Workload    = "spatial-processing"
      }

      taints = [
        {
          key    = "spatial-workload"
          value  = "true"
          effect = "NO_SCHEDULE"
        }
      ]
    }
  }

  tags = {
    Project     = var.project_name
    Environment = var.environment
  }
}

# RDS PostgreSQL with PostGIS
resource "aws_db_subnet_group" "main" {
  name       = "${var.project_name}-db-subnet-group"
  subnet_ids = module.vpc.private_subnets

  tags = {
    Name = "${var.project_name} DB subnet group"
  }
}

resource "aws_security_group" "rds" {
  name_prefix = "${var.project_name}-rds"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [module.eks.node_security_group_id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-rds-sg"
  }
}

resource "aws_db_instance" "main" {
  identifier     = "${var.project_name}-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.r5.xlarge"

  allocated_storage     = 100
  max_allocated_storage = 1000
  storage_type          = "gp3"
  storage_encrypted     = true

  db_name  = "webgis"
  username = "webgis"
  password = var.db_password

  vpc_security_group_ids = [aws_security_group.rds.id]
  db_subnet_group_name   = aws_db_subnet_group.main.name

  backup_retention_period = 7
  backup_window           = "03:00-04:00"
  maintenance_window      = "sun:04:00-sun:05:00"

  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.project_name}-db-final-snapshot"

  performance_insights_enabled = true
  monitoring_interval          = 60
  monitoring_role_arn          = aws_iam_role.rds_monitoring.arn

  tags = {
    Name        = "${var.project_name}-database"
    Environment = var.environment
  }
}

# ElastiCache Redis
resource "aws_elasticache_subnet_group" "main" {
  name       = "${var.project_name}-cache-subnet"
  subnet_ids = module.vpc.private_subnets
}

resource "aws_security_group" "redis" {
  name_prefix = "${var.project_name}-redis"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [module.eks.node_security_group_id]
  }

  tags = {
    Name = "${var.project_name}-redis-sg"
  }
}

resource "aws_elasticache_replication_group" "main" {
  replication_group_id = "${var.project_name}-redis"
  description          = "Redis cluster for ${var.project_name}"

  port                 = 6379
  parameter_group_name = "default.redis7"
  node_type            = "cache.r5.large"
  num_cache_clusters   = 3

  subnet_group_name  = aws_elasticache_subnet_group.main.name
  security_group_ids = [aws_security_group.redis.id]

  at_rest_encryption_enabled = true
  transit_encryption_enabled = true

  tags = {
    Name        = "${var.project_name}-redis"
    Environment = var.environment
  }
}

# S3 Buckets
resource "aws_s3_bucket" "assets" {
  bucket = "${var.project_name}-assets-${random_id.bucket_suffix.hex}"

  tags = {
    Name        = "${var.project_name} Assets"
    Environment = var.environment
  }
}

resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "assets" {
  bucket = aws_s3_bucket.assets.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket" "backups" {
  bucket = "${var.project_name}-backups-${random_id.bucket_suffix.hex}"

  tags = {
    Name        = "${var.project_name} Backups"
    Environment = var.environment
}
}
resource "aws_s3_bucket_lifecycle_configuration" "backups" {
bucket = aws_s3_bucket.backups.id
rule {
id = "backup_lifecycle"
status = "Enabled"
expiration {
days = 90
}
noncurrent_version_expiration {
noncurrent_days = 30
}
}
}
# CloudFront Distribution
resource "aws_cloudfront_distribution" "main" {
origin {
domain_name = aws_lb.main.dns_name
origin_id = "ALB-${var.project_name}"
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["TLSv1.2"]
}
}
enabled = true
is_ipv6_enabled = true
default_root_object = "index.html"
default_cache_behavior {
allowed_methods = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "ALB-${var.project_name}"
compress = true
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = true
cookies {
forward = "none"
}
}
min_ttl = 0
default_ttl = 3600
max_ttl = 86400
}
# Cache behavior for API endpoints
ordered_cache_behavior {
path_pattern = "/api/*"
allowed_methods = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "ALB-${var.project_name}"
compress = true
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = true
headers = ["Authorization", "CloudFront-Forwarded-Proto"]
cookies {
forward = "all"
}
}
min_ttl = 0
default_ttl = 0
max_ttl = 0
}
# Cache behavior for tiles
ordered_cache_behavior {
path_pattern = "/tiles/*"
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "ALB-${var.project_name}"
compress = true
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
min_ttl = 86400
default_ttl = 86400
max_ttl = 31536000
}
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
acm_certificate_arn = aws_acm_certificate.main.arn
ssl_support_method = "sni-only"
minimum_protocol_version = "TLSv1.2_2021"
}
tags = {
Name = "${var.project_name}-cloudfront"
Environment = var.environment
}
}
# Application Load Balancer
resource "aws_lb" "main" {
name = "${var.project_name}-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = module.vpc.public_subnets
enable_deletion_protection = false
tags = {
Name = "${var.project_name}-alb"
Environment = var.environment
}
}
resource "aws_security_group" "alb" {
name_prefix = "${var.project_name}-alb"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.project_name}-alb-sg"
}
}
# IAM Roles
resource "aws_iam_role" "rds_monitoring" {
name = "${var.project_name}-rds-monitoring-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "monitoring.rds.amazonaws.com"
}
}
]
})
}
resource "aws_iam_role_policy_attachment" "rds_monitoring" {
role = aws_iam_role.rds_monitoring.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonRDSEnhancedMonitoringRole"
}
# Random ID for unique resource names
resource "random_id" "bucket_suffix" {
byte_length = 4
}
# Variables
variable "project_name" {
description = "Name of the project"
type = string
default = "webgis"
}
variable "environment" {
description = "Environment name"
type = string
default = "production"
}
variable "aws_region" {
description = "AWS region"
type = string
default = "us-west-2"
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
# Outputs
output "cluster_endpoint" {
description = "EKS cluster endpoint"
value = module.eks.cluster_endpoint
}
output "cluster_name" {
description = "EKS cluster name"
value = module.eks.cluster_name
}
output "database_endpoint" {
description = "RDS instance endpoint"
value = aws_db_instance.main.endpoint
}
output "redis_endpoint" {
description = "Redis cluster endpoint"
value = aws_elasticache_replication_group.main.primary_endpoint_address
}
output "cloudfront_domain" {
description = "CloudFront distribution domain"
value = aws_cloudfront_distribution.main.domain_name
}
16.7. Monitoring and Observability#
16.7.1. Prometheus and Grafana Setup#
# monitoring/prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    rule_files:
      - "/etc/prometheus/rules/*.yml"

    alerting:
      alertmanagers:
        - static_configs:
            - targets:
                - alertmanager:9093

    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'webgis-api'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - webgis
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_name]
            action: keep
            regex: webgis-api-service
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            action: keep
            regex: metrics

      - job_name: 'webgis-frontend'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - webgis
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_name]
            action: keep
            regex: webgis-frontend-service

      - job_name: 'postgres-exporter'
        static_configs:
          - targets: ['postgres-exporter:9187']

      - job_name: 'redis-exporter'
        static_configs:
          - targets: ['redis-exporter:9121']

      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - source_labels: [__address__]
            regex: '(.*):10250'
            target_label: __address__
            replacement: '${1}:9100'

  alert_rules.yml: |
    groups:
      - name: webgis.rules
        rules:
          - alert: HighErrorRate
            expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "High error rate detected"
              description: "Error rate is {{ $value }} errors per second"

          - alert: HighLatency
            expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 0.5
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "High latency detected"
              description: "95th percentile latency is {{ $value }} seconds"

          - alert: DatabaseConnectionsHigh
            # Aggregate both sides so the per-database numbackends series
            # can be compared against the single max_connections setting.
            expr: sum(pg_stat_database_numbackends) / max(pg_settings_max_connections) > 0.8
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Database connections high"
              description: "Connections are at {{ $value | humanizePercentage }} of the configured maximum"

          - alert: RedisMemoryHigh
            expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "Redis memory usage high"
              description: "Redis memory usage is at {{ $value | humanizePercentage }} of its limit"

          - alert: PodCrashLooping
            expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Pod is crash looping"
              description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is crash looping"
# monitoring/grafana-dashboards.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
  namespace: monitoring
data:
  webgis-overview.json: |
    {
      "dashboard": {
        "id": null,
        "title": "WebGIS Application Overview",
        "tags": ["webgis"],
        "style": "dark",
        "timezone": "browser",
        "panels": [
          {
            "id": 1,
            "title": "Request Rate",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(http_requests_total{job='webgis-api'}[5m])",
                "legendFormat": "{{method}} {{path}}"
              }
            ],
            "yAxes": [
              {
                "label": "Requests/sec"
              }
            ]
          },
          {
            "id": 2,
            "title": "Response Time",
            "type": "graph",
            "targets": [
              {
                "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job='webgis-api'}[5m]))",
                "legendFormat": "95th percentile"
              },
              {
                "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket{job='webgis-api'}[5m]))",
                "legendFormat": "50th percentile"
              }
            ]
          },
          {
            "id": 3,
            "title": "Error Rate",
            "type": "singlestat",
            "targets": [
              {
                "expr": "rate(http_requests_total{job='webgis-api',status=~'5..'}[5m]) / rate(http_requests_total{job='webgis-api'}[5m])",
                "legendFormat": "Error Rate"
              }
            ],
            "valueName": "current",
            "format": "percentunit"
          },
          {
            "id": 4,
            "title": "Database Connections",
            "type": "graph",
            "targets": [
              {
                "expr": "pg_stat_database_numbackends",
                "legendFormat": "Active Connections"
              },
              {
                "expr": "pg_settings_max_connections",
                "legendFormat": "Max Connections"
              }
            ]
          },
          {
            "id": 5,
            "title": "Redis Memory Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "redis_memory_used_bytes",
                "legendFormat": "Used Memory"
              },
              {
                "expr": "redis_memory_max_bytes",
                "legendFormat": "Max Memory"
              }
            ]
          },
          {
            "id": 6,
            "title": "Pod CPU Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(container_cpu_usage_seconds_total{namespace='webgis',container!='POD'}[5m])",
                "legendFormat": "{{pod}} - {{container}}"
              }
            ]
          },
          {
            "id": 7,
            "title": "Pod Memory Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "container_memory_usage_bytes{namespace='webgis',container!='POD'}",
                "legendFormat": "{{pod}} - {{container}}"
              }
            ]
          },
          {
            "id": 8,
            "title": "Spatial Query Performance",
            "type": "graph",
            "targets": [
              {
                "expr": "histogram_quantile(0.95, rate(spatial_query_duration_seconds_bucket[5m]))",
                "legendFormat": "95th percentile"
              }
            ]
          }
        ],
        "time": {
          "from": "now-1h",
          "to": "now"
        },
        "refresh": "30s"
      }
    }
16.7.2. Application Monitoring Implementation#
// src/monitoring/metrics.ts
import { register, Counter, Histogram, Gauge } from 'prom-client';
import { Request, Response, NextFunction } from 'express';
// HTTP Metrics
export const httpRequestsTotal = new Counter({
name: 'http_requests_total',
help: 'Total number of HTTP requests',
labelNames: ['method', 'path', 'status'],
});
export const httpRequestDuration = new Histogram({
name: 'http_request_duration_seconds',
help: 'Duration of HTTP requests in seconds',
labelNames: ['method', 'path', 'status'],
buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
});
// Database Metrics
export const databaseQueryDuration = new Histogram({
name: 'database_query_duration_seconds',
help: 'Duration of database queries in seconds',
labelNames: ['query_type', 'table'],
buckets: [0.01, 0.05, 0.1, 0.3, 0.5, 1, 3, 5],
});
export const databaseConnectionsActive = new Gauge({
name: 'database_connections_active',
help: 'Number of active database connections',
});
// Spatial Metrics
export const spatialQueryDuration = new Histogram({
name: 'spatial_query_duration_seconds',
help: 'Duration of spatial queries in seconds',
labelNames: ['operation', 'geometry_type'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
});
export const tilesGenerated = new Counter({
name: 'tiles_generated_total',
help: 'Total number of tiles generated',
labelNames: ['zoom_level', 'layer'],
});
export const tileCacheHits = new Counter({
name: 'tile_cache_hits_total',
help: 'Total number of tile cache hits',
labelNames: ['zoom_level', 'cache_type'],
});
// Application Metrics
export const activeUsers = new Gauge({
name: 'active_users',
help: 'Number of active users',
});
export const featureOperations = new Counter({
name: 'feature_operations_total',
help: 'Total number of feature operations',
labelNames: ['operation', 'feature_type'],
});
export const mapViewsTotal = new Counter({
name: 'map_views_total',
help: 'Total number of map views',
labelNames: ['zoom_level', 'region'],
});
// Middleware for HTTP metrics
export const metricsMiddleware = (req: Request, res: Response, next: NextFunction) => {
const start = Date.now();
res.on('finish', () => {
const duration = (Date.now() - start) / 1000;
const path = req.route?.path || req.path;
const method = req.method;
const status = res.statusCode.toString();
httpRequestsTotal.inc({ method, path, status });
httpRequestDuration.observe({ method, path, status }, duration);
});
next();
};
// Database monitoring
export class DatabaseMonitor {
private db: any;
constructor(database: any) {
this.db = database;
this.startConnectionMonitoring();
}
async monitorQuery<T>(
queryType: string,
table: string,
queryFunction: () => Promise<T>
): Promise<T> {
const start = Date.now();
try {
  return await queryFunction();
} finally {
  // Record the duration whether the query succeeded or failed
  const duration = (Date.now() - start) / 1000;
  databaseQueryDuration.observe({ query_type: queryType, table }, duration);
}
}
private startConnectionMonitoring(): void {
setInterval(async () => {
try {
const result = await this.db.query(
'SELECT count(*) FROM pg_stat_activity WHERE state = $1',
['active']
);
databaseConnectionsActive.set(parseInt(result.rows[0].count));
} catch (error) {
console.error('Failed to monitor database connections:', error);
}
}, 30000); // Every 30 seconds
}
}
// Spatial operations monitoring
export class SpatialMonitor {
static async monitorSpatialQuery<T>(
  operation: string,
  geometryType: string,
  queryFunction: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  try {
    return await queryFunction();
  } finally {
    // Record the duration whether the query succeeded or failed
    const duration = (Date.now() - start) / 1000;
    spatialQueryDuration.observe({ operation, geometry_type: geometryType }, duration);
  }
}
static recordTileGeneration(zoomLevel: number, layer: string): void {
tilesGenerated.inc({ zoom_level: zoomLevel.toString(), layer });
}
static recordTileCacheHit(zoomLevel: number, cacheType: string): void {
tileCacheHits.inc({ zoom_level: zoomLevel.toString(), cache_type: cacheType });
}
}
// User activity monitoring
export class UserActivityMonitor {
private activeUsersSessions = new Set<string>();
trackUserSession(sessionId: string): void {
this.activeUsersSessions.add(sessionId);
activeUsers.set(this.activeUsersSessions.size);
}
removeUserSession(sessionId: string): void {
this.activeUsersSessions.delete(sessionId);
activeUsers.set(this.activeUsersSessions.size);
}
recordFeatureOperation(operation: string, featureType: string): void {
featureOperations.inc({ operation, feature_type: featureType });
}
recordMapView(zoomLevel: number, region: string): void {
mapViewsTotal.inc({ zoom_level: zoomLevel.toString(), region });
}
// Cleanup inactive sessions
startSessionCleanup(): void {
setInterval(() => {
// Implementation would check for inactive sessions and remove them
// This is a simplified version
activeUsers.set(this.activeUsersSessions.size);
}, 60000); // Every minute
}
}
// Health check endpoint
export const healthCheck = async (req: Request, res: Response) => {
const health = {
status: 'healthy',
timestamp: new Date().toISOString(),
services: {
database: 'unknown',
redis: 'unknown',
storage: 'unknown'
}
};
try {
// Check database
await req.app.locals.db.query('SELECT 1');
health.services.database = 'healthy';
} catch {
health.services.database = 'unhealthy';
health.status = 'unhealthy';
}
try {
// Check Redis
await req.app.locals.redis.ping();
health.services.redis = 'healthy';
} catch {
health.services.redis = 'unhealthy';
health.status = 'unhealthy';
}
// Note: health.services.storage is initialized above but no storage
// (e.g. S3) connectivity check is implemented here, so it stays 'unknown'.
const statusCode = health.status === 'healthy' ? 200 : 503;
res.status(statusCode).json(health);
};
// Metrics endpoint (register.metrics() returns a Promise in prom-client v13+,
// so the handler must await it)
export const metricsEndpoint = async (req: Request, res: Response) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
};
// Initialize default Node.js runtime metrics (heap usage, event loop lag, GC).
// (Calling register.clear() here would erase every custom metric defined above
// and must be avoided; import is hoisted like the others at the top of the file.)
import { collectDefaultMetrics } from 'prom-client';
collectDefaultMetrics();
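One caveat in `metricsMiddleware` above: when no Express route matched, the raw `req.path` becomes a metric label, so every distinct tile URL (`/tiles/12/654/1583.png`, `/tiles/12/654/1584.png`, …) creates its own time series and label cardinality explodes. A minimal sketch of a path normaliser that collapses numeric segments — the helper name and the `:n` token are illustrative, not part of the code above:

```typescript
// Collapse purely numeric path segments (with an optional file extension)
// into a fixed token so tile URLs share one metric label.
export function normalizeMetricPath(path: string): string {
  return path
    .split('/')
    .map(segment => (/^\d+(\.\w+)?$/.test(segment) ? ':n' : segment))
    .join('/');
}
```

Inside the middleware, `req.route?.path || normalizeMetricPath(req.path)` would then bound the number of series Prometheus has to store.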
16.8. Summary#
Modern deployment and DevOps practices for Web GIS applications require sophisticated orchestration of multiple specialized services, careful attention to geographic distribution, and robust monitoring of both traditional application metrics and geospatial-specific performance indicators.
Containerization with Docker provides the foundation for consistent deployments across environments while addressing the complex dependencies of geospatial services. Kubernetes orchestration enables scalable, resilient deployments with automated scaling based on both traditional metrics and spatial workload characteristics.
CI/CD pipelines automate the testing, building, and deployment process while incorporating geospatial-specific validation including spatial accuracy testing, performance benchmarking, and cross-browser visual validation for map rendering.
Infrastructure as Code with tools like Terraform ensures reproducible environments while properly configuring cloud services optimized for geospatial workloads including spatial databases, caching layers, and global content delivery networks.
Comprehensive monitoring and observability provide visibility into application performance, spatial query efficiency, and user experience metrics specific to mapping applications. This includes tracking tile generation performance, spatial query duration, and geographic distribution of usage patterns.
These deployment and operational practices enable Web GIS applications to scale reliably while maintaining the performance and availability requirements essential for interactive mapping experiences.
16.9. Exercises#
16.9.1. Exercise 16.1: Container Orchestration Setup#
Objective: Build a complete containerized deployment for a Web GIS application stack.
Instructions:
Multi-service containerization:
Create optimized Dockerfiles for API, frontend, database, and tile server
Implement health checks and proper signal handling
Configure security scanning and vulnerability management
Optimize image sizes and build times
Docker Compose orchestration:
Set up complete development environment with Docker Compose
Configure service dependencies and networking
Implement persistent storage and backup strategies
Add monitoring and logging services
Production optimization:
Create production-optimized container configurations
Implement multi-stage builds and security hardening
Configure resource limits and performance tuning
Add secrets management and environment configuration
Deliverable: Complete containerized deployment with development and production configurations.
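As a starting point for the multi-stage build step, a minimal production Dockerfile for the API might look like the sketch below; the base image tag, build paths, and health-check port are assumptions, not prescriptions:

```dockerfile
# Build stage: compile the TypeScript API
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production dependencies and compiled output only
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
# Run unprivileged and let the container runtime probe /health
USER node
HEALTHCHECK --interval=30s --timeout=5s \
  CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]
```

Separating build and runtime stages keeps compilers and dev dependencies out of the final image, which shrinks both image size and attack surface.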
16.9.2. Exercise 16.2: Kubernetes Deployment and Scaling#
Objective: Deploy and configure a Web GIS application on Kubernetes with auto-scaling.
Instructions:
Kubernetes manifests:
Create complete Kubernetes deployment manifests
Configure StatefulSets for databases and persistent services
Implement ConfigMaps and Secrets for configuration management
Set up Ingress controllers and load balancing
Auto-scaling configuration:
Implement Horizontal Pod Autoscaler with custom metrics
Configure cluster auto-scaling for node management
Set up resource quotas and limits
Add pod disruption budgets for high availability
Service mesh integration:
Implement service mesh for inter-service communication
Add traffic management and security policies
Configure observability and distributed tracing
Implement canary deployments and traffic splitting
Deliverable: Production-ready Kubernetes deployment with comprehensive scaling and management capabilities.
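For the auto-scaling step, a minimal HorizontalPodAutoscaler can serve as the starting point before custom metrics are added; the deployment name, namespace, and thresholds below are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webgis-api-hpa
  namespace: webgis
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webgis-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Custom metrics (for example, tile requests per second exposed through a metrics adapter) would be added as additional entries under `metrics`.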
16.9.3. Exercise 16.3: CI/CD Pipeline Implementation#
Objective: Build comprehensive CI/CD pipelines for automated testing and deployment.
Instructions:
Automated testing pipeline:
Implement multi-stage testing including unit, integration, and E2E tests
Add security scanning and vulnerability assessment
Configure performance testing and benchmarking
Implement visual regression testing for map rendering
Build and deployment automation:
Create automated Docker image building and scanning
Implement artifact management and versioning
Configure automated deployment to staging and production
Add rollback capabilities and deployment validation
Quality gates and approvals:
Implement quality gates based on test coverage and performance
Configure manual approval workflows for production deployments
Add automated notifications and status reporting
Implement deployment monitoring and validation
Deliverable: Complete CI/CD pipeline with automated testing, building, and deployment capabilities.
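A skeleton of such a pipeline in GitHub Actions is sketched below; job names, Node version, and the GHCR image path are placeholders to adapt:

```yaml
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test

  build-and-push:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: docker build -t ghcr.io/${{ github.repository }}/webgis-api:${{ github.sha }} .
      - run: docker push ghcr.io/${{ github.repository }}/webgis-api:${{ github.sha }}
```

The security-scanning, E2E, and deployment stages from the instructions would slot in as additional jobs gated by `needs`.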
16.9.4. Exercise 16.4: Infrastructure as Code#
Objective: Implement complete infrastructure automation using Terraform.
Instructions:
Cloud infrastructure provisioning:
Create Terraform modules for VPC, networking, and security groups
Provision managed database services optimized for spatial workloads
Set up container orchestration platforms and managed Kubernetes
Configure content delivery networks and edge locations
Application infrastructure:
Provision monitoring and logging infrastructure
Set up backup and disaster recovery systems
Configure auto-scaling groups and load balancers
Implement security scanning and compliance monitoring
Environment management:
Create reusable modules for different environments
Implement state management and remote backends
Add cost optimization and resource tagging
Configure drift detection and automated remediation
Deliverable: Complete Infrastructure as Code implementation with multi-environment support.
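For the state-management step, the standard pattern is an S3 remote backend with DynamoDB locking; bucket and table names below are placeholders:

```terraform
terraform {
  backend "s3" {
    bucket         = "webgis-terraform-state"   # placeholder bucket name
    key            = "production/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "webgis-terraform-locks"   # enables state locking
    encrypt        = true
  }
}
```

Keeping one `key` per environment (and one workspace or directory per environment) prevents a staging apply from clobbering production state.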
16.9.5. Exercise 16.5: Monitoring and Observability#
Objective: Implement comprehensive monitoring for Web GIS applications.
Instructions:
Application metrics:
Implement custom metrics for spatial operations and performance
Add user experience monitoring for map interactions
Configure database and cache performance monitoring
Create business metrics for feature usage and adoption
Infrastructure monitoring:
Set up resource utilization monitoring across the stack
Implement network performance and latency monitoring
Add security monitoring and threat detection
Configure capacity planning and scaling alerts
Observability platform:
Build comprehensive dashboards for different stakeholders
Implement alerting with proper escalation and notification
Add distributed tracing for complex request flows
Create automated reporting and analytics
Deliverable: Complete monitoring and observability platform with comprehensive coverage.
16.9.6. Exercise 16.6: High Availability and Disaster Recovery#
Objective: Implement high availability and disaster recovery for production systems.
Instructions:
High availability design:
Implement multi-region deployment for global availability
Configure database replication and failover mechanisms
Set up load balancing and traffic distribution
Add circuit breakers and graceful degradation
Disaster recovery planning:
Create automated backup and restore procedures
Implement cross-region data replication
Configure disaster recovery testing and validation
Add recovery time and recovery point objectives
Business continuity:
Implement service health checks and automatic failover
Configure maintenance mode and planned downtime procedures
Add incident response and communication plans
Create capacity planning for emergency scaling
Deliverable: Complete high availability and disaster recovery implementation with documented procedures.
16.9.7. Exercise 16.7: Performance Optimization and Scaling#
Objective: Optimize application performance and implement advanced scaling strategies.
Instructions:
Performance optimization:
Implement caching strategies at multiple layers
Optimize database queries and spatial operations
Configure CDN and edge computing for global performance
Add application-level performance monitoring and optimization
Scaling strategies:
Implement predictive scaling based on usage patterns
Configure geographic load distribution and routing
Add queue-based processing for heavy spatial operations
Implement database sharding and partitioning strategies
Cost optimization:
Configure spot instances and reserved capacity
Implement auto-scaling based on cost and performance metrics
Add resource right-sizing and optimization recommendations
Create cost monitoring and budget alerting
Deliverable: Highly optimized and efficiently scaling Web GIS application with comprehensive cost management.
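For the multi-layer caching step, an in-process LRU cache for rendered tiles can serve as the innermost layer in front of Redis and the CDN. The sketch below exploits `Map`'s insertion order for recency tracking; the class name, key scheme (`layer/z/x/y`), and capacity are illustrative:

```typescript
// Minimal LRU cache for rendered tiles using Map's insertion order.
export class TileCache {
  private entries = new Map<string, Uint8Array>();

  constructor(private capacity = 1000) {}

  get(key: string): Uint8Array | undefined {
    const tile = this.entries.get(key);
    if (tile !== undefined) {
      // Re-insert to mark the entry as most recently used
      this.entries.delete(key);
      this.entries.set(key, tile);
    }
    return tile;
  }

  set(key: string, tile: Uint8Array): void {
    this.entries.delete(key);
    this.entries.set(key, tile);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry (first in insertion order)
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A tile request would check this cache first, then Redis, then regenerate, recording hits via `SpatialMonitor.recordTileCacheHit` at each layer.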
Reflection Questions:
How do deployment requirements for Web GIS applications differ from traditional web applications?
What are the key considerations for scaling geospatial services globally?
How can monitoring be tailored to provide insights specific to mapping applications?
What are the critical components that need special attention in disaster recovery planning?