Kubernetes (often abbreviated as K8s) is an open-source platform for automating the deployment, scaling, and management of containerized applications. Think of it as a system that runs your Docker containers across multiple servers, automatically handling load balancing, restarts, and scaling. While Docker runs containers on a single machine, Kubernetes manages containers across a cluster of machines. If a server fails, Kubernetes automatically moves your containers to healthy servers. If traffic increases, it can spin up more copies of your app. This makes Kubernetes ideal for production workloads that need high availability. This guide covers deploying Mizu applications with proper health checks, scaling, and best practices.

Basic Deployment

In Kubernetes, you don’t run containers directly. Instead, you describe what you want (e.g., “run 3 copies of my app”) in YAML files, and Kubernetes makes it happen. The two most fundamental resources are:
  • Deployment: Defines your application—what container image to use, how many replicas to run, and how to update them
  • Service: Exposes your application to network traffic, load balancing across all replicas

deployment.yaml

A Deployment tells Kubernetes how to run your application. Setting replicas: 3 means Kubernetes will always keep three copies running; if one crashes, it automatically starts a replacement.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: ENV
              value: "production"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"

service.yaml

A Service provides a stable network endpoint for your application. Since pods (the smallest deployable units in Kubernetes) can be created and destroyed dynamically, their IP addresses change. A Service gives you a fixed address that automatically routes to healthy pods.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP

Apply to Cluster

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Check status
kubectl get pods -l app=myapp
kubectl get svc myapp
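
To confirm the Service is actually routing traffic, list its endpoints; you should see one IP per ready pod (a quick check, using the resource names from the manifests above):

# Endpoints are populated with the IPs of ready pods matching the selector
kubectl get endpoints myapp

# Watch the rollout until all replicas are available
kubectl rollout status deployment/myapp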

Health Probes

Kubernetes uses probes to monitor your application’s health and make intelligent decisions. Unlike Docker’s simple health checks, Kubernetes has three distinct probe types, each serving a different purpose. Understanding probes is critical: misconfigured probes can cause your app to restart constantly or receive traffic before it’s ready. Mizu’s built-in health handlers (/livez and /readyz) work perfectly with Kubernetes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          ports:
            - containerPort: 3000

          # Liveness: Is the container running?
          # Failure triggers container restart
          livenessProbe:
            httpGet:
              path: /livez
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3

          # Readiness: Can it handle traffic?
          # Failure removes pod from service endpoints
          readinessProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3

          # Startup: Has the app finished starting?
          # Disables liveness/readiness until success
          startupProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 0
            periodSeconds: 2
            timeoutSeconds: 3
            failureThreshold: 30  # 30 * 2s = 60s max startup
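
One way to verify the probes is to port-forward to the Deployment and hit the endpoints directly, then check how Kubernetes reports them (a sketch; it assumes the Deployment and labels defined above):

# In one terminal: forward local port 3000 to the deployment
kubectl port-forward deploy/myapp 3000:3000

# In another terminal: hit the Mizu health endpoints directly
curl -i http://localhost:3000/livez
curl -i http://localhost:3000/readyz

# Probe settings and recent failures show up in the pod details
kubectl describe pod -l app=myapp | grep -E "Liveness|Readiness|Startup"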

Probe Types

Probe     | Purpose                    | On Failure
Liveness  | Is the container alive?    | Restart container
Readiness | Can it handle traffic?     | Remove from service
Startup   | Has it finished starting?  | Keep checking

Configuration

Kubernetes provides two resources for managing configuration separately from your container images. This separation is a best practice because it lets you use the same image in different environments (dev, staging, production) with different settings.

ConfigMap

A ConfigMap stores non-sensitive configuration data as key-value pairs. Your application reads these as environment variables or mounted files. ConfigMaps are visible to anyone with cluster access, so never store passwords here:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "300"
  FEATURE_FLAGS: |
    {
      "new_dashboard": true,
      "beta_api": false
    }
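
If you prefer generating the manifest instead of hand-writing it, kubectl can produce an equivalent ConfigMap (a sketch using the same keys; --dry-run=client prints the YAML without creating anything):

kubectl create configmap myapp-config \
  --from-literal=LOG_LEVEL=info \
  --from-literal=CACHE_TTL=300 \
  --dry-run=client -o yaml > configmap.yaml
# Multi-line values like FEATURE_FLAGS are easier to supply with --from-file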

Secret

A Secret is like a ConfigMap but designed for sensitive data such as passwords, API keys, and certificates. By default Kubernetes only base64-encodes Secret values (encoding, not encryption), though you can enable encryption at rest for etcd or integrate an external secret manager. While Secrets aren’t perfectly secure out of the box, they’re treated with more care than ConfigMaps and access to them can be restricted with RBAC policies:
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgres://user:pass@db:5432/myapp"
  API_KEY: "sk-secret-key"
Never commit secrets to git. Use sealed-secrets, external-secrets, or your cloud provider’s secret manager.
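
A common alternative is to create the Secret imperatively so its values never touch your repository (a sketch; the values shown are placeholders):

kubectl create secret generic myapp-secrets \
  --from-literal=DATABASE_URL='postgres://user:pass@db:5432/myapp' \
  --from-literal=API_KEY='sk-secret-key'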

Using in Deployment

spec:
  containers:
    - name: myapp
      image: myregistry/myapp:v1.0.0

      # Environment from ConfigMap
      envFrom:
        - configMapRef:
            name: myapp-config

      # Individual secret references
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: DATABASE_URL

      # Mount ConfigMap as file
      volumeMounts:
        - name: config
          mountPath: /etc/myapp
          readOnly: true

  volumes:
    - name: config
      configMap:
        name: myapp-config
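
To confirm the configuration actually reaches the container, you can inspect both the environment and the mounted files (a quick check against the names used above):

# ConfigMap keys injected via envFrom become environment variables
kubectl exec deploy/myapp -- printenv LOG_LEVEL CACHE_TTL

# Each ConfigMap key mounted as a volume becomes a file under /etc/myapp
kubectl exec deploy/myapp -- ls /etc/myapp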

Ingress

So far, your Service is only accessible inside the Kubernetes cluster. To expose your application to the internet, you need an Ingress. Think of Ingress as the front door to your cluster—it receives external traffic and routes it to the appropriate services based on the URL. Ingress also handles TLS/HTTPS termination, so you can configure SSL certificates in one place rather than in each service. You’ll need an Ingress Controller (like nginx-ingress or Traefik) installed in your cluster for Ingress resources to work.
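
You can check which ingress controllers are available before creating Ingress resources; if this returns nothing, you need to install a controller first:

kubectl get ingressclass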

Basic Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80

Ingress with TLS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
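
With cert-manager installed, the cluster-issuer annotation causes a Certificate (typically named after the secretName) to be created and renewed automatically. A rough way to verify issuance, assuming cert-manager is installed and DNS already points at your cluster:

# cert-manager creates a Certificate resource for the requested secretName
kubectl get certificate myapp-tls
kubectl describe certificate myapp-tls

# Verify TLS end to end once the certificate is Ready
curl -vI https://myapp.example.com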

Auto-Scaling

One of Kubernetes’ most powerful features is automatic scaling. Instead of manually adding or removing instances based on traffic, Kubernetes can do this automatically based on metrics like CPU usage, memory, or even custom metrics like requests per second.

Horizontal Pod Autoscaler

A Horizontal Pod Autoscaler (HPA) watches your pods and adjusts the replica count based on observed metrics. For example, if CPU usage exceeds 70%, it adds more pods. When usage drops, it scales back down to save resources. The HPA continuously monitors metrics and makes smooth adjustments—it won’t instantly scale from 2 to 100 pods, but will increase gradually to avoid thrashing:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
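
Resource-based HPAs need metrics-server installed in the cluster. You can watch the autoscaler's current readings and replica count as load changes (a quick check against the HPA above):

# TARGETS shows current vs. target utilization; REPLICAS shows the scaled count
kubectl get hpa myapp --watch

# Current CPU/memory per pod (requires metrics-server)
kubectl top pods -l app=myapp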

Custom Metrics

Scale based on request rate. Pod-level custom metrics come from the custom metrics API, typically Prometheus plus prometheus-adapter (metrics-server alone only provides CPU and memory):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
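
Before relying on a custom metric, confirm it is actually exposed through the custom metrics API (a sketch; the metric name must match what your prometheus-adapter is configured to serve):

# List metrics served by the custom metrics API
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"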

Rolling Updates

When you deploy a new version of your application, you don’t want downtime. Rolling updates solve this by gradually replacing old pods with new ones. Kubernetes starts a new pod, waits for it to be healthy, then terminates an old pod. This continues until all pods are running the new version. If the new version has problems (failing health checks), Kubernetes stops the rollout and you can easily roll back to the previous version. This makes deployments much safer than traditional “stop everything, deploy, start everything” approaches.

Deployment Strategy

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Allow 1 extra pod during update
      maxUnavailable: 0  # Never drop below the desired replica count
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0

          # Graceful shutdown: give in-flight requests time to finish
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]

      # Time allowed for graceful shutdown before the pod is force-killed
      terminationGracePeriodSeconds: 30

Perform Rolling Update

# Update image
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.1.0

# Watch rollout
kubectl rollout status deployment/myapp

# Rollback if needed
kubectl rollout undo deployment/myapp
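
Kubernetes keeps a revision history for each Deployment, so you can also inspect past rollouts and return to a specific one (the revision number below is illustrative):

# List recorded revisions
kubectl rollout history deployment/myapp

# Roll back to a specific revision rather than just the previous one
kubectl rollout undo deployment/myapp --to-revision=2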

Resource Management

In a shared cluster, you need to control how much CPU and memory each application can use. Without limits, one misbehaving app could consume all resources and crash other apps. Kubernetes uses requests and limits to manage resource allocation.

Requests vs Limits

Requests tell Kubernetes the minimum resources your app needs. The scheduler uses them to decide which node to place your pod on: it won’t place a pod on a node that can’t satisfy its requests. Limits are the maximum resources your app can use. If a container tries to exceed its memory limit, Kubernetes kills it (OOMKilled); if it exceeds its CPU limit, it is throttled (slowed down, not killed).
resources:
  # Guaranteed resources (for scheduling)
  requests:
    memory: "128Mi"
    cpu: "100m"

  # Maximum resources (for throttling)
  limits:
    memory: "256Mi"
    cpu: "500m"

Resource | Request            | Limit
CPU      | Guaranteed share   | Throttled if exceeded
Memory   | Guaranteed amount  | OOMKilled if exceeded
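
To sanity-check your numbers, compare actual usage against what you requested (requires metrics-server; the node name is a placeholder):

# Live CPU/memory usage per pod
kubectl top pods -l app=myapp

# How much of each node is already reserved by requests/limits
kubectl describe node <node-name> | grep -A 8 "Allocated resources"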

Resource Quotas

Limit resources per namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: myapp-quota
  namespace: myapp
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"
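
Once applied, the quota's current usage versus its hard limits can be inspected directly (a quick check in the myapp namespace):

kubectl describe resourcequota myapp-quota -n myapp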

Pod Disruption Budget

Sometimes Kubernetes needs to evict pods for maintenance—node upgrades, cluster scaling, or rebalancing. A Pod Disruption Budget (PDB) tells Kubernetes the minimum number of pods that must stay running during these voluntary disruptions. For example, if you have 3 replicas and set minAvailable: 2, Kubernetes will only evict one pod at a time, ensuring at least 2 are always running. This prevents accidental outages during maintenance:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
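
The PDB's status shows how many pods may currently be evicted; this is what the eviction API consults during node drains (a quick check):

# ALLOWED DISRUPTIONS drops to 0 when evicting another pod would violate minAvailable
kubectl get pdb myapp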

Namespace Organization

Namespaces are virtual clusters within your Kubernetes cluster. They let you divide resources between different teams, projects, or environments (dev, staging, production). Resources in different namespaces are isolated from each other, and you can apply different resource quotas and access controls to each namespace. A common pattern is to have separate namespaces for each environment:
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-prod
  labels:
    env: production
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-staging
  labels:
    env: staging
Deploy to namespace:
kubectl apply -f deployment.yaml -n myapp-prod
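
To avoid repeating -n on every command, you can switch your current context's default namespace (this only changes your local kubeconfig):

kubectl config set-context --current --namespace=myapp-prod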

Complete Production Manifest

k8s/production.yaml:
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: myapp
data:
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
  namespace: myapp
type: Opaque
stringData:
  DATABASE_URL: "postgres://user:pass@db:5432/myapp"
---
# ServiceAccount referenced by the Deployment below (serviceAccountName: myapp)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: myapp
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: myapp
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
              name: http
          envFrom:
            - configMapRef:
                name: myapp-config
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: DATABASE_URL
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /livez
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          startupProbe:
            httpGet:
              path: /readyz
              port: 3000
            failureThreshold: 30
            periodSeconds: 2
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
      name: http
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
  namespace: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp
  namespace: myapp
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
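
Apply the whole file at once and confirm everything came up (the namespace matches the manifest above):

kubectl apply -f k8s/production.yaml
kubectl get all -n myapp
kubectl get ingress,hpa,pdb -n myapp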

Useful Commands

# View pods
kubectl get pods -n myapp -o wide

# View logs
kubectl logs -f deployment/myapp -n myapp

# Execute in pod
kubectl exec -it deployment/myapp -n myapp -- /bin/sh

# Describe pod (troubleshooting)
kubectl describe pod <pod-name> -n myapp

# Port forward for local testing
kubectl port-forward svc/myapp 3000:80 -n myapp

# View events
kubectl get events -n myapp --sort-by='.lastTimestamp'

# Scale manually
kubectl scale deployment/myapp --replicas=5 -n myapp

Next Steps