Kubernetes (often abbreviated as K8s) is an open-source platform for automating the deployment, scaling, and management of containerized applications. Think of it as a system that runs your Docker containers across multiple servers, automatically handling load balancing, restarts, and scaling.
While Docker runs containers on a single machine, Kubernetes manages containers across a cluster of machines. If a server fails, Kubernetes automatically moves your containers to healthy servers. If traffic increases, it can spin up more copies of your app. This makes Kubernetes ideal for production workloads that need high availability.
This guide covers deploying Mizu applications with proper health checks, scaling, and best practices.
Basic Deployment
In Kubernetes, you don’t run containers directly. Instead, you describe what you want (e.g., “run 3 copies of my app”) in YAML files, and Kubernetes makes it happen. The two most fundamental resources are:
- Deployment: Defines your application—what container image to use, how many replicas to run, and how to update them
- Service: Exposes your application to network traffic, load balancing across all replicas
deployment.yaml
A Deployment tells Kubernetes how to run your application. Setting replicas: 3 means Kubernetes will always keep 3 copies running. If one crashes, it automatically starts a replacement.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: ENV
              value: "production"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
```
service.yaml
A Service provides a stable network endpoint for your application. Since pods (the smallest deployable units in Kubernetes) can be created and destroyed dynamically, their IP addresses change. A Service gives you a fixed address that automatically routes to healthy pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
```
Apply to Cluster
```bash
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Check status
kubectl get pods -l app=myapp
kubectl get svc myapp
```
Health Probes
Kubernetes uses probes to monitor your application’s health and make intelligent decisions. Unlike Docker’s simple health checks, Kubernetes has three distinct probe types, each serving a different purpose.
Understanding probes is critical: misconfigured probes can cause your app to restart constantly or receive traffic before it’s ready. Mizu’s built-in health handlers (/livez and /readyz) work perfectly with Kubernetes:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          ports:
            - containerPort: 3000
          # Liveness: Is the container running?
          # Failure triggers container restart
          livenessProbe:
            httpGet:
              path: /livez
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          # Readiness: Can it handle traffic?
          # Failure removes pod from service endpoints
          readinessProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          # Startup: Has the app finished starting?
          # Disables liveness/readiness until success
          startupProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 0
            periodSeconds: 2
            timeoutSeconds: 3
            failureThreshold: 30  # 30 * 2s = 60s max startup
```
Probe Types
| Probe | Purpose | On Failure |
|---|---|---|
| Liveness | Is the container alive? | Restart container |
| Readiness | Can it handle traffic? | Remove from service |
| Startup | Has it finished starting? | Keep checking; restart once failureThreshold is exceeded |
Configuration
Kubernetes provides two resources for managing configuration separately from your container images. This separation is a best practice because it lets you use the same image in different environments (dev, staging, production) with different settings.
ConfigMap
A ConfigMap stores non-sensitive configuration data as key-value pairs. Your application reads these as environment variables or mounted files. ConfigMaps are visible to anyone with cluster access, so never store passwords here:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "300"
  FEATURE_FLAGS: |
    {
      "new_dashboard": true,
      "beta_api": false
    }
```
Secret
A Secret is like a ConfigMap but intended for sensitive data such as passwords, API keys, and certificates. By default Kubernetes only base64-encodes Secret values, which is obfuscation rather than encryption; for real protection, enable encryption at rest or integrate an external secret manager. Secrets are still treated with more care than ConfigMaps and access to them can be restricted with RBAC policies:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgres://user:pass@db:5432/myapp"
  API_KEY: "sk-secret-key"
```
Never commit secrets to git. Use sealed-secrets, external-secrets, or your cloud provider’s secret manager.
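One way to keep secret values out of version control entirely is to create the Secret imperatively. The sketch below reuses the name and keys from the example above:

```bash
# Create the same Secret without writing its values to a YAML file
kubectl create secret generic myapp-secrets \
  --from-literal=DATABASE_URL='postgres://user:pass@db:5432/myapp' \
  --from-literal=API_KEY='sk-secret-key'
```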
Using in Deployment
```yaml
spec:
  containers:
    - name: myapp
      image: myregistry/myapp:v1.0.0
      # Environment from ConfigMap
      envFrom:
        - configMapRef:
            name: myapp-config
      # Individual secret references
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: DATABASE_URL
      # Mount ConfigMap as file
      volumeMounts:
        - name: config
          mountPath: /etc/myapp
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: myapp-config
```
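Keep in mind that environment variables sourced from a ConfigMap are read only at container start. A quick way to verify them, and to pick up changes after editing the ConfigMap (assuming the deployment and variable names above):

```bash
# Verify the variables inside a running pod
kubectl exec deployment/myapp -- env | grep LOG_LEVEL

# Env vars from a ConfigMap are only read at startup,
# so roll the pods after changing the ConfigMap
kubectl rollout restart deployment/myapp
```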
Ingress
So far, your Service is only accessible inside the Kubernetes cluster. To expose your application to the internet, you need an Ingress. Think of Ingress as the front door to your cluster—it receives external traffic and routes it to the appropriate services based on the URL.
Ingress also handles TLS/HTTPS termination, so you can configure SSL certificates in one place rather than in each service. You’ll need an Ingress Controller (like nginx-ingress or Traefik) installed in your cluster for Ingress resources to work.
Basic Ingress
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
```
Ingress with TLS
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
```
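After applying, you can confirm that the Ingress controller has picked up the resource and, if you use cert-manager as above, that the TLS secret has been issued. A rough check, assuming the names from the example:

```bash
# Confirm the controller has assigned an address to the Ingress
kubectl get ingress myapp

# With cert-manager installed, check issued certificates and the TLS secret
kubectl get certificate
kubectl describe secret myapp-tls
```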
Auto-Scaling
One of Kubernetes’ most powerful features is automatic scaling. Instead of manually adding or removing instances based on traffic, Kubernetes can do this automatically based on metrics like CPU usage, memory, or even custom metrics like requests per second.
Horizontal Pod Autoscaler
A Horizontal Pod Autoscaler (HPA) watches your pods and adjusts the replica count based on observed metrics. For example, if CPU usage exceeds 70%, it adds more pods. When usage drops, it scales back down to save resources.
The HPA continuously monitors metrics and makes smooth adjustments—it won’t instantly scale from 2 to 100 pods, but will increase gradually to avoid thrashing:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
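Once the HPA is applied, you can watch its observed metrics and scaling decisions:

```bash
# Current vs. target utilization and the replica count
kubectl get hpa myapp

# Recent scaling events and conditions
kubectl describe hpa myapp
```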
Custom Metrics
Scale based on request rate (requires metrics-server and prometheus-adapter):
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```
Rolling Updates
When you deploy a new version of your application, you don’t want downtime. Rolling updates solve this by gradually replacing old pods with new ones. Kubernetes starts a new pod, waits for it to be healthy, then terminates an old pod. This continues until all pods are running the new version.
If the new version has problems (failing health checks), Kubernetes stops the rollout and you can easily roll back to the previous version. This makes deployments much safer than traditional “stop everything, deploy, start everything” approaches.
Deployment Strategy
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Allow 1 extra pod during the update
      maxUnavailable: 0  # Never drop below the desired replica count
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          # Graceful shutdown: give load balancers time to stop
          # sending traffic before the process receives SIGTERM
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
      # Time allowed for graceful shutdown before the pod is force-killed
      terminationGracePeriodSeconds: 30
```
```bash
# Update image
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.1.0

# Watch rollout
kubectl rollout status deployment/myapp

# Rollback if needed
kubectl rollout undo deployment/myapp
```
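If you need to go back further than the immediately previous version, the rollout history lists all recorded revisions (the revision number below is just an example):

```bash
# List recorded revisions of the Deployment
kubectl rollout history deployment/myapp

# Roll back to a specific revision
kubectl rollout undo deployment/myapp --to-revision=2
```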
Resource Management
In a shared cluster, you need to control how much CPU and memory each application can use. Without limits, one misbehaving app could consume all resources and crash other apps. Kubernetes uses requests and limits to manage resource allocation.
Requests vs Limits
Requests tell Kubernetes the minimum resources your app needs. The scheduler uses this to decide which node to place your pod on—it won’t place a pod on a node that can’t satisfy its requests.
Limits are the maximum resources your app can use. If your app tries to exceed its memory limit, Kubernetes kills it (OOMKilled). If it exceeds CPU limit, it gets throttled (slowed down but not killed).
```yaml
resources:
  # Guaranteed resources, used by the scheduler for placement
  requests:
    memory: "128Mi"
    cpu: "100m"
  # Hard caps: CPU is throttled, memory overuse is OOMKilled
  limits:
    memory: "256Mi"
    cpu: "500m"
```
| Resource | Request | Limit |
|---|---|---|
| CPU | Guaranteed share | Throttled if exceeded |
| Memory | Guaranteed amount | OOMKilled if exceeded |
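To see how actual usage compares with your requests and limits (this needs metrics-server in the cluster):

```bash
# Live CPU and memory usage per pod
kubectl top pods -l app=myapp

# Per-node usage, useful when tuning requests
kubectl top nodes
```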
Resource Quotas
Limit resources per namespace:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: myapp-quota
  namespace: myapp
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"
```
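You can check how much of the quota is currently consumed:

```bash
# Shows used vs. hard limits for each resource in the quota
kubectl describe resourcequota myapp-quota -n myapp
```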
Pod Disruption Budget
Sometimes Kubernetes needs to evict pods for maintenance—node upgrades, cluster scaling, or rebalancing. A Pod Disruption Budget (PDB) tells Kubernetes the minimum number of pods that must stay running during these voluntary disruptions.
For example, if you have 3 replicas and set minAvailable: 2, Kubernetes will only evict one pod at a time, ensuring at least 2 are always running. This prevents accidental outages during maintenance:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
```
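To see how many voluntary disruptions the budget currently allows:

```bash
# ALLOWED DISRUPTIONS shows how many pods may be evicted right now
kubectl get pdb myapp
```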
Namespace Organization
Namespaces are virtual clusters within your Kubernetes cluster. They let you divide resources between different teams, projects, or environments (dev, staging, production). Resources in different namespaces are isolated from each other, and you can apply different resource quotas and access controls to each namespace.
A common pattern is to have separate namespaces for each environment:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-prod
  labels:
    env: production
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-staging
  labels:
    env: staging
```
Deploy to namespace:
```bash
kubectl apply -f deployment.yaml -n myapp-prod
```
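To avoid passing -n on every command, you can change the default namespace of your current kubectl context:

```bash
# Make myapp-prod the default namespace for subsequent kubectl commands
kubectl config set-context --current --namespace=myapp-prod
```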
Complete Production Manifest
k8s/production.yaml:
```yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: myapp
data:
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
  namespace: myapp
type: Opaque
stringData:
  DATABASE_URL: "postgres://user:pass@db:5432/myapp"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: myapp
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
              name: http
          envFrom:
            - configMapRef:
                name: myapp-config
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: DATABASE_URL
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /livez
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          startupProbe:
            httpGet:
              path: /readyz
              port: 3000
            failureThreshold: 30
            periodSeconds: 2
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
      name: http
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: myapp
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
  namespace: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp
  namespace: myapp
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
```
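The Deployment above sets serviceAccountName: myapp, but the manifest does not define that ServiceAccount; if it does not already exist in the namespace, the pods will not start. A minimal definition you could append to the file:

```yaml
# Minimal ServiceAccount matching serviceAccountName in the Deployment above
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: myapp
```

Apply the whole file in one command:

```bash
kubectl apply -f k8s/production.yaml
```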
Useful Commands
```bash
# View pods
kubectl get pods -n myapp -o wide

# View logs
kubectl logs -f deployment/myapp -n myapp

# Execute in pod
kubectl exec -it deployment/myapp -n myapp -- /bin/sh

# Describe pod (troubleshooting)
kubectl describe pod <pod-name> -n myapp

# Port forward for local testing
kubectl port-forward svc/myapp 3000:80 -n myapp

# View events
kubectl get events -n myapp --sort-by='.lastTimestamp'

# Scale manually
kubectl scale deployment/myapp --replicas=5 -n myapp
```
Next Steps