Introduction

You probably don't need Kubernetes. If your app runs on one server and handles your traffic fine, stop reading. Come back when it doesn't.

Still here? Good. Your Docker Compose setup works until a server dies at 2 AM and you spend the weekend recovering containers by hand. K8s prevents that specific weekend. It runs your containers across multiple machines, keeps them healthy, routes traffic, and replaces anything that crashes. Originally Google's, now maintained by the CNCF. Overkill for a side project. Not overkill when you are running multiple services that need to stay up without someone babysitting them.

Core Concepts: Pods, Nodes and Clusters

A cluster is the set of machines running under a Kubernetes control plane. The control plane schedules workloads, monitors health, and converges actual state toward declared state. A node is one machine in that cluster -- control plane nodes handle management, worker nodes run your containers. On EKS, GKE, or AKS, the cloud provider manages the control plane. You manage workers. A pod is the smallest deployable unit: one or more containers sharing a network namespace and storage. Most pods hold one container. Multi-container pods are for sidecars.

pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-nginx
  labels:
    app: nginx
    environment: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.27-alpine
    ports:
    - containerPort: 80

Labels are key-value pairs that wire everything together. Services find pods by labels. Deployments manage pods by labels. Ignore labels and nothing connects to anything.

Copy-paste these. Seriously.

bash
# Create the pod from your YAML file
kubectl apply -f pod.yaml
# Check that the pod is running
kubectl get pods
# See detailed information about the pod
kubectl describe pod my-nginx

You will almost never create standalone pods in production. They are disposable. The kubelet will restart a crashed container in place, but if the node dies, the pod dies with it -- nothing reschedules a standalone pod onto another machine.

Deployments and ReplicaSets

The resource you actually interact with day to day. A Deployment tells Kubernetes: run this many copies of this container, keep them healthy, and roll out updates without downtime. Under the hood it manages a ReplicaSet, which manages pods. You never touch ReplicaSets directly.

deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-api
  labels:
    app: node-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: node-api
  template:
    metadata:
      labels:
        app: node-api
    spec:
      containers:
      - name: api
        image: myregistry/node-api:2.1.0
        ports:
        - containerPort: 3000
        readinessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 15
          periodSeconds: 20

replicas: 3 means three instances running at all times. Crash? Replacement starts immediately. But this is also K8s hiding your bugs. A memory leak that OOMKills a container every few hours? Kubernetes restarts it so fast that users never notice. The underlying bug goes undiagnosed for months. Be aware of that tradeoff.

selector.matchLabels tells the Deployment which pods it owns. Labels between selector and template must match or Kubernetes rejects the manifest.

Readiness and liveness probes. Non-negotiable in production. Readiness gates when a pod can receive traffic -- without it, rolling updates send requests to containers that haven't finished booting. Liveness tells Kubernetes when a pod is stuck and should be killed. Without probes, K8s has no way to tell if your app is healthy or silently broken.

Rolling Updates

Change the image tag. Apply the manifest. Kubernetes gradually swaps old pods for new. Zero downtime. maxUnavailable controls how many pods can be down at once. maxSurge controls how many extra pods can spin up. Set maxUnavailable: 1 on three replicas and at least two always serve traffic.
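Both knobs live under spec.strategy in the Deployment. A minimal sketch matching the description above (values are illustrative):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # at most one pod down during the rollout
      maxSurge: 1        # at most one extra pod above the replica count
```

With three replicas, that guarantees at least two pods serving traffic at every point in the rollout.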

Deploy goes wrong? Roll back:

bash
# Check the rollout history
kubectl rollout history deployment/node-api
# Roll back to the previous version
kubectl rollout undo deployment/node-api
# Roll back to a specific revision
kubectl rollout undo deployment/node-api --to-revision=2
# Watch the rollout in real time
kubectl rollout status deployment/node-api

Built in. No additional tooling. This alone justifies the complexity over manual Docker Compose deploys.

Services and Networking

Pods get IP addresses but those addresses are disposable. Pod restarts? New IP. Hard-code a pod IP and things break immediately.

A Service gives you a stable virtual IP and DNS name that routes traffic to pods matched by label selectors. Three types: ClusterIP is internal-only -- service-to-service communication, like your API talking to your database. That is what ClusterIP actually means: reachable inside the cluster, invisible outside it. NodePort exposes the app on a static port on every node -- fine for development, not production. LoadBalancer provisions a real cloud load balancer. That is how you expose services to the internet.

service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-api-service
spec:
  type: ClusterIP
  selector:
    app: node-api
  ports:
  - protocol: TCP
    port: 80
    targetPort: 3000

selector matches pod labels from the Deployment. port is what the Service listens on. targetPort is the container port. Other pods reach the API at http://node-api-service:80 via Kubernetes DNS.

No IP addresses to track. Reference services by name. One of the cleaner parts of the whole system, honestly.
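For example, a hypothetical frontend container could point at the API purely by service name (API_URL is an assumed variable name, not from this article's manifests):

```yaml
containers:
- name: frontend
  image: myregistry/frontend:1.0.0
  env:
  - name: API_URL  # hypothetical env var the app reads at startup
    value: "http://node-api-service:80"
```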

ConfigMaps and Secrets

Do not bake config into images. Dev, staging, production -- different database URLs, different feature flags, different API keys. ConfigMaps hold non-sensitive data. Secrets hold sensitive data (poorly, but we will get to that).

ConfigMaps

Key-value stores injected into pods as env vars or mounted as files. The manifest below defines both a ConfigMap and a Secret:

configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
  CACHE_TTL: "3600"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
data:
  # Values must be base64-encoded
  DATABASE_URL: cG9zdGdyZXM6Ly91c2VyOnBhc3NAZGItaG9zdDo1NDMyL215ZGI=
  API_KEY: c3VwZXItc2VjcmV0LWFwaS1rZXktMTIzNDU=

Important: Kubernetes Secrets are base64-encoded, not encrypted. Anyone with cluster access can decode them. For real encryption at rest, enable EncryptionConfiguration on the API server or use something like HashiCorp Vault, AWS Secrets Manager, or the Sealed Secrets controller.
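You can see this for yourself with base64 alone, no cluster required. A quick sketch (the value is a sample key, not a real credential):

```shell
# "Protecting" a Secret value is just base64 encoding -- trivially reversible
encoded=$(echo -n 'super-secret-api-key-12345' | base64)
echo "$encoded"
# Anyone who can read the Secret object can decode it
echo "$encoded" | base64 --decode
```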

Inject via environment variables or volume mounts. Env vars for simple key-value pairs. Volumes for config files. Change a ConfigMap without rebuilding the image -- mounted volumes update automatically. Env vars need a pod restart, but kubectl rollout restart deployment/node-api takes seconds.
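Wiring both into a pod template might look like this, a sketch against the api-config and api-secrets manifests above:

```yaml
containers:
- name: api
  image: myregistry/node-api:2.1.0
  envFrom:
  - configMapRef:
      name: api-config   # every key becomes an env var
  - secretRef:
      name: api-secrets
  volumeMounts:
  - name: config-volume
    mountPath: /etc/config  # keys appear as files in this directory
volumes:
- name: config-volume
  configMap:
    name: api-config
```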

Ingress and External Access

Ten microservices. Ten LoadBalancer services. Ten cloud load balancers, each with its own bill and public IP. Terrible.

Ingress gives you one entry point. Routing by hostname or URL path. An Ingress Controller (NGINX, Traefik, AWS ALB) implements the actual routing rules.

ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.myapp.com
    - app.myapp.com
    secretName: myapp-tls
  rules:
  - host: api.myapp.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: node-api-service
            port:
              number: 80
  - host: app.myapp.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80

One Ingress. Two services. Traffic splits by hostname. api.myapp.com hits the API. app.myapp.com hits the frontend.

The tls section handles HTTPS. Pair with cert-manager and you get free Let's Encrypt certificates that auto-renew. Path-based routing within a single host works too -- /api/* to one service, /docs/* to another. Most teams overthink ingress configuration when the YAML above covers 80% of real use cases.
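Path-based routing is the same shape, just multiple paths under one host. A sketch (docs-service is a hypothetical second service, not defined in this article):

```yaml
rules:
- host: api.myapp.com
  http:
    paths:
    - path: /api
      pathType: Prefix
      backend:
        service:
          name: node-api-service
          port:
            number: 80
    - path: /docs
      pathType: Prefix
      backend:
        service:
          name: docs-service  # hypothetical
          port:
            number: 80
```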

Scaling and Resource Management

Autoscaling only works if you tell Kubernetes how much CPU and memory your containers need. Skip resource definitions and the scheduler is guessing.

Resource Requests and Limits

Request = minimum guaranteed. Limit = maximum allowed. Exceed memory limit? Container killed. Exceed CPU limit? Throttled.

Nobody gets these right on the first try. Requests too high, you waste cluster capacity. Limits too low, containers die under normal load. We once set 256Mi on a service that needed 300Mi at peak -- the OOMKills were intermittent enough that nobody connected the dots for days. Deploy generous first. Watch real usage. Then tighten. Starting too tight is always worse than starting too loose.
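Requests and limits sit on each container in the pod template. A deliberately generous starting point, per the advice above (numbers are illustrative, not a recommendation):

```yaml
containers:
- name: api
  image: myregistry/node-api:2.1.0
  resources:
    requests:
      cpu: "250m"      # guaranteed minimum: a quarter of a core
      memory: "256Mi"
    limits:
      cpu: "500m"      # throttled beyond this
      memory: "512Mi"  # OOMKilled beyond this
```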

Horizontal Pod Autoscaler (HPA)

Watches metrics. Adjusts replica count.

hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: node-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60

Between 2 and 10 replicas. CPU above 70% or memory above 80% triggers scale-out. The behavior section prevents flapping -- the autoscaler bouncing wildly between up and down. Stabilization window waits 5 minutes. Scale-down capped at 25% per minute.

Prerequisites: Metrics Server installed (default on most managed K8s) and resource requests defined on your containers. The HPA calculates utilization relative to those requests. No requests? The HPA has nothing to work with.

Debugging Running Applications

This is the section that actually matters for your day-to-day. Containers crash. Pods refuse to start. Services go dark. Knowing how to figure out why is the difference between a 5-minute fix and a 2-hour panic.

kubectl logs

First stop. Always.

bash
# View logs for a specific pod
kubectl logs node-api-7d4b8c6f9-x2k4m
# Stream logs in real time (like tail -f)
kubectl logs -f node-api-7d4b8c6f9-x2k4m
# View logs from a previous crashed container
kubectl logs node-api-7d4b8c6f9-x2k4m --previous
# View logs from all pods matching a label
kubectl logs -l app=node-api --all-containers=true
# Show only the last 100 lines
kubectl logs node-api-7d4b8c6f9-x2k4m --tail=100
# Show logs from the last 30 minutes
kubectl logs node-api-7d4b8c6f9-x2k4m --since=30m

--previous is the one that saves you. Container in CrashLoopBackOff? That flag shows logs from the last run before the crash. The actual error message. Usually that is all you need.

kubectl describe

When logs aren't enough. Shows events, conditions, configuration for any resource. A pod that won't start at all won't have logs. But kubectl describe will show you exactly why in the Events section at the bottom. "Image not found." "Insufficient CPU." "Failed to mount volume." The answer is almost always there.

kubectl exec

Get inside a running container. Check files, test connectivity, run diagnostics.

kubectl exec -it pod-name -- /bin/sh for a shell. kubectl exec pod-name -- cat /etc/config/app.json for a single command. Add -c container-name for multi-container pods.

kubectl port-forward

Access a pod or service from your machine without exposing it. kubectl port-forward service/node-api-service 8080:80, then http://localhost:8080. Better than creating throwaway NodePort services for every debugging session.

Common Debugging Workflow

When something breaks:

  1. kubectl get pods -- look for CrashLoopBackOff, ImagePullBackOff, or Pending.
  2. kubectl describe pod <pod-name> on anything unhealthy. Check the Events section.
  3. Pod started but misbehaving? kubectl logs <pod-name>.
  4. Pod crashed? kubectl logs <pod-name> --previous.
  5. Need to investigate further? kubectl exec -it <pod-name> -- /bin/sh.
  6. Networking issue? kubectl port-forward to test connectivity, and check that Service selectors match pod labels.

Image pull errors. Config mistakes. Resource limits. Selector mismatches. That is 90% of Kubernetes problems, and this workflow covers all of them.

Conclusion

Start with a managed cluster, learn kubectl logs and kubectl describe, and resist the urge to write Helm charts until your third project.

Anurag Sinha

Full Stack Developer & Technical Writer

Anurag is a full stack developer and technical writer. He covers web technologies, backend systems, and developer tools for the Codertronix community.