Introduction

Run docker compose up and your entire stack starts. Run docker compose down and it stops. That is the pitch.

The interesting parts are everything that goes wrong between those two commands. Networking gotchas where containers cannot find each other. Volume permissions that silently corrupt your database. Health checks that look right but do not actually prevent startup races. Most Compose tutorials spend too long on what services are and too little on the things that will actually bite you at 2am.

Docker Compose Fundamentals

Everything lives in a YAML file -- usually compose.yaml (the older docker-compose.yml name still works). Services, networks, volumes.

compose.yaml
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
  api:
    build: ./api
    ports:
      - "3000:3000"
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: myapp
      POSTGRES_PASSWORD: secret
volumes:
  db_data:
networks:
  app_network:

Each service name doubles as a hostname. Your API connects to the database at db:5432. No IP addresses to look up. And this is the part most people miss at first -- the name you pick is not just a label, it is DNS. Name your service database instead of db and every connection string in your app changes. So pick names deliberately.
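To make that concrete, here is what a connection string looks like in practice. The credentials are placeholders; the only part that matters is the hostname, which is the service name from the file above:

```yaml
services:
  api:
    build: ./api
    environment:
      # "db" is the Compose service name, resolved by Docker's built-in DNS.
      # Rename the service and this URL has to change with it.
      DATABASE_URL: postgres://myuser:secret@db:5432/myapp
```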

docker compose up creates networks, pulls or builds images, starts containers. docker compose down tears it down. Add -v to nuke volumes too.

Services and How They Talk to Each Other

Each service is one container. But a service is useless in isolation -- what matters is how services find each other on the network. By default, docker compose up creates one network for your entire project. Every service joins it. Every service can reach every other service by name.

For most projects, that default network is all you need. The problems start when it is not enough.

Using Pre-Built Images vs. Building from Source

compose.yaml - Service Definitions
services:
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
      args:
        NODE_ENV: development
    ports:
      - "5173:5173"
    volumes:
      - ./frontend/src:/app/src
    command: npm run dev
  backend:
    build:
      context: ./backend
      target: development
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgres://user:pass@db:5432/myapp
      REDIS_URL: redis://cache:6379
    depends_on:
      - db
      - cache
  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    restart: unless-stopped
  db:
    image: postgres:16-alpine
    ports:
      - "5432:5432"
    volumes:
      - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: myapp
volumes:
  db_data:

The frontend is standard development stuff: build from a local Dockerfile, pass a build arg, bind-mount the source for live reloading, override the command to run the dev server.

The backend service is more interesting. It targets a specific stage (development) in a multi-stage Dockerfile -- one Dockerfile across dev, test, and production, just select the stage you want. And look at the connection URLs: db:5432 and cache:6379. Service names are hostnames. But here is the gotcha that has bitten me: if you rename a service, every other service referencing it by hostname breaks silently. No error at compose time. The API just fails to connect at runtime.

restart: unless-stopped on the cache is the right default for most services. Docker restarts it after crashes but leaves it alone if you explicitly stop it.
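For reference, Compose supports four restart policies. A quick sketch of when each one fits (the comments are rules of thumb, and the worker/api images are hypothetical):

```yaml
services:
  cache:
    image: redis:7-alpine
    restart: unless-stopped   # restart after crashes, respect a manual docker stop
  worker:
    image: my-worker:latest   # hypothetical image
    restart: on-failure       # restart only on non-zero exit; good for flaky jobs
  migration:
    image: my-migrate:latest  # hypothetical image
    restart: "no"             # the default; one-shot containers should not loop
  api:
    image: my-api:latest      # hypothetical image
    restart: always           # restart even after a manual stop once the daemon restarts
```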

Custom Networks

Your frontend should not be able to reach the database directly. Custom networks control which services can talk to each other. The idea: put the frontend and API on one network, put the API and database on another. The API sits on both, acting as the only bridge. The frontend tries to connect to db directly and gets nothing -- no route exists. That isolation matters more as your stack grows, because with dozens of microservices you do not want every container able to talk to every other container. Limits the blast radius when something breaks.

compose.yaml - Custom Networks
services:
  frontend:
    image: my-frontend:latest
    networks:
      - frontend_net
    ports:
      - "3000:3000"
  api:
    image: my-api:latest
    networks:
      - frontend_net
      - backend_net
    ports:
      - "8000:8000"
  db:
    image: postgres:16-alpine
    networks:
      - backend_net
    volumes:
      - db_data:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    networks:
      - backend_net
networks:
  frontend_net:
    driver: bridge
  backend_net:
    driver: bridge
volumes:
  db_data:

Nothing surprising here.

One thing worth noting: the driver: bridge is the default and you can omit it. But being explicit helps when someone new reads the file. Or when you need to switch to overlay for Swarm later and want to see the diff clearly.

Volumes and Data Persistence

Containers are ephemeral. Your database data needs to survive restarts. Two types of mounts: named volumes (managed by Docker, persist across restarts) and bind mounts (map a host directory into the container, good for live code reloading).

compose.yaml - Volumes Configuration
services:
  app:
    build: .
    volumes:
      # Bind mount for live reloading in development
      - ./src:/app/src
      # Named volume for node_modules (avoids host conflicts)
      - node_modules:/app/node_modules
      # Named volume for uploaded files
      - uploads:/app/uploads
  db:
    image: postgres:16-alpine
    volumes:
      # Named volume for persistent database storage
      - pgdata:/var/lib/postgresql/data
      # Bind mount for init scripts (runs once on first start)
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql
  grafana:
    image: grafana/grafana:latest
    volumes:
      # Named volume with custom driver options
      - grafana_storage:/var/lib/grafana
      # Read-only bind mount for provisioning
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards:ro
volumes:
  pgdata:
    driver: local
  node_modules:
  uploads:
  grafana_storage:

Watch the node_modules trick. If you bind-mount your entire project directory, the host's node_modules overwrites the container's copy. Native modules compiled for macOS end up inside a Linux container. Bad. A named volume for node_modules keeps the container's copy separate. This has bitten me more times than I want to admit.

The :ro flag on the Grafana mount makes it read-only inside the container. Good habit for config files.

And here is the volume permissions thing that nobody warns you about clearly enough: on Linux, the user inside the container and the user on the host are different UIDs. Your Postgres container runs as UID 999. Your bind-mounted directory is owned by UID 1000. Postgres cannot write to it. The fix is either matching UIDs in your Dockerfile or using named volumes (which Docker manages the permissions for). On macOS and Windows this is invisible because Docker Desktop handles the translation. So you develop happily on a Mac, deploy to Linux, and everything breaks. Every time.
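One way to sketch the UID fix in Compose itself. The 1000:1000 here is an assumption about your host user; check your own IDs with `id -u` and `id -g` before copying it:

```yaml
services:
  db:
    image: postgres:16-alpine
    # Run the container process as the host user so the bind-mounted
    # directory is writable. The directory must already be owned by this UID.
    user: "1000:1000"
    volumes:
      - ./pgdata:/var/lib/postgresql/data
```

The simpler alternative, as noted above, is a named volume: Docker sets the ownership itself and the whole problem disappears.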

docker compose down preserves named volumes by default. You need docker compose down -v to actually delete them. Intentional -- stops you from nuking your database during routine restarts.

Health Checks and Dependencies

depends_on without health check conditions is almost useless. It waits for the container to start, not for the service inside to be ready. Your API starts, tries to connect to Postgres, crashes because Postgres is still initializing. The container is running. The database is not.

compose.yaml - Health Checks
services:
  api:
    build: ./api
    ports:
      - "8000:8000"
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_healthy
      migrations:
        condition: service_completed_successfully
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 15s
      timeout: 5s
      retries: 3
      start_period: 10s
  db:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
      start_period: 15s
    volumes:
      - pgdata:/var/lib/postgresql/data
  cache:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
  migrations:
    build: ./api
    command: python manage.py migrate
    depends_on:
      db:
        condition: service_healthy
volumes:
  pgdata:

The config is self-documenting.

The migrations service is a one-shot container -- runs migrations, exits, and the API depends on it with service_completed_successfully. Your API always starts against a current schema. This pattern alone eliminates an entire class of "works on my machine" bugs.

One opinionated take: start_period is the most important timing parameter and the one people skip. Without it, a database that takes 10 seconds to initialize fails health checks during boot, gets marked unhealthy, and your dependent services never start. Set it generously. 15 seconds for Postgres, 30 for Elasticsearch. You lose nothing by being conservative here.
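Applying that advice, a generous health check for a slow-booting service might look like this. The image tag and check command are illustrative; the point is the start_period, which mirrors the 30-second Elasticsearch suggestion above:

```yaml
services:
  search:
    image: elasticsearch:8.14.0
    healthcheck:
      test: ["CMD-SHELL", "curl -fs http://localhost:9200/_cluster/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s  # failures inside this window do not count against retries
```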

Environment Variables and Secrets

Someone on your team will hardcode production credentials into the Compose file. When, not if.

compose.yaml - Environment Management
services:
  api:
    build: ./api
    environment:
      # Inline variable with value
      NODE_ENV: production
      # Variables interpolated from shell or .env file
      DATABASE_URL: postgres://${DB_USER}:${DB_PASS}@db:5432/${DB_NAME}
      JWT_SECRET: ${JWT_SECRET}
      REDIS_URL: redis://cache:6379
    env_file:
      - .env
      - .env.local
  worker:
    build: ./worker
    env_file:
      - .env
      - worker/.env
    environment:
      WORKER_CONCURRENCY: ${WORKER_CONCURRENCY:-4}
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASS}
      POSTGRES_DB: ${DB_NAME}

${VARIABLE:-default} provides a fallback. WORKER_CONCURRENCY defaults to 4 if not set anywhere.
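Compose's interpolation syntax also covers the opposite case: failing fast when a variable is missing instead of silently running with an empty value.

```yaml
services:
  api:
    environment:
      # Fall back to a default when unset or empty
      WORKER_CONCURRENCY: ${WORKER_CONCURRENCY:-4}
      # Abort `docker compose up` with this message when unset or empty
      JWT_SECRET: ${JWT_SECRET:?JWT_SECRET must be set}
```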

.env
# Database configuration
DB_USER=appuser
DB_PASS=supersecretpassword
DB_NAME=myapp_production

# Authentication
JWT_SECRET=your-256-bit-secret-key-here

# Worker settings
WORKER_CONCURRENCY=8

# Logging
LOG_LEVEL=info

Three rules. Never commit .env files with real credentials. Commit an .env.example with placeholder values so new developers know what to set. Use env_file for shared variables, inline environment for service-specific overrides.
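A matching .env.example for the .env above might look like this -- placeholders only, safe to commit:

```
# Database configuration
DB_USER=changeme
DB_PASS=changeme
DB_NAME=myapp

# Authentication
JWT_SECRET=generate-a-long-random-string

# Worker settings (optional, defaults to 4)
WORKER_CONCURRENCY=4
```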

For production secrets, most teams inject through CI/CD at deployment time. Docker secrets exist (native in swarm mode) but the ergonomics are rough enough that few people bother outside of Swarm. HashiCorp Vault if you want to go all the way. But honestly, CI/CD environment variables cover 90% of cases and the remaining 10% probably means you need a real secrets manager anyway.
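That said, if you do want Compose-native secrets without Swarm, file-based secrets work: Compose mounts each one at /run/secrets/<name> inside the container. The paths here are illustrative:

```yaml
services:
  db:
    image: postgres:16-alpine
    environment:
      # The _FILE convention is supported by the official postgres image:
      # it reads the password from this file instead of an env var
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt  # keep this path out of version control
```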

Profiles and Override Files

Not every service needs to run all the time. Profiles group optional services so they only start when you ask.

compose.yaml - Profiles
services:
  # Core services (always start)
  api:
    build: ./api
    ports:
      - "8000:8000"
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data

  # Monitoring (only with --profile monitoring)
  prometheus:
    image: prom/prometheus:latest
    profiles:
      - monitoring
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
  grafana:
    image: grafana/grafana:latest
    profiles:
      - monitoring
    ports:
      - "3001:3000"

  # Testing (only with --profile testing)
  selenium:
    image: selenium/standalone-chrome:latest
    profiles:
      - testing
    ports:
      - "4444:4444"
  mailhog:
    image: mailhog/mailhog:latest
    profiles:
      - testing
      - development
    ports:
      - "8025:8025"
volumes:
  pgdata:

Services without a profiles key always start. Services with profiles only start when you request them:

docker compose --profile monitoring up starts core services plus monitoring. docker compose --profile monitoring --profile testing up starts everything. No profile flags? Only api and db come up.
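If typing --profile gets old, the COMPOSE_PROFILES environment variable does the same thing and can live in your .env file:

```
# .env -- equivalent to `docker compose --profile monitoring --profile testing up`
COMPOSE_PROFILES=monitoring,testing
```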

Override files work differently. Compose reads compose.yaml first, then automatically merges compose.override.yaml if it exists. Base file stays production-like. Override adds bind mounts, debug ports, development commands. Skip the override in production with docker compose -f compose.yaml up.

You can layer multiple files: docker compose -f compose.yaml -f compose.staging.yaml up. Later files override earlier ones.
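A typical compose.override.yaml under this setup might look like the sketch below. The bind-mount path, debug port, and build target are assumptions about your project layout, not requirements:

```yaml
# compose.override.yaml -- merged automatically by `docker compose up`
services:
  api:
    build:
      target: development
    volumes:
      - ./api/src:/app/src  # live reload in dev
    ports:
      - "9229:9229"         # debugger port, dev only
    environment:
      LOG_LEVEL: debug
  db:
    ports:
      - "5432:5432"         # expose Postgres to host tools in dev
```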

Where to Start

Add health checks with condition: service_healthy on every dependency. Not just the database. That single change eliminates most intermittent startup failures. And set up compose.override.yaml from day one -- base file stays production-like, dev conveniences go in the override.

Anurag Sinha

Full Stack Developer & Technical Writer

Anurag is a full stack developer and technical writer. He covers web technologies, backend systems, and developer tools for the Codertronix community.