Docker and Containers Explained: How They Actually Work
Containers transformed software deployment, but most engineers don't understand how they work under the hood. Namespaces, cgroups, layered filesystems, and what 'container' really means.
By Jarviix Engineering · Apr 19, 2026
Containers transformed software deployment over the past decade. They're now standard infrastructure for everything from local development to global production at FAANG scale. But most engineers using Docker daily don't understand how containers actually work — what makes a container different from a process, why image layers matter, what's really happening under the hood.
This post explains containers from first principles: the Linux primitives that make isolation possible, how Docker images and runtime work, and the operational considerations that determine whether containerized systems run smoothly.
What a container actually is
A container is a regular Linux process that the kernel has been told to treat differently — specifically, with restricted views of the filesystem, network, processes, and resources.
Three main Linux features make this possible:
Namespaces (isolation)
Linux namespaces give a process its own isolated view of system resources. The major namespaces:
- PID namespace: process sees its own PID 1; can't see processes outside the namespace
- Network namespace: process has its own network interfaces, IPs, routing table
- Mount namespace: process has its own filesystem mount points
- UTS namespace: separate hostname
- User namespace: maps user IDs between container and host (root inside the container can map to an unprivileged user outside)
- IPC namespace: separate inter-process communication
A container is essentially a process running in its own set of namespaces. From inside, it appears to have its own machine.
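On a Linux host you can see namespaces directly: `/proc/<pid>/ns` contains one symlink per namespace, and two processes share a namespace exactly when the linked inodes match. A minimal sketch (the `/proc` walk is Linux-only; the parsing helper is portable):

```python
import os
import re


def ns_inode(link_target: str) -> int:
    """Extract the inode from a /proc/<pid>/ns symlink target,
    e.g. 'pid:[4026531836]' -> 4026531836. Two processes share a
    namespace iff these inodes are equal."""
    m = re.fullmatch(r"\w+:\[(\d+)\]", link_target)
    if m is None:
        raise ValueError(f"unexpected ns link format: {link_target!r}")
    return int(m.group(1))


def current_namespaces() -> dict:
    """Map namespace name -> inode for the current process (Linux only)."""
    ns_dir = "/proc/self/ns"
    return {name: ns_inode(os.readlink(os.path.join(ns_dir, name)))
            for name in os.listdir(ns_dir)}
```

Run this on the host and again inside a container and the inodes differ for every namespace the container was given its own copy of.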
cgroups (resource control)
Control groups limit and account for resources used by a process tree:
- CPU shares and limits
- Memory limits and OOM behavior
- Block I/O bandwidth
- Network bandwidth (with traffic control)
Without cgroups, a runaway container could consume all host resources.
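On a cgroup-v2 host, a container's memory ceiling is visible as a file such as `/sys/fs/cgroup/memory.max`, containing either a byte count or the literal string `max` (unlimited). A small sketch of reading it, assuming the cgroup-v2 layout (inside a container you would read your own cgroup's file):

```python
def parse_memory_limit(raw: str):
    """Parse a cgroup-v2 memory.max value: 'max' means unlimited
    (returns None), otherwise the limit in bytes."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


def container_memory_limit(path="/sys/fs/cgroup/memory.max"):
    """Read this process's memory limit (Linux, cgroup v2 only)."""
    with open(path) as f:
        return parse_memory_limit(f.read())
```

A process that knows its own limit can size caches and worker pools accordingly instead of being OOM-killed.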
Union/overlay filesystems (image layers)
OverlayFS (the modern default) stacks multiple filesystems into one view. The container sees a unified filesystem; the underlying image layers are read-only and shared between containers, while each container's writes go to its own thin writable layer on top (copy-on-write).
This is why pulling a Docker image is fast — common base layers are already cached locally.
Container vs VM
| Aspect | Container | VM |
|---|---|---|
| Isolation level | Process-level (kernel shared) | Hardware-level (separate kernel) |
| Startup time | Milliseconds | Seconds to minutes |
| Memory overhead | MB | GB |
| Disk overhead | MB-GB (shared layers) | GB |
| Density per host | 100s-1000s | 10s-100s |
| Boot guest OS | No (uses host kernel) | Yes |
| Security boundary | Weaker | Stronger |
Containers are much lighter and faster. VMs provide stronger isolation. Modern security-focused approaches (Firecracker, Kata Containers) try to combine container performance with VM isolation.
Docker architecture
Docker has several components:
Docker Engine
The main daemon (dockerd) that exposes the Docker API and manages containers, images, networks, and volumes.
Container runtime (containerd, runc)
The lower-level components that actually create and run containers. Docker historically invoked runc directly; modern Docker delegates to containerd, which in turn uses runc to create each container.
Docker CLI
What you type at the terminal — sends commands to Docker Engine.
Docker Hub / Registry
Repository for images. Public (Docker Hub) or private (ECR, GCR, ACR, Harbor).
Docker images
An image is a read-only template containing:
- Filesystem snapshot, split into layers
- Metadata and configuration (entrypoint, default command, environment variables, exposed ports)
Images are built from Dockerfiles:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "app.py"]
Each instruction creates a new layer. Layers are content-addressed (hashed) and shared across images.
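Content addressing is just hashing: a layer's identifier is a digest of its bytes, so two images built from the same base produce byte-identical layers with identical IDs, and the runtime stores them once. A simplified sketch (real registry digests are SHA-256 over the compressed layer tarball):

```python
import hashlib


def layer_digest(layer_bytes: bytes) -> str:
    """Content-address a blob: identical bytes always hash to the same
    digest, which is what lets layers be de-duplicated and cached."""
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()
```

Because the digest depends only on content, any image that produces the same bytes reuses the already-stored layer, both on disk and during pulls.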
Layer caching matters
Order Dockerfile instructions from least to most frequently changing:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt . # Cache layer if unchanged
RUN pip install -r requirements.txt # Cache invalidated only when requirements change
COPY . . # Code copies last; changes don't invalidate dependency install
CMD ["python", "app.py"]
This pattern saves significant build time during development.
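The cache model behind this ordering can be sketched as a hash chain: each layer's cache key combines the parent key with the step itself, so editing one instruction invalidates that layer and everything after it, but nothing before it. A toy model (real Docker also hashes the files a COPY pulls in, not just the instruction text):

```python
import hashlib


def cache_keys(steps):
    """Toy model of Docker's build cache: each layer's key chains the
    parent key with the step, so a change to one step invalidates that
    layer and all subsequent layers, but none of the earlier ones."""
    key, keys = "", []
    for step in steps:
        key = hashlib.sha256((key + "\n" + step).encode()).hexdigest()
        keys.append(key)
    return keys
```

With dependencies installed before the final COPY, a code-only change leaves the expensive pip-install layer's key, and thus its cache entry, intact.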
Multi-stage builds
Reduce final image size by separating build and runtime stages:
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN go build -o app
FROM alpine:3.18
COPY --from=builder /src/app /app
CMD ["/app"]
Build stage has Go compiler (large); final image has only the compiled binary (small).
Container networking
By default, containers run in their own network namespace with a virtual interface. Docker creates a bridge network connecting containers.
Modes:
- Bridge (default): containers communicate via virtual bridge; port mapping for external access
- Host: container shares host network namespace (no isolation)
- None: no networking
- Custom networks: user-defined; containers in same network can communicate by name
In production (Kubernetes), networking is much more sophisticated — CNI plugins, service meshes, network policies.
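One practical consequence of the network namespace: a server inside a container that binds 127.0.0.1 is reachable only inside that container's own namespace, so published ports (`-p 8000:8000`) cannot forward traffic to it. Bind 0.0.0.0 instead. A minimal sketch:

```python
import socket


def make_listener(host: str = "0.0.0.0", port: int = 0) -> socket.socket:
    """Open a TCP listener. Inside a container, host must be 0.0.0.0
    (all interfaces) for Docker's port mapping to reach it; 127.0.0.1
    is loopback within the container's network namespace only."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen()
    return s
```

"App works locally but the published port times out" is very often this bug: the framework's default bind address is loopback.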
Container storage
By default, a container's filesystem is ephemeral: writes go to the thin writable layer, which is deleted when the container is removed (data survives a restart, but not removal). Persistent storage requires:
Volumes
Managed by Docker; persist outside container lifecycle. Best for important data.
Bind mounts
Map host directory into container. Useful for development (live code reload).
tmpfs mounts
In-memory only. For sensitive temp data.
For production, persistent volumes typically come from external storage (EBS, NFS, distributed filesystems).
Common Dockerfile best practices
Use specific base image tags
- Bad: FROM python:latest (changes over time, unpredictable)
- Good: FROM python:3.11.5-slim (specific, reproducible)
Run as non-root
RUN useradd -m appuser
USER appuser
Reduces blast radius of container escape.
Use .dockerignore
Exclude files from build context (node_modules, .git, etc). Speeds up builds; reduces image size.
One process per container
Containers should run a single primary process. Use orchestration (Kubernetes, Docker Compose) for multi-process workloads.
Health checks
HEALTHCHECK --interval=30s --timeout=3s \
CMD curl -f http://localhost:8000/health || exit 1
Lets orchestrators detect unhealthy containers and replace them.
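The HEALTHCHECK above assumes the app exposes a /health endpoint. A minimal sketch of one using only the standard library (in a real service this would live in your web framework; the port and response body are illustrative):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Tiny handler exposing the /health endpoint a container
    HEALTHCHECK (or orchestrator probe) can poll."""

    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep probe traffic out of the logs


def start_health_server(port: int = 0) -> HTTPServer:
    """Serve /health in a background thread; port 0 picks a free port."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A good health endpoint checks real readiness (database reachable, queues draining), not just that the process is alive.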
Minimize image size
- Multi-stage builds
- Slim or distroless base images
- Combine RUN commands to reduce layer count
- Clean apt/yum caches in same layer as install
Common mistakes
- Running as root: container escape becomes more dangerous
- No resource limits: a runaway container can exhaust host resources
- Storing data in container filesystem: lost on restart; not what containers are for
- Treating containers as VMs: SSHing in, modifying running containers, etc. Build new images and replace.
- Embedding secrets in images: anyone who can pull the image gets the secret
- Too many layers: deep image chains slow pulls and increase storage
- Latest tags in production: unpredictable updates; pin specific versions
- Ignoring image vulnerabilities: scan images regularly (Trivy, Snyk, Docker Scout)
Production considerations
Image registry
Don't depend on Docker Hub for production — rate limits, availability concerns. Use private registry (ECR, GCR, ACR, Harbor).
Image scanning
Scan for known vulnerabilities (CVEs) on every build. Block deploys if critical issues found.
Image signing
Sign images so production only runs verified images. Tools: Cosign, Notary.
Resource limits
Always set CPU and memory limits. Prevents one bad container from killing host.
Logging
Containers should log to stdout/stderr; orchestrator collects logs. Don't log to files inside containers.
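In Python this just means pointing the root logger at stdout rather than a file; a minimal sketch (the format string is illustrative, and many setups emit JSON lines instead):

```python
import logging
import sys


def configure_container_logging(level: int = logging.INFO) -> logging.Logger:
    """Send all logs to stdout so the container runtime (docker logs,
    the orchestrator's log collector) picks them up. No log files."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s %(message)s"))
    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace any file handlers
    root.setLevel(level)
    return root
```

With logs on stdout, rotation, shipping, and retention become the platform's job instead of each container's.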
Monitoring
Container-aware monitoring (cAdvisor, Datadog, Prometheus + node-exporter) tracks per-container resource usage.
When NOT to containerize
- Performance-critical workloads with strict latency requirements: container overhead may matter
- Stateful workloads with complex storage: orchestration is harder than for stateless services
- Single-machine simple apps: containers add complexity for marginal benefit
- Workloads requiring kernel modules or root host access: containers don't help
What to read next
- System design basics — containers fit into broader architecture.
- Microservices vs monolith — containers enable microservices.
- Microservices observability — instrumenting containerized services.
- Load balancers deep dive — routing to container fleets.
Containers aren't magic — they're a clever combination of Linux kernel features that enable isolation, packaging, and density that wasn't practical before. Understanding the underlying mechanics (namespaces, cgroups, union filesystems) helps debug containerized systems, optimize performance, and avoid common security pitfalls. They're now table stakes for modern infrastructure; learning them deeply is one of the best investments a backend engineer can make.
Frequently asked questions
Are containers virtual machines?
No. VMs run a full guest OS with its own kernel; containers share the host kernel and just isolate processes via Linux namespaces and cgroups. This makes containers much lighter (start in milliseconds vs seconds, use MB of RAM vs GB) but they're less isolated than VMs. A bad container can affect host resources more than a bad VM. Modern hybrid approaches (Firecracker, Kata Containers) try to combine container UX with VM-level isolation.
What's the difference between Docker and Kubernetes?
Different layers. Docker is a tool to build, distribute, and run containers on a single machine. Kubernetes is an orchestration platform that runs containers across many machines and handles scheduling, networking, scaling, and self-healing. Docker doesn't replace Kubernetes; Kubernetes drives a container runtime underneath (historically Docker Engine via dockershim, now typically containerd or CRI-O). Today, you often build with Docker but deploy via Kubernetes.
Should I use Alpine or Debian-based images?
Default to Debian-slim (e.g., python:3.11-slim) unless you have specific size constraints. Alpine is much smaller (5MB vs 80MB base) but uses musl libc instead of glibc, which causes subtle compatibility issues with some libraries (especially Python's compiled extensions). Debug time saved with familiar tooling usually outweighs the disk space saved. Use Alpine when image size truly matters (edge deployments, very large fleets) and you've tested your stack works on it.