Tech · 7 min read

Docker and Containers Explained: How They Actually Work

Containers transformed software deployment, but most engineers don't understand how they work under the hood. Namespaces, cgroups, layered filesystems, and what 'container' really means.

By Jarviix Engineering · Apr 19, 2026


Containers transformed software deployment over the past decade. They're now standard infrastructure for everything from local development to global production at FAANG scale. But most engineers using Docker daily don't understand how containers actually work — what makes a container different from a process, why image layers matter, what's really happening under the hood.

This post explains containers from first principles: the Linux primitives that make isolation possible, how Docker images and runtime work, and the operational considerations that determine whether containerized systems run smoothly.

What a container actually is

A container is a regular Linux process that the kernel has been told to treat differently — specifically, with restricted views of the filesystem, network, processes, and resources.

Three main Linux features make this possible:

Namespaces (isolation)

Linux namespaces give a process its own isolated view of system resources. The major namespaces:

  • PID namespace: process sees its own PID 1; can't see processes outside the namespace
  • Network namespace: process has its own network interfaces, IPs, routing table
  • Mount namespace: process has its own filesystem mount points
  • UTS namespace: separate hostname
  • User namespace: maps user IDs differently
  • IPC namespace: separate inter-process communication

A container is essentially a process running in its own set of namespaces. From inside, it appears to have its own machine.
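You can see this directly on any Linux box: the kernel exposes each process's namespace memberships as symlinks under `/proc/<pid>/ns`. A minimal sketch (Linux-only; the helper name is ours):

```python
import os

def list_namespaces(pid="self"):
    """Return the namespace memberships of a process, read from /proc.

    Each entry under /proc/<pid>/ns is a symlink like 'pid:[4026531836]';
    the number identifies the namespace, so two processes whose links
    match share that namespace.
    """
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:12} {ident}")
```

Comparing this output for a shell on the host and a shell inside a container shows different identifiers for `pid`, `mnt`, `net`, and friends — that difference *is* the isolation.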

cgroups (resource control)

Control groups limit and account for resources used by a process tree:

  • CPU shares and limits
  • Memory limits and OOM behavior
  • Block I/O bandwidth
  • Network bandwidth (with traffic control)

Without cgroups, a runaway container could consume all host resources.
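Like namespaces, cgroup membership is visible through `/proc`. A small Linux-only sketch (the helper name is ours) that shows which cgroup the current process belongs to:

```python
def current_cgroup():
    """Return this process's cgroup membership(s) from /proc/self/cgroup.

    On cgroup v2 the file has a single line like '0::/user.slice/...';
    on v1 there is one line per controller hierarchy, e.g. '4:memory:/...'.
    Each line splits into (hierarchy id, controllers, path).
    """
    with open("/proc/self/cgroup") as f:
        return [line.strip().split(":", 2) for line in f if line.strip()]

if __name__ == "__main__":
    for hierarchy, controllers, path in current_cgroup():
        print(hierarchy, controllers or "(v2)", path)
```

When you run `docker run --memory=512m --cpus=1.0 …`, Docker creates a cgroup with those limits and places the container's processes in it; this file is how you'd confirm that from inside.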

Union/overlay filesystems (image layers)

OverlayFS (modern default) layers multiple filesystems together. The container sees a unified view; underlying layers are read-only and can be shared between containers.

This is why pulling a Docker image is fast — common base layers are already cached locally.
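The lookup rule — writable layer first, then each read-only layer below, first match wins — can be sketched with a dictionary analogy. This is not real OverlayFS, just its read/write semantics, with invented file contents:

```python
from collections import ChainMap

# Each "layer" maps path -> contents; lower layers are read-only and shared.
base_layer = {"/etc/os-release": "debian", "/usr/bin/python": "<binary>"}
app_layer = {"/app/main.py": "print('hi')"}

# The container's private writable layer sits on top; writes land only here.
writable = {}

unified = ChainMap(writable, app_layer, base_layer)  # first match wins
unified["/etc/hostname"] = "container-1"  # ChainMap writes go to `writable`

assert unified["/etc/os-release"] == "debian"   # read through to the base
assert "/etc/hostname" not in base_layer        # lower layers stay untouched
```

Real OverlayFS adds copy-up (modifying a lower file copies it into the writable layer first) and whiteouts for deletions, but the shape is the same: shared read-only layers under a private writable one.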

Container vs VM

| Aspect | Container | VM |
| --- | --- | --- |
| Isolation level | Process-level (kernel shared) | Hardware-level (separate kernel) |
| Startup time | Milliseconds | Seconds to minutes |
| Memory overhead | MB | GB |
| Disk overhead | MB-GB (shared layers) | GB |
| Density per host | 100s-1000s | 10s-100s |
| Boots guest OS | No (uses host kernel) | Yes |
| Security boundary | Weaker | Stronger |

Containers are much lighter and faster. VMs provide stronger isolation. Modern security-focused approaches (Firecracker, Kata Containers) try to combine container performance with VM isolation.

Docker architecture

Docker has several components:

Docker Engine

The main daemon that manages containers and images.

Container runtime (containerd, runc)

The lower-level components that actually create and run containers. Docker historically invoked runc directly; modern Docker delegates to containerd, which in turn uses runc to set up the namespaces and cgroups.

Docker CLI

What you type at the terminal — sends commands to Docker Engine.

Docker Hub / Registry

Repository for images. Public (Docker Hub) or private (ECR, GCR, ACR, Harbor).

Docker images

An image is a read-only template containing:

  • Filesystem snapshot in layers
  • Image metadata (entrypoint, env vars, ports)
  • Configuration

Images are built from Dockerfiles:

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["python", "app.py"]

Each filesystem-modifying instruction (RUN, COPY, ADD) creates a new layer; metadata instructions (ENV, EXPOSE, CMD) only update the image config. Layers are content-addressed (hashed) and shared across images.
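"Content-addressed" means a layer's identity is derived from a hash of its bytes, so identical layers hash identically and are stored once no matter how many images reference them. A simplified sketch (real Docker hashes a tar archive of the layer plus metadata; the function name is ours):

```python
import hashlib

def layer_digest(content: bytes) -> str:
    """Identify a layer by the SHA-256 of its contents (simplified)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()

# Two images built from the same base bytes share one stored layer...
base = b"<python:3.11-slim filesystem bytes>"
assert layer_digest(base) == layer_digest(base)

# ...while any change produces a different digest, hence a new layer.
assert layer_digest(base) != layer_digest(base + b"<app code>")
```

This is also why `docker pull` can say "Already exists" per layer: the client compares digests before downloading anything.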

Layer caching matters

Order Dockerfile instructions from least to most frequently changing:

FROM python:3.11-slim
WORKDIR /app

COPY requirements.txt .       # Cache layer if unchanged
RUN pip install -r requirements.txt   # Cache invalidated only when requirements change

COPY . .                       # Code copies last; changes don't invalidate dependency install
CMD ["python", "app.py"]

This pattern saves significant build time during development.

Multi-stage builds

Reduce final image size by separating build and runtime stages:

FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN go build -o app

FROM alpine:3.18
COPY --from=builder /src/app /app
CMD ["/app"]

Build stage has Go compiler (large); final image has only the compiled binary (small).

Container networking

By default, containers run in their own network namespace with a virtual interface. Docker creates a bridge network connecting containers.

Modes:

  • Bridge (default): containers communicate via virtual bridge; port mapping for external access
  • Host: container shares host network namespace (no isolation)
  • None: no networking
  • Custom networks: user-defined; containers in same network can communicate by name

In production (Kubernetes), networking is much more sophisticated — CNI plugins, service meshes, network policies.

Container storage

By default, a container's filesystem is ephemeral: anything written to the writable layer is lost when the container is removed. Persistent storage requires:

Volumes

Managed by Docker; persist outside container lifecycle. Best for important data.

Bind mounts

Map host directory into container. Useful for development (live code reload).

tmpfs mounts

In-memory only. For sensitive temp data.

For production, persistent volumes typically come from external storage (EBS, NFS, distributed filesystems).
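In Docker Compose, the first two options look like this — a minimal sketch, with service, volume, and path names invented for illustration:

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data        # named volume: survives container replacement
      - ./init:/docker-entrypoint-initdb.d:ro  # bind mount: host dir, read-only, for dev

volumes:
  pgdata:   # managed by Docker; persists outside the container lifecycle
```

The named volume keeps the database intact across `docker compose up --force-recreate`; the bind mount exists only for convenience on the developer's machine.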

Common Dockerfile best practices

Use specific base image tags

  • Bad: FROM python:latest (changes over time, unpredictable)
  • Good: FROM python:3.11.5-slim (specific, reproducible)

Run as non-root

RUN useradd -m appuser
USER appuser

Reduces blast radius of container escape.

Use .dockerignore

Exclude files from build context (node_modules, .git, etc). Speeds up builds; reduces image size.
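A typical `.dockerignore` might look like this (entries are project-dependent; these are common examples):

```
.git
node_modules
__pycache__
*.pyc
.env
docker-compose.yml
```

Excluding `.env` here also guards against accidentally `COPY`ing local secrets into an image layer.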

One process per container

Containers should run a single primary process. Use orchestration (Kubernetes, Docker Compose) for multi-process workloads.

Health checks

HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost:8000/health || exit 1

Lets orchestrators detect unhealthy containers and replace them.

Minimize image size

  • Multi-stage builds
  • Slim or distroless base images
  • Combine RUN commands to reduce layer count
  • Clean apt/yum caches in same layer as install
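The last two points combine naturally. Because a layer, once written, is immutable, an `rm` in a *later* `RUN` does not shrink the earlier layer — the cleanup must happen in the same one. A sketch (package names are examples):

```dockerfile
# Install and clean the apt cache in the SAME layer; a separate
# "RUN rm -rf ..." afterwards would leave the cache baked into the
# install layer and only hide it in the unified view.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*
```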

Common mistakes

  • Running as root: container escape becomes more dangerous
  • No resource limits: a runaway container can exhaust host resources
  • Storing data in container filesystem: lost on restart; not what containers are for
  • Treating containers as VMs: SSHing in, modifying running containers, etc. Build new images and replace.
  • Embedding secrets in images: anyone who can pull the image gets the secret
  • Too many layers: deep image chains slow pulls and increase storage
  • Latest tags in production: unpredictable updates; pin specific versions
  • Ignoring image vulnerabilities: scan images regularly (Trivy, Snyk, Docker Scout)

Production considerations

Image registry

Don't depend on Docker Hub for production — rate limits, availability concerns. Use private registry (ECR, GCR, ACR, Harbor).

Image scanning

Scan for known vulnerabilities (CVEs) on every build. Block deploys if critical issues found.

Image signing

Sign images so production only runs verified images. Tools: Cosign, Notary.

Resource limits

Always set CPU and memory limits. Prevents one bad container from killing host.
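With plain Docker this is `docker run --memory=512m --cpus=1.0 …`; in Compose, a minimal sketch (service and image names invented):

```yaml
services:
  api:
    image: myorg/api:1.4.2
    mem_limit: 512m   # hard memory cap; the container is OOM-killed above this
    cpus: 1.0         # at most one CPU's worth of time
```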

Logging

Containers should log to stdout/stderr; orchestrator collects logs. Don't log to files inside containers.

Monitoring

Container-aware monitoring (cAdvisor, Datadog, Prometheus + node-exporter) tracks per-container resource usage.

When NOT to containerize

  • Performance-critical workloads with strict latency requirements: container overhead may matter
  • Stateful workloads with complex storage: orchestration is harder than for stateless services
  • Single-machine simple apps: containers add complexity for marginal benefit
  • Workloads requiring kernel modules or root host access: containers don't help

Containers aren't magic — they're a clever combination of Linux kernel features that enable isolation, packaging, and density that wasn't practical before. Understanding the underlying mechanics (namespaces, cgroups, union filesystems) helps debug containerized systems, optimize performance, and avoid common security pitfalls. They're now table stakes for modern infrastructure; learning them deeply is one of the best investments a backend engineer can make.

Frequently asked questions

Are containers virtual machines?

No. VMs run a full guest OS with its own kernel; containers share the host kernel and just isolate processes via Linux namespaces and cgroups. This makes containers much lighter (start in milliseconds vs seconds, use MB of RAM vs GB) but they're less isolated than VMs. A bad container can affect host resources more than a bad VM. Modern hybrid approaches (Firecracker, Kata Containers) try to combine container UX with VM-level isolation.

What's the difference between Docker and Kubernetes?

Different layers. Docker is a tool to build, distribute, and run containers on a single machine. Kubernetes is an orchestration platform that runs containers across many machines, handles scheduling, networking, scaling, healing. Docker doesn't replace Kubernetes; Kubernetes typically uses container runtimes (formerly Docker Engine, now containerd) underneath. Today, you often build with Docker but deploy via Kubernetes.

Should I use Alpine or Debian-based images?

Default to Debian-slim (e.g., python:3.11-slim) unless you have specific size constraints. Alpine is much smaller (5MB vs 80MB base) but uses musl libc instead of glibc, which causes subtle compatibility issues with some libraries (especially Python's compiled extensions). Debug time saved with familiar tooling usually outweighs the disk space saved. Use Alpine when image size truly matters (edge deployments, very large fleets) and you've tested your stack works on it.
