Tech · 7 min read
Java Multithreading: A Working Engineer's Guide to Threads, Executors and the JMM
What thread, runnable, executor, future, completable future, virtual thread, volatile, synchronized, and the Java Memory Model actually mean — with the trade-offs that decide which one to reach for in production.
By Jarviix Engineering · Apr 17, 2026
Java's concurrency story has more layers than almost any other language's. There's the original Thread and Runnable from JDK 1.0, the java.util.concurrent library from JDK 5, CompletableFuture from JDK 8, the virtual-threads work that went final in JDK 21 (with structured concurrency landing alongside it as a preview), and a Memory Model that quietly underwrites all of it.
Most of the time, you don't need all of it. This guide walks through the layers in the order you'd actually adopt them, with the trade-offs that decide which one a real codebase reaches for.
The mental model: threads, scheduling, and shared state
A thread is the smallest unit of execution the OS can schedule. The JVM maps each Thread to an OS thread (until Loom; we'll get there). Threads share memory — that's both the entire point and the entire problem.
Three things go wrong when threads share state:
- Race conditions. Two threads read-modify-write the same field; one update gets lost.
- Visibility. Thread A writes a field; thread B never sees the new value because it cached the old one in a register or CPU cache.
- Reordering. The compiler or CPU reorders independent-looking instructions; another thread observes the reordered result and the invariant breaks.
Almost every concurrency bug is one of these three. Almost every concurrency tool is a different way to prevent them.
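The lost-update race is the easiest of the three to see in running code. A minimal sketch, with illustrative names and counts: two threads each increment a plain int and an AtomicInteger 100,000 times; only the atomic counter reliably reaches 200,000.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateDemo {
    static int plain = 0;                                  // unsynchronized shared state
    static final AtomicInteger atomic = new AtomicInteger();

    // Runs the race once and returns {plain total, atomic total}.
    static int[] runOnce() {
        plain = 0;
        atomic.set(0);
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                plain++;                                   // read-modify-write: not atomic
                atomic.incrementAndGet();                  // atomic hardware instruction
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        try {
            a.join(); b.join();                            // join gives us visibility of both counters
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { plain, atomic.get() };
    }

    public static void main(String[] args) {
        int[] totals = runOnce();
        System.out.println("plain  = " + totals[0]);       // typically well under 200000
        System.out.println("atomic = " + totals[1]);       // always 200000
    }
}
```

The plain counter's final value is nondeterministic, which is exactly the point: a lost update is silent.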
Layer 1: Thread and Runnable
The original API. Runnable is "a thing to run"; Thread is "an OS thread to run it on".
Thread t = new Thread(() -> {
    System.out.println("running on " + Thread.currentThread().getName());
});
t.start();
t.join();
You'll see this in tutorials. You'll almost never write it in production code. The reasons are simple:
- Creating and destroying OS threads is expensive.
- You have nowhere to put the result of the work.
- You have nowhere to put exceptions thrown inside the thread.
- You have no easy way to limit how many threads run concurrently.
All of those are solved by the executor framework.
Layer 2: ExecutorService and the thread pool
ExecutorService is a pool of worker threads that you submit work to. The pool reuses threads — you pay the thread-creation cost once, then amortize it.
ExecutorService pool = Executors.newFixedThreadPool(8);
Future<Integer> result = pool.submit(() -> slowComputation());
int value = result.get(); // blocks until done
pool.shutdown();
submit() returns a Future, which lets you wait for the result, cancel the work, or check if it's done. Exceptions thrown inside the task are wrapped in an ExecutionException and re-thrown when you call get() — much harder to lose.
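A sketch of that exception path (the failing task and its message are invented for illustration): the exception thrown on a pool thread surfaces at the call site, unwrappable via getCause().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureErrors {
    // Submits a task that always fails and reports the unwrapped cause.
    static String describeFailure() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<Integer> bad = pool.submit((Callable<Integer>) () -> {
                throw new IllegalStateException("upstream unavailable");
            });
            try {
                bad.get();                                 // rethrows, wrapped in ExecutionException
                return "no failure";
            } catch (ExecutionException e) {
                return e.getCause().getMessage();          // the original exception
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());             // upstream unavailable
    }
}
```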
Sizing the pool is the part most teams get wrong. A useful starting point:
- CPU-bound work: roughly one thread per CPU core (Runtime.getRuntime().availableProcessors()). More threads just thrash caches.
- IO-bound work: many more threads than cores (32, 64, 256+) — most are blocked waiting on the network anyway.
If you find yourself wanting one pool per kind of work (one for IO, one for CPU, one for scheduled jobs), trust that instinct. Mixing them is the path to a thread-starvation outage on a Friday night.
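One pool per kind of work can look like this minimal sketch; the sizes are the starting points from above, not tuned values, and the field names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

public class Pools {
    static final int CORES = Runtime.getRuntime().availableProcessors();

    // CPU-bound work: roughly one thread per core.
    static final ExecutorService cpuPool = Executors.newFixedThreadPool(CORES);

    // IO-bound work: far more threads than cores; most will be parked on the network.
    static final ExecutorService ioPool = Executors.newFixedThreadPool(64);

    // Scheduled jobs get their own small pool so a slow batch job can't starve them.
    static final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
}
```

Keeping the pools separate means a flood of slow IO tasks can't eat the threads your CPU-bound work depends on.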
Layer 3: CompletableFuture and async pipelines
Future lets you wait on a result. CompletableFuture lets you chain results without ever blocking.
CompletableFuture
    .supplyAsync(() -> userRepo.findById(id))
    .thenCompose(user -> CompletableFuture.supplyAsync(() -> billingApi.lookup(user)))
    .thenApply(BillingSummary::netDue)
    .exceptionally(ex -> {
        log.warn("billing lookup failed", ex);
        return BigDecimal.ZERO;
    });
Each step runs on an executor (default: ForkJoinPool.commonPool(), which you should usually override). Failures propagate down the chain like a rejected promise in JS. You only block when you actually need the value (.join()), and even then you usually shouldn't: return the CompletableFuture as the body of an HTTP response and let the framework write it when it completes.
This API is the workhorse of modern Java concurrency. Spend an afternoon learning thenCompose, thenCombine, allOf, and exceptionally properly — they replace 90% of hand-rolled callback code.
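thenCombine is the one the pipeline above doesn't show: it joins two independent futures. A small sketch with made-up lookups standing in for real service calls:

```java
import java.util.concurrent.CompletableFuture;

public class Combine {
    // Two independent async lookups (stand-ins for real service calls).
    static CompletableFuture<String> fetchName() {
        return CompletableFuture.supplyAsync(() -> "ada");
    }

    static CompletableFuture<Integer> fetchScore() {
        return CompletableFuture.supplyAsync(() -> 42);
    }

    static String summary() {
        return fetchName()
                .thenCombine(fetchScore(), (name, score) -> name + ":" + score)
                .join();                                   // block only at the very edge
    }

    public static void main(String[] args) {
        System.out.println(summary());                     // ada:42
    }
}
```

Both lookups run concurrently; the combining function fires once both complete.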
Layer 4: virtual threads (Project Loom)
Released stable in Java 21. A virtual thread looks identical to a regular thread from your code's point of view — same Thread class, same start(), same join() — but it isn't pinned to an OS thread. Thousands or millions of them can run on a small handful of OS carriers.
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 100_000; i++) {
        executor.submit(() -> doRequest());
    }
}
For a typical request-per-thread backend that spends most of its time waiting on databases or HTTP calls, this changes the math entirely. You stop needing carefully-sized pools. You stop needing reactive frameworks just to handle thousands of concurrent connections. You write blocking, linear code, and the runtime makes it cheap.
Caveats that don't go away:
- Virtual threads don't help CPU-bound work. They share carriers; if every task wants the CPU, you still need to sit in a queue.
- Shared mutable state is still shared mutable state. Your locks, atomics, and memory model still apply.
- Some old code that pins threads (e.g. synchronized blocks holding the carrier across blocking IO, or native code that uses thread-locals as identity) can defeat the runtime's ability to park the virtual thread cheaply. The JDK is fixing these one by one; JDK 24 removed the synchronized pinning case.
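On JDKs where synchronized still pins the carrier, the usual workaround is a ReentrantLock, which lets a blocked virtual thread unmount. A sketch, where doBlockingIo is a placeholder for a real socket or database call:

```java
import java.util.concurrent.locks.ReentrantLock;

public class PinFree {
    static final ReentrantLock lock = new ReentrantLock();

    static void guardedIo() {
        lock.lock();                       // a virtual thread waiting here can unmount its carrier
        try {
            doBlockingIo();                // placeholder for a socket or DB call
        } finally {
            lock.unlock();                 // always release, even on exception
        }
    }

    static void doBlockingIo() {
        // stand-in for real blocking IO
    }
}
```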
The Java Memory Model in one paragraph
The JMM is the contract that says "if you do X, the JVM guarantees Y" across threads. Two rules cover most cases:
- A write to a volatile field happens-before any subsequent read of the same field from any thread.
- A successful release of a synchronized lock happens-before any subsequent acquisition of the same lock.
Everything in java.util.concurrent is built on these primitives. AtomicInteger.incrementAndGet() works because the atomic hardware instruction it compiles to carries the required memory-ordering guarantees. ConcurrentHashMap works because every published reference goes through one of these gates. You almost never have to think about the JMM directly — but knowing it's there is what lets you reason about why volatile boolean running is enough for a stop-flag, and why a plain boolean is not.
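The stop-flag case is worth seeing concretely. A minimal sketch: the volatile write in stop() is guaranteed visible to the spinning thread, so the loop exits and join() returns promptly.

```java
public class Worker implements Runnable {
    private volatile boolean running = true;   // a plain boolean would NOT be guaranteed visible

    public void run() {
        while (running) {
            Thread.onSpinWait();               // stand-in for real work
        }
    }

    public void stop() {
        running = false;                       // volatile write: happens-before the next read
    }

    // Starts a worker, stops it, and reports whether it actually terminated.
    static boolean demo() {
        Worker w = new Worker();
        Thread t = new Thread(w);
        t.start();
        w.stop();
        try {
            t.join(1_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("terminated = " + demo());   // true
    }
}
```

Without volatile, the JIT is allowed to hoist the read of running out of the loop, and the worker may spin forever.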
When to reach for what
A short cheat sheet:
| Situation | Reach for |
|---|---|
| One-off background task in a script | new Thread(() -> {...}).start() is fine |
| Bounded pool of similar tasks | ExecutorService (newFixedThreadPool) |
| Pipeline of dependent async steps | CompletableFuture |
| Modern web request handler doing IO | virtual threads (newVirtualThreadPerTaskExecutor) |
| Single shared counter | AtomicInteger / AtomicLong |
| Single shared boolean visibility | volatile boolean |
| Compound update across multiple fields | synchronized block, or a Lock |
| High-throughput shared map | ConcurrentHashMap |
| Task that must finish before another starts | CountDownLatch or structured concurrency |
| Producer/consumer hand-off in-process | BlockingQueue (e.g. LinkedBlockingQueue) |
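The last row of the table in code: a bounded hand-off using a poison pill to signal completion. Capacity and counts here are arbitrary.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class HandOff {
    static int drainSum() {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(16); // bounded: put() blocks when full
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 100; i++) {
                    queue.put(i);                          // blocks if the queue is full
                }
                queue.put(-1);                             // poison pill: "no more items"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            while (true) {
                int item = queue.take();                   // blocks if the queue is empty
                if (item == -1) break;
                sum += item;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;                                        // 1 + 2 + ... + 100 = 5050
    }

    public static void main(String[] args) {
        System.out.println(drainSum());                    // 5050
    }
}
```

The bounded capacity is the point: it gives you backpressure for free, because a fast producer blocks instead of filling the heap.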
What to avoid
A few patterns to actively unlearn:
- Thread.sleep() to "wait for" something. Use a CountDownLatch, a Future, or await on a condition. Sleeping is a guess.
- Catching and swallowing InterruptedException. Either rethrow it, or set the interrupt flag back (Thread.currentThread().interrupt()). Otherwise you make your code uncancellable.
- Holding a lock across an IO call. Anything that can block for milliseconds-to-seconds while a lock is held is a future incident. Acquire late, release fast.
- Using synchronized everywhere "to be safe". It's correct, but it serializes work and blocks virtual-thread carriers in older JDKs. Reach for the smallest tool that gives you the guarantee you need.
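The InterruptedException rule from the list, in code. pauseBriefly is an illustrative helper: it restores the interrupt flag instead of swallowing it, so callers can still observe the cancellation.

```java
public class Interrupts {
    // Sleeps briefly; returns true if the sleep was interrupted.
    static boolean pauseBriefly() {
        try {
            Thread.sleep(10);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();    // restore the flag instead of swallowing it
            return true;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();        // simulate a cancel request
        System.out.println("interrupted = " + pauseBriefly());          // true
        System.out.println("flag intact = " + Thread.interrupted());    // true (and clears it)
    }
}
```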
Where to go from here
Concurrency is the part of backend engineering with the longest tail. Don't try to learn all of it at once. The shortlist that pays for itself fastest:
- Be fluent in ExecutorService and CompletableFuture.
- Understand volatile, synchronized, and the basic happens-before rules.
- Try virtual threads on a real workload — the difference in code shape is the easiest way to internalize them.
If you want structured practice in this style, the DSA hub and the System Design Basics post both build on the same mental model: pick the smallest tool that solves the problem, and be honest about what it costs.
Frequently asked questions
Should I still learn `Thread` and `Runnable` in 2026?
Yes — but mostly as background. In real code, you'll almost always be working with `ExecutorService`, `CompletableFuture`, or virtual threads. Knowing the primitives makes the higher-level APIs make sense.
Are virtual threads a silver bullet?
For request-per-thread style backends doing IO, virtual threads (Project Loom) are a genuinely huge win — they remove the executor-pool sizing dance. They are not magic for CPU-bound work, and they don't fix shared-mutable-state bugs.
When do I need `volatile` vs `synchronized` vs an `AtomicX`?
`volatile` for visibility of a single variable across threads. `synchronized` for compound operations (read-modify-write, multiple variables, an invariant). `AtomicX` for lock-free single-variable updates with stronger guarantees than `volatile`.