Retry Until Success

Implementation-grounded distributed systems: concurrency, data, and reliability, explained in code you can run.

Tracing a Request Through a Distributed System

March 27, 2026 distributed-systems observability java

When something goes wrong in a distributed system, the hardest part isn’t fixing it; it’s understanding what happened. This post walks through a technique for tracing a single request across multiple services.

Welcome to Retry Until Success

March 26, 2026 intro

Most things worth doing fail the first time. Systems, ideas, the grasp of a hard problem: they all iterate toward something that works, or they don’t. This blog is about that iteration, mostly as it shows up in distributed systems.