Handling Failure in Long-Running Processes

In this article, we explore strategies for handling failure in long-running processes in distributed systems. Using the Fan Courier Gateway as context, we discuss immediate retries, delayed retries, and exponential backoff. We also look at why message idempotency matters when retrying and when implementing compensating transactions, and at how timeouts can trigger mitigating actions when the primary API is down. The goal throughout is to keep a long-running process from ending up in an inconsistent state when something fails along the way.
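
To make the retry idea concrete before going further, here is a minimal sketch of delayed retries with exponential backoff. It is written in Python purely for illustration and is not the gateway's actual code: `TransientError`, `backoff_delays`, `call_with_retries`, and all default values are hypothetical names and numbers chosen for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a timeout)."""


def backoff_delays(base: float, factor: float, max_delay: float, retries: int) -> List[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ... capped at max_delay."""
    return [min(base * factor ** i, max_delay) for i in range(retries)]


def call_with_retries(
    operation: Callable[[], T],
    *,
    max_retries: int = 4,          # assumed default, tune per API
    base: float = 0.5,             # seconds before the first retry (assumption)
    factor: float = 2.0,           # each wait is twice the previous one
    max_delay: float = 30.0,       # upper bound on any single wait
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Call operation, retrying on TransientError with growing delays.

    The operation runs at most max_retries + 1 times. Any other exception,
    or a TransientError on the last attempt, propagates to the caller.
    """
    for delay in backoff_delays(base, factor, max_delay, max_retries):
        try:
            return operation()
        except TransientError:
            sleep(delay)
    # Final attempt: no more retries left, so let any error surface.
    return operation()
```

Because the operation may run more than once, this pattern is only safe when the call it wraps is idempotent. A production version would usually also add random jitter to the delays so that many clients do not retry in lockstep.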