Another weekend, another weekend read, this time all about Idempotence.
When we talk about distributed systems and failure tolerance, the conversation will quickly arrive at idempotence. Idempotence is often invoked as a key ingredient in failure tolerant computations: When your process crashes, retry-just make sure the individual steps are idempotent.
Idempotence
Idempotence refers to the property of an operation that the repeated application of the operation is equivalent to a single application of the operation
a • a • a ≡ a
We rely on idempotence to ensure correctness in the event of failure and subsequent recovery. If the steps of a process are idempotent, we expect the process to be correct after a crash and a subsequent retry.
a • b • ⚡️ • a • b • c ≡ a • b • c
The Pitfalls of Idempotence
However, the reality is more complicated. Let’s explore the following python snippet:
def transfer(idempotenceKey, source, target, amount):
balance = getBalance(idempotenceKey, source)
if balance >= amount:
setBalance(idempotenceKey, source, -amount)
setBalance(idempotenceKey, target, +amount)
Operations getBalance
and setBalance
are idempotent and will always return the same result for the same idempotenceKey
and (idempotenceKey, account)
pair.
We want to guarantee that we will never overdraft an account:
always ∀ a ∈ Account: balance(a) > 0
Example
For this example, we assume that there is a single worker processing requests one by one in the order they are received. The worker listens on an at-least-once delivery message queue like AWS SQS.
In case the worker crashes, a monitor will restart the server. If the worker dequeues a message but fails to acknowledge before crashing, the message becomes visible again after a timeout, potentially leading to out-of-order processing when the worker restarts.
Counterexample
Unfortunately, idempotence will not safe us in this situation. Let’s look at an example. Let’s assume there are two accounts, A and B with $100 initial balance. We’ll initiate two transfers. If there is no failure we are okay. However, what happens if the worker crashes after getting the balance?
As we can see, in this case we violate our correctness guarantee even though every single step is idempotent.
Conclusion
Idempotence is a desirable property in distributed systems. Idempotent operations simplify the development of correct systems-However, idempotence is not a silver bullet.
Happy Reading