A rookie mistake
Consider the following function checkEverySecondWithTimeout()
:
/**
* Calls <code>checkMethod</code> every 1 second.
* Returns when the check passes or the <code>checkTimeOut</code>
* elapses, whichever is earlier.
* <p/>
* e.g, <code>checkEverySecondWithTimeout(() -> false, 5)</code> will return false after 5 seconds.
*
* @param checkMethod A no-arg function that returns true to indicate pass; false otherwise
* @param checkTimeOut Maximum number of seconds for which to execute the checks
* @return true if the check passes, false if the timeout elapses
* @throws InterruptedException
*/
public static boolean checkEverySecondWithTimeout(BooleanSupplier checkMethod, int checkTimeOut) throws InterruptedException {
for (int i = 0; i < checkTimeOut; i++) {
if (checkMethod.getAsBoolean()) {
return true;
}
TimeUnit.SECONDS.sleep(1);
}
return false;
}
Can you spot the error in this function?
It makes the rookie mistake of assuming that sleep()
is reliable. The call TimeUnit.SECONDS.sleep(1)
will certainly put the thread to sleep for a second, that much we can rely upon. The problem is that the sleep can extend to beyond one second; perhaps 2 seconds on an autumn day and perhaps 5 seconds on a full moon night.
Essentially, there are no guarantees that the duration of sleep will be exact or even within a tolerance. This makes the checkTimeOut
a lower bound at best and a long lasting siesta at worst.
A recent incident at work was traced in part to a function that, in essence, made the same mistake. It ran fine for eight months in production before failing. Since I was on support during this incident and helped diagnose the root cause, I naturally went back to see who wrote the code and when.
No prizes for guessing who it was ◔̯◔