A rookie mistake

Consider the following function checkEverySecondWithTimeout():

/**
 * Calls <code>checkMethod</code> every 1 second.
 * Returns when the check passes or the <code>checkTimeOut</code>
 * elapses, whichever is earlier.
 * <p/>
 * e.g, <code>checkEverySecondWithTimeout(() -> false, 5)</code> will return false after 5 seconds.
 *
 * @param checkMethod  A no-arg function that returns true to indicate pass; false otherwise
 * @param checkTimeOut Maximum number of seconds for which to execute the checks
 * @return true if the check passes, false if the timeout elapses
 * @throws InterruptedException
 */
public static boolean checkEverySecondWithTimeout(BooleanSupplier checkMethod, int checkTimeOut) throws InterruptedException {
    for (int i = 0; i < checkTimeOut; i++) {
        if (checkMethod.getAsBoolean()) {
            return true;
        }
        TimeUnit.SECONDS.sleep(1);
    }
    return false;
}

Can you spot the error in this function?

It makes the rookie mistake of assuming that sleep() is reliable. The call TimeUnit.SECONDS.sleep(1) will certainly put the thread to sleep for a second, that much we can rely upon. The problem is that the sleep can extend to beyond one second; perhaps 2 seconds on an autumn day and perhaps 5 seconds on a full moon night.

Essentially, there are no guarantees that the duration of sleep will be exact or even within a tolerance. This makes the checkTimeOut a lower bound at best and a long lasting siesta at worst.

A recent incident at work was traced in part to a function that, in essence, made the same mistake. It ran fine for eight months in production before failing. Since I was on support during this incident and helped diagnose the root cause, I naturally went back to see who wrote the code and when.

No prizes for guessing who it was ◔̯◔

Streams in Java 8

Last year, Oracle released Java 8, the most significant upgrade to the language since Java 5. One of the headline features in Java 8 is the introduction of the new Streams API.

This blog post is an attempt to wrap my head around this new feature by way of an example.

Consider the Even Fibonacci numbers problem from Project Euler:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

A traditional Java solution to this problem would be to create an iterator that generates a sequence of Fibonacci numbers and then iterate through them, adding up the even numbers till we hit an element exceeding four million. Here's an implementation:

import java.util.Iterator;

public class FibonacciSequence implements Iterator<Integer> {

    private Integer previous;
    private Integer current;

    public FibonacciSequence() {
        this.previous = -1;
        this.current = 1;
    }

    @Override
    public Integer next() {
        Integer next = previous + current;
        previous = current;
        current = next;
        return next;
    }

    @Override
    public boolean hasNext() {
        return true;
    }

    public static void main(String[] args) {
        final int LIMIT = 4_000_000;
        final FibonacciSequence sequence = new FibonacciSequence();

        Integer sum = 0;

        while (sequence.hasNext()) {
            Integer value = sequence.next();
            if (value > LIMIT) {
                break;
            }
            if (value % 2 == 0) {
                sum += value;
            }
        }

        System.out.println("Sum using Iterator = " + sum);
    }
}

This can be made a bit nicer using the enhanced for-each loop syntax from Java 5, which requires that our FibonacciSequence class implement the Iterable interface:

public class FibonacciSequence implements Iterator<Integer>, Iterable<Integer>  {

    // Class implementation as in the previous code snippet

    @Override
    public Iterator<Integer> iterator() {
        return this;
    }

    public static void main(String[] args) {
        final int LIMIT = 4_000_000;
        final FibonacciSequence sequence = new FibonacciSequence();

        Integer sum = 0;

        for (Integer value : sequence) {
            if (value > LIMIT) {
                break;
            }
            if (value % 2 == 0) {
                sum += value;
            }
        }

        System.out.println("Sum using Iterable = " + sum);
    }
}

Which brings us to the Java 8 Streams API. Ideally, a stream style API should allow us to solve this problem using something like this:

int sum = infiniteFibonacciNumbersStream
        .takeWhile(i -> i <= 4_000_000)
        .filter(i -> i % 2 == 0)
        .sum();

We start with an infinite stream of fibonacci numbers, consume elements from it till the predicate (i <= 4_000_000) becomes false, filter the elements of the resulting finite stream for even numbers (i % 2 == 0) and finally, compute the sum of the filtered stream.

Unfortunately, the Java 8 stream API has no equivalent of takeWhile. As far as I can tell, the API offers no way of limiting our stream to a finite one. To work around this limitation, we will push the limit into the FibonacciSequence class, making it a finite sequence:

public class FibonacciSequence implements Iterator<Integer>, Iterable<Integer> {

    private final Integer max;
    private Integer previous;
    private Integer current;

    public FibonacciSequence(int max) {
        this.previous = -1;
        this.current = 1;
        this.max = max;
    }

    @Override
    public Integer next() {
        Integer next = previous + current;
        previous = current;
        current = next;
        return next;
    }

    @Override
    public boolean hasNext() {
        return current <= max;
    }

    @Override
    public Iterator<Integer> iterator() {
        return this;
    }

    public static void main(String[] args) {
        final int LIMIT = 4_000_000;
        final FibonacciSequence sequence = new FibonacciSequence(LIMIT);
        Integer sum = 0;

        for (Integer value : sequence) {
            if (value % 2 == 0) {
                sum += value;
            }
        }

        System.out.println("Sum using finite Iterable = " + sum);
    }    
}

Now we need a way to convert the FibonacciSequence into a stream. The default interface implementation (another Java 8 feature) of the new Collection.stream() method gives us a hint:

/**
* Returns a sequential {@code Stream} with this collection as its source.
*
* @since 1.8
*/
default Stream<E> stream() {
    return StreamSupport.stream(spliterator(), false);
}

The spliterator is the stream equivalent of the iterator. It allows the iterator to be potentially split into multiple iterators so that it can be consumed in parallel via parallel streams. The second parameter, false, provided to StreamSupport.stream() indicates that this stream is not designed for parallel consumption and the returned stream is a sequential stream.

The Java 8 Iterable interface, which our FibonacciSequence implements, now provides a default implementation of spliterator():

/**
* Creates a {@link Spliterator} over the elements described by this
* {@code Iterable}.
*
* @since 1.8
*/
default Spliterator<T> spliterator() {
    return Spliterators.spliteratorUnknownSize(iterator(), 0);
}

So putting these pieces together, we can obtain a finite, sequential Stream of fibonacci numbers using:

FibonacciSequence fibonacciSequence = new FibonacciSequence(LIMIT);
Stream<Integer> fibStream = StreamSupport.stream(fibonacciSequence.spliterator(), false);

With the stream in hand, the only steps that remain are to filter the stream for even numbers and then compute the sum.

int sum = fibStream
        .filter(i -> i % 2 == 0)
        .mapToInt(i -> i)
        .sum();

The mapToInt() operation is needed because the stream API's sum() operation can only work with IntStreams, a specialization of Stream for primitive integers. The mapToInt() operation converts the source stream into an IntStream using the supplied function.

Putting it all together:

final int LIMIT = 4_000_000;
FibonacciSequence fibonacciSequence = new FibonacciSequence(4_000_000);
Stream<Integer> fibStream = StreamSupport.stream(fibonacciSequence.spliterator(), false);
int sum = fibStream
        .filter(i -> i % 2 == 0)
        .mapToInt(i -> i)
        .sum();
System.out.println("Sum by stream computation is: " + sum);

So that's the stream equivalent of our original iterative solution.

Now we also know that Fibonacci sequences can be generated using recursion. Can we solve the fibonacci problem using a recursive stream generation solution? Turns out, it can be done! Here's how:

public Stream<Integer> recursiveFibStream(int a, int b, int max) {
    Integer current = a + b, prev = b;
    if (current <= max) {
        return Stream.concat(Stream.of(current), recursiveFibStream(prev, current, max));
    } else {
        return Stream.empty();
    }
}

public int usingRecursiveStream() {
    int sum = recursiveFibStream(0, 1, 4_000_000)
            .filter(i -> i % 2 == 0)
            .mapToInt(i -> i)
            .sum();

    return sum;
}

Nifty? Sure. But it runs like a snail!

I setup a JMH benchmark of these four different implementation strategies and here are the results:

Benchmark                      Mode      Cnt  Score   Error  Units
Runner.usingIterable         sample  2192211  0.393 ± 0.009  us/op
Runner.usingIterator         sample  2204423  0.383 ± 0.005  us/op
Runner.usingRecursiveStream  sample  3810503  6.579 ± 0.054  us/op
Runner.usingStream           sample  2530036  0.529 ± 0.020  us/op

The Stream API does have some overhead over the conventional iterator/iterable based implementations but the clear outlier is the recursive stream implementation which takes 6.6 microseconds to compute the sum which the others can compute in less than half a microsecond!

I've shared the benchmarking code on Bitbucket: https://bitbucket.org/antrix/java8-streams-example. Do let me know if you find a mistake either in my implementation or the benchmark setup.

Python for the busy Java Developer

For the past few months, I've been working on a book titled, Python for the busy Java Developer. I'm happy to say that it is finally complete and ready to share. Go, download it now!

Python for the busy Java Developer book cover

The book's page has all the details on the audience and the content of the book. So let me use this post to talk a bit about the experience of writing it.

It is a relatively short book and yet, the amount of effort that it took to finish it was immense. My respect for authors has gone through the roof! It is one thing to have a collection of topics to write about and quite a different thing to put all of those topics together into a cohesive narrative. I learned a great deal about structuring content, striking the balance between too little and too much detail and writing with the reader in mind. I also learned a few new things about Python along the way!

Beyond the content, creating a book involves publishing it in a format suitable for wider consumption. Traditionally, this is the bit that is handled by book publishers. However, since I didn't intend to approach a publisher, I had to figure out how to produce the book myself.

Initially, I started composing the book using the Markdown format and generating output using the Pandoc tool. However, I soon hit the limits of what Markdown is good for. There are several elements like admonitions and callouts in code samples that can't be expressed in Markdown. At this point, Hitesh pointed me to the AsciiDoc format, specifically the Asciidoctor project. This was a revelation!

The AsciiDoc markup format supports pretty much anything you would want to do when writing a book. The Asciidoctor toolchain makes it easy to produce great looking output from AsciiDoc source. There are still a few bugs and gaps but I am confident that in six months time, this will be the best toolchain for creating books.

Writing this book has consumed many evenings (and early mornings!) but it was well worth the effort for the experience. I hope the end result will be useful for the intended audience!

Team Geek

I just finished reading Team Geek: A Software Developer's Guide to Working Well with Others by Brian W. Fitzpatrick and Ben Collins-Sussman. The authors bring their experience of working on open source project teams (Subversion) as well as corporate software development teams (Google) and share what it takes to build great teams, create organizational change and manage your own career.

Although at times I felt that the book got a bit repetitive and could have done with tighter editing, overall, it is a great read with lots of hard earned wisdom on working in software teams. I've pulled out a few quotes which I particularly liked.

On not criticizing every single decision and learning to pick the right battles to fight:

Every time a decision is made, it’s like a train coming through town — when you jump in front of the train to stop it you slow the train down and potentially annoy the engineer driving the train. A new train comes by every 15 minutes, and if you jump in front of every train, not only do you spend a lot of your time stopping trains, but eventually one of the engineers driving the train is going to get mad enough to run right over you. So, while it’s OK to jump in front of some trains, pick and choose the ones you want to stop to make sure you’re only stopping the trains that really matter. -- Team Geek

On not tolerating people that threaten to poison your team's culture, even if they are "genius" programmers:

Genius is such a commodity these days that it’s not acceptable to be an eccentric any more. -- Team Geek

On attempting to change bad habits and behaviours:

It’s impossible to simply stop a bad habit; you need to replace it with a good habit. -- Team Geek

On learning to Manage Upward:

Shipping things gives you credibility, reputation, and political capital more than just about anything else in a company. -- Team Geek

On writing emails that get results:

A good Three Bullets and a Call to Action email contains (at most) three bullet points detailing the issue at hand, and one — and only one — call to action. That’s it — nothing more. -- Team Geek

On doing the Right Thing:

Do the right thing, wait to get fired. -- Chade-Meng Tan