Lazy Initialization in Java

Take any Java application of a reasonable size and there's an almost 100% chance that at least one class in the codebase uses the lazy initialization pattern. One typical usage is in the creation of Singleton classes.

For a basic pattern in such widespread use, it is surprising how often it is implemented incorrectly! Why does this happen? Let's take a look with a simple example:

class Demo {

    private Collaborator collaborator = new Collaborator();

    public Collaborator getCollaborator() {
        return collaborator;
    }

    public static void main(String... args) {
        Demo demo = new Demo();
        Collaborator collaborator = demo.getCollaborator();
    }
}

Perfectly pedestrian stuff so far. We define a Demo class in which a Collaborator object is created and is ready for use as soon as an instance of Demo is created. But what if we don't always need the collaborator? We would be paying the cost of creating it even in situations where it isn't used. This becomes a real concern if it is relatively expensive to create a new collaborator. Enter lazy initialization:

class Demo {

    private Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();
        }
        return collaborator;
    }
}

In the revamped Demo class, we've delayed the construction of Collaborator until the getter method is called, thus ensuring that we don't create an instance before we need it.
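To see the laziness in action, here's a small, hypothetical harness. The static constructed flag on Collaborator is instrumentation I've added purely so we can observe when construction happens; it's not part of the pattern.

```java
// Instrumented Collaborator: the static flag is hypothetical, added only
// so we can observe when the constructor actually runs.
class Collaborator {
    static boolean constructed = false;

    Collaborator() {
        constructed = true;
    }
}

class Demo {

    private Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();
        }
        return collaborator;
    }
}

public class LazyDemo {
    public static void main(String... args) {
        Demo demo = new Demo();
        // Creating a Demo no longer creates a Collaborator...
        System.out.println("constructed before getter: " + Collaborator.constructed); // false
        demo.getCollaborator();
        // ...only the first call to the getter does.
        System.out.println("constructed after getter:  " + Collaborator.constructed); // true
    }
}
```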

Although it solves our first problem, it introduces another one: the Demo class will likely exhibit unexpected behaviour in a multi-threaded environment. If two or more threads simultaneously invoke the getCollaborator() method on an instance of Demo, then it is very much possible that more than one instance of Collaborator gets created. Depending on what Collaborator actually is, the effects of this can range from simply wasteful to downright dangerous!
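We can make the race easy to observe with a sketch like the following. The static created set and the 100 ms sleep in the constructor are instrumentation I've added to widen the check-then-act window; real code wouldn't have them, but the race exists regardless, just with a much narrower window.

```java
import java.util.*;
import java.util.concurrent.*;

class Collaborator {
    // Track every instance ever constructed (identity-based, thread-safe).
    static final Set<Collaborator> created =
            Collections.synchronizedSet(Collections.newSetFromMap(new IdentityHashMap<>()));

    Collaborator() {
        created.add(this);
        // Artificial delay to widen the race window for demonstration purposes.
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
    }
}

class Demo {

    private Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();   // check-then-act: not atomic!
        }
        return collaborator;
    }
}

public class RaceDemo {
    public static void main(String... args) throws Exception {
        Demo demo = new Demo();
        int threads = 8;
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                start.await();            // release all threads at once
                demo.getCollaborator();
                return null;
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        // Almost always prints a number greater than 1 on a multi-core machine.
        System.out.println("distinct Collaborators created: " + Collaborator.created.size());
    }
}
```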

Making the Demo class's behaviour predictable in a multi-threaded environment is easy: just make the getCollaborator() method synchronized.

class Demo {

    private Collaborator collaborator;

    public synchronized Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();
        }
        return collaborator;
    }
}

By making the getCollaborator() method synchronized, we ensure that only one thread can invoke the method at a time, and are thus guaranteed that only one instance of Collaborator will be created.
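A quick harness can confirm the guarantee. This sketch fires eight threads at the synchronized getter simultaneously and checks how many distinct instances they observed; the concurrency scaffolding is mine, not part of the pattern.

```java
import java.util.*;
import java.util.concurrent.*;

class Collaborator { }

class Demo {

    private Collaborator collaborator;

    public synchronized Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();
        }
        return collaborator;
    }
}

public class SyncDemo {
    public static void main(String... args) throws Exception {
        Demo demo = new Demo();
        int threads = 8;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Collaborator>> results = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            results.add(pool.submit(() -> {
                start.await();                 // release all threads together
                return demo.getCollaborator();
            }));
        }
        start.countDown();
        // Collect the instance each thread saw; identity semantics matter here.
        Set<Collaborator> distinct = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Future<Collaborator> f : results) {
            distinct.add(f.get());
        }
        pool.shutdown();
        System.out.println("distinct instances seen: " + distinct.size()); // always 1
    }
}
```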

However, there's yet another problem with our change. (In case you haven't guessed, this is the pattern for this entire post!)

The problem is that even though we needed to ensure exclusive access to the getter method only during the initial instantiation of Collaborator, we pay the cost of synchronization on all subsequent calls to the method!

Alright, let's try another change:

class Demo {

    private Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            synchronized(this) {
                if (collaborator == null) {
                    collaborator = new Collaborator();
                }
            }
        }
        return collaborator;
    }
}

This is known as the double-checked lock pattern. What we are doing is first checking if the collaborator reference is null. If it is, we try to gain a lock on the Demo object instance (this). Once we hold the object lock, we need to check again if collaborator is still null. This double check is required because it is quite possible that between the time of the first check and the time we get the object lock, a different thread could come in, gain the lock and go ahead and construct a collaborator. So our second check is a defence against that. If we find that the collaborator is still null, we go ahead and construct one.

This seems right, doesn't it? Unfortunately, it is not.

This is where our intuitive sense of reasoning starts breaking down in the face of modern technology.

The problem (once again) is that modern compilers and processors do this thing called instruction re-ordering (out-of-order execution, in hardware terms). In fact, the Java Language Specification (JLS) explicitly permits implementations to re-order instructions because it can improve execution speed. And Java isn't the only language doing this either; pretty much all modern compilers perform these optimizations.

We won't get into the details of how these optimizations work - that is the job of the JVM engineers! As application developers, our job is to know that these things happen and to ensure that our programs don't fail in the presence of these optimizations.

To better demonstrate the effect of instruction reordering, let me define a simple Collaborator class:

class Collaborator {
    public Associate associate;

    public Collaborator() {
        associate = new Associate();
    }
}

With the above class definition in mind, imagine an optimization where the constructor call is inlined in our double-checked Demo class:

class Demo {

    private Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            synchronized(this) {
                if (collaborator == null) {
                    // pseudo-code now
                    associate = new Associate();
                    collaborator = new Collaborator();
                }
            }
        }
        return collaborator;
    }
}

NOTE: I am not saying that this is how the code will look if the constructor is inlined. I am just asking you to visualize the fact that there are two reference assignment operations going on here: (1) the associate reference and (2) the collaborator reference.

Now, since the JVM is free to re-order these instructions, it may choose to move the collaborator reference assignment before the associate reference assignment. If this happens, what if another thread comes in and calls getCollaborator() in between the collaborator reference store and the associate reference store? The second thread will reach the if (collaborator == null) check, find that collaborator is not null (since the store was already done!), and so it will skip the if block and return the collaborator reference.

Now with the collaborator that it got, if the second thread tries to do anything with collaborator.associate, it'll get an unexpected NullPointerException since the associate reference is still null!

This is how our intuition fails us and this is one of the reasons why folks keep saying that multi-threaded programming is hard!

Some of you may legitimately ask the question: shouldn't the synchronized block take care of this? Well, a synchronized block is essentially a 'monitor entry' operation at the start of the block and then a 'monitor exit' operation at the end of the block. The JLS guarantees that once we 'monitor exit', all other threads will see all the memory assignments that happened before the exit. However, and crucially, it makes no guarantee that other threads will not see these assignments before 'monitor exit'. See the problem?

So how do we fix this?

Solution 1

We got into this whole mess because we tried to do lazy initialization. So the first question to ask ourselves is whether we really need lazy initialization at all. If not, the safest option is to simply construct the object eagerly:

class Demo {

    private final Collaborator collaborator = new Collaborator();

    public Collaborator getCollaborator() {
        return collaborator;
    }
}

The critical point to note here is the use of the final modifier. Without it, this class would not be thread-safe. Why? Suppose thread 1 constructs a Demo instance and then hands it off to thread 2. In the absence of any synchronization, there is no guarantee that thread 2 will see the collaborator reference assignment. To put it differently, the memory operations performed by thread 1 are not guaranteed to be visible to thread 2 without synchronization. Declaring the reference final is a great way of ensuring visibility without paying the cost of synchronization. This is made possible by the special guarantees that the JLS provides for final fields. Go read up on them :-)

Solution 2

If eager initialization is not an option, this is how we can fix our double checked locking code:

class Demo {

    private volatile Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            synchronized(this) {
                if (collaborator == null) {
                    collaborator = new Collaborator();
                }
            }
        }
        return collaborator;
    }
}

All we did was add the volatile modifier to collaborator. The JLS guarantees that a write to a volatile field happens-before every subsequent read of that field, so all the writes performed inside the Collaborator constructor are guaranteed to be visible to other threads before the reference itself becomes visible. This solves our earlier problem caused by non-apparent instruction reordering. Note that we still need the synchronized block!

There are performance implications to using volatile references, but in most scenarios they aren't too bad. At least on x86, a volatile read is almost as cheap as a regular read. Volatile writes, on the other hand, require a memory barrier and are considerably more expensive. If you wish to further optimize the above by reducing the number of volatile read operations, you can use a local variable:

class Demo {

    private volatile Collaborator collaborator;

    public Collaborator getCollaborator() {
        Collaborator tmp = collaborator;
        if (tmp == null) {
            synchronized(this) {
                tmp = collaborator;
                if (tmp == null) {
                    tmp = new Collaborator();
                    collaborator = tmp;
                }
            }
        }
        return tmp;
    }
}
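We can run the same kind of concurrent harness against the corrected double-checked version to confirm that exactly one Collaborator is ever constructed. The static constructions counter and the artificial sleep are instrumentation I've added for the demonstration; they are not part of the pattern.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

class Collaborator {
    // Hypothetical instrumentation: count how many times the constructor runs.
    static final AtomicInteger constructions = new AtomicInteger();

    Collaborator() {
        constructions.incrementAndGet();
        // Artificial delay to widen any would-be race window.
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
    }
}

class Demo {

    private volatile Collaborator collaborator;

    public Collaborator getCollaborator() {
        Collaborator tmp = collaborator;     // single volatile read on the fast path
        if (tmp == null) {
            synchronized (this) {
                tmp = collaborator;
                if (tmp == null) {
                    tmp = new Collaborator();
                    collaborator = tmp;      // safe publication via the volatile write
                }
            }
        }
        return tmp;
    }
}

public class DclDemo {
    public static void main(String... args) throws Exception {
        Demo demo = new Demo();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Collaborator>> results = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            results.add(pool.submit(() -> {
                start.await();
                return demo.getCollaborator();
            }));
        }
        start.countDown();
        Set<Collaborator> distinct = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Future<Collaborator> f : results) {
            distinct.add(f.get());
        }
        pool.shutdown();
        System.out.println("constructions:      " + Collaborator.constructions.get()); // 1
        System.out.println("distinct instances: " + distinct.size());                  // 1
    }
}
```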

Solution 3

In this final solution, we make use of another guarantee of the JLS: a class (including a static nested class like the holder below) is not initialized until its first active use.

class Demo {

    private static class CollaboratorHolder {
        public static final Collaborator collaborator = new Collaborator();
    }

    public Collaborator getCollaborator() {
        return CollaboratorHolder.collaborator;
    }
}

When the JVM loads our Demo class, it skips the initialization of the nested CollaboratorHolder class. Only when a caller first invokes the getCollaborator() method is the CollaboratorHolder class initialized, causing the construction of a Collaborator object. Moreover, this code is thread-safe: the JLS guarantees that class initialization happens exactly once, under an initialization lock managed by the JVM. This pattern is known as the initialization-on-demand holder pattern.
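To make the laziness observable, here's a small, hypothetical harness; as before, the static constructed flag on Collaborator is instrumentation I've added purely so we can see when class initialization actually happens.

```java
// Instrumented Collaborator: the static flag is hypothetical, added only
// so we can observe when holder-class initialization actually happens.
class Collaborator {
    static boolean constructed = false;

    Collaborator() {
        constructed = true;
    }
}

class Demo {

    private static class CollaboratorHolder {
        public static final Collaborator collaborator = new Collaborator();
    }

    public Collaborator getCollaborator() {
        return CollaboratorHolder.collaborator;
    }
}

public class HolderDemo {
    public static void main(String... args) {
        Demo demo = new Demo();  // creating a Demo does NOT initialize the holder
        System.out.println("constructed before getter: " + Collaborator.constructed); // false
        Collaborator c1 = demo.getCollaborator();        // first use triggers holder init
        Collaborator c2 = new Demo().getCollaborator();  // all Demos share the instance
        System.out.println("constructed after getter:  " + Collaborator.constructed); // true
        System.out.println("same instance: " + (c1 == c2)); // true
    }
}
```

Note one behavioural difference from the earlier solutions: because the holder field is static, every Demo instance shares a single Collaborator, whereas the instance-field versions above created one Collaborator per Demo.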

Summary

As this simple pattern demonstrates, writing multi-threaded code that is both correct and fast is not an easy task. It is very important that we know our development platform and the facilities it provides. And as a meta-observation: we should try not to "optimize" code unless required!

Comments

You should have some sharing buttons so that I can easily post this on facebook/twitter/google plus (or all at once). Nice read. Thanks for the detailed explanation. Better than most of the threads on stack overflow.

Thanks Bala!

Reg. sharing: I find all those adornments distasteful. As a concession, I did add some meta tags to the post HTML so that pasting the article link in G+ or FB will correctly extract the title & description.

Deepak, nice article. Good explanation of why the double-checked lock pattern might not behave as we expect. Santhi

Great article

Joshua Bloch claims that "a single-element enum type is the best way to implement a singleton"; see this SO post for details: http://stackoverflow.com/a/71399

I agree, the "social" buttons you see everywhere on the Web these days are disgusting. What's wrong with just hyperlinking?

@Scott - agreed about the Enum approach. But it is not lazy!

The post is good, but I think you have incorrectly defined the problem as one of instruction reordering. There is no instruction reordering happening in the synchronized block; it is a visibility problem, and that is why it is solved by either using immutability or volatile.

@Hithem I disagree. The synchronized block already guarantees visibility. It's in the JLS!

The problem is that the construct as defined above (using just the synchronized block) doesn't guarantee the order in which variables become visible to other threads. Hence the necessity to use volatile.

A synchronized block guarantees visibility and ordering for code inside its block.

Volatile, which is a weaker form of synchronization, guarantees ordering.

To optimize the code below:

class Demo {

    private Collaborator collaborator;

    public synchronized Collaborator getCollaborator() {
        if (collaborator == null) {
            collaborator = new Collaborator();
        }
        return collaborator;
    }
}

we sub-divided the problem into a visibility problem and an atomicity problem, as follows:

class Demo {

    private volatile Collaborator collaborator;

    public Collaborator getCollaborator() {
        if (collaborator == null) {
            synchronized(this) {
                if (collaborator == null) {
                    collaborator = new Collaborator();
                }
            }
        }
        return collaborator;
    }
}

@Hithem - that's an interesting way to look at this and honestly, I hadn't considered it that way. So thinking about it, when you say this:

we sub-divided the atomicity problem into visibility and atomicity problem as the following

Which construct provides which guarantee? synchronized provides both visibility & atomicity while volatile is mostly about visibility.

@Hithem

Instruction reordering would happen inside the synchronized block like Deepak mentioned in the article. What's important to note here is all the JMM ordering guarantees rely on establishing happens-before relationships and you can establish happens-before relationships iff you synchronize on the same monitors, or read-write the same volatile variable, etc in both threads. If you merely synchronize in one thread, or synchronize on different monitors on different threads, then all bets are off. JVM makes no ordering guarantees in that scenario.

So in the above example, we do NOT want to synchronize on the monitor in all the threads because that's expensive. But if you don't synchronize at all, then all the readers might see partially constructed objects due to instruction re-ordering (or not see the constructed object at all due to lack of visibility). So once you introduce volatile, you get both visibility and ordering guarantees and together you get safe-publication.

