Inside Zero and Shark: Java threads and state transitions

Andrew Haley has been doing some work on Zero and Shark lately, and his questions have made me realise that while Zero and Shark are pretty small in comparison with the rest of HotSpot, they’re not the easiest of things to get a handle on. I decided to write some articles to try and present a kind of overview of it all, to make things easier for others in the future.

HotSpot is the Java Virtual Machine (JVM) of the OpenJDK project. Its name refers to its primary mode of operation, in which Java methods are initially executed by a profiling interpreter, and only after they have been executed a certain number of times are they deemed “hot” enough to be compiled to native code by a Just In Time (JIT) compiler. The aim is to avoid wasting time compiling rarely-used methods, so that each method that does get compiled is one where compilation is likely to improve performance the most.

If you look inside a running HotSpot process you’ll see a number of different threads. There will be VM threads that handle such things as garbage collection. There may be one or more compiler threads — these are the JITs. And there will be Java threads. These are the threads that are executing Java code, the threads we are interested in.

At any time, each Java thread will be in one of (essentially) three states. A thread that is _thread_in_Java is executing code that was written in Java, either by interpreting bytecode or by executing native code compiled by the JIT. A thread that is _thread_in_native is executing a native Java method — JNI code. And a thread that is _thread_in_vm is running code that is part of the VM rather than code that is part of the application.

Threads change state all over the place. Imagine you’re in a Java method (you’re _thread_in_Java) and you invoke a native method. That switches you to _thread_in_native. Then your native code calls some VM function, and suddenly you’re _thread_in_vm. Maybe that VM function calls some Java code? Now you’re back in _thread_in_Java. And as those calls return, the transitions happen in reverse.
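To make that chain of calls concrete, here is a minimal sketch of it as a toy model. None of these names come from HotSpot: ToyThread, Transition and the three functions are hypothetical, and the only thing modelled is the state itself. Each call switches the thread's state on entry and restores the previous state on return, so the trace reads the same forwards and backwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names throughout: this models the idea, not HotSpot's classes.
enum ThreadState { in_Java, in_native, in_vm };

struct ToyThread {
  ThreadState state = in_Java;              // the thread starts out in Java code
  std::vector<ThreadState> trace{in_Java};  // every state the thread has been in
  void set_state(ThreadState s) {
    state = s;
    trace.push_back(s);
  }
};

// Scope guard: switch state on entry, restore the old state when the call returns.
class Transition {
  ToyThread& thread_;
  ThreadState saved_;

 public:
  Transition(ToyThread& t, ThreadState to) : thread_(t), saved_(t.state) {
    thread_.set_state(to);
  }
  ~Transition() { thread_.set_state(saved_); }
};

// Java code called from inside the VM.
void java_callback(ToyThread& t) { Transition tr(t, in_Java); }

// A VM function that happens to call back into Java.
void vm_function(ToyThread& t) {
  Transition tr(t, in_vm);
  java_callback(t);
}

// A native method that calls into the VM.
void native_method(ToyThread& t) {
  Transition tr(t, in_native);
  vm_function(t);
}

// A Java method invoking a native method, returning the states visited.
std::vector<ThreadState> run_example() {
  ToyThread t;
  native_method(t);
  return t.trace;
}
```

Calling run_example() yields in_Java, in_native, in_vm, in_Java on the way in, followed by in_vm, in_native, in_Java as the calls unwind.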

When hacking on HotSpot you tend to avoid thread state transitions where possible, because work is done on each transition and transitions in some directions are not cheap. The most obvious example of this is that threads remain _thread_in_Java across method calls, such that if one non-native method calls another then no transition occurs. In fact, _thread_in_Java is something of a default state. If you look at the function in Zero that handles calls to native methods (CppInterpreter::native_entry, in cppInterpreter_zero.cpp) you’ll see that the transition to _thread_in_native is the very last thing to happen before the actual call itself, and that transitioning back to _thread_in_Java is the very first thing to happen once the native method returns. And whilst you’re in there, check out what happens during the transition back to _thread_in_Java. The transition from _thread_in_native to _thread_in_Java is one of the expensive ones.
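The ordering described above can be sketched roughly as follows. This is not the code in CppInterpreter::native_entry: call_native, transition_back_to_java and toy_native are hypothetical names, and the "return-transition work" is only a placeholder for whatever the real transition does. The point is just where the two state changes sit relative to the call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model; only the ordering of the steps is meant to be taken from the text.
enum State { in_Java, in_native };

struct ToyThread {
  State state = in_Java;
  std::vector<std::string> log;  // what happened, and in which state
};

// Record a step together with the state the thread was in at the time.
void note(ToyThread& t, const std::string& what) {
  t.log.push_back(std::string(t.state == in_Java ? "in_Java" : "in_native") + ": " + what);
}

// Placeholder for the (expensive) transition back to Java.
void transition_back_to_java(ToyThread& t) {
  note(t, "return-transition work");
  t.state = in_Java;
}

using NativeFn = int (*)(ToyThread&, int);

int call_native(ToyThread& t, NativeFn fn, int arg) {
  note(t, "prepare arguments");  // all setup happens while still in_Java
  t.state = in_native;           // last thing before the call itself
  int result = fn(t, arg);
  transition_back_to_java(t);    // first thing after the call returns
  note(t, "handle result");      // everything else happens in_Java again
  return result;
}

// Stand-in for a native method body.
int toy_native(ToyThread& t, int x) {
  note(t, "native body");
  return x * 2;
}
```

Running call_native(t, toy_native, 21) on a fresh ToyThread leaves a log showing that only the native body and the return-transition work run in the native state. The setup and the result handling both happen in the Java state.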

That pretty much covers threads and state transitions. Next time I’ll explain some of how method invocation actually works.