Inside Zero and Shark: HotSpot’s stacks

Now that we understand what Java expects of the stack, we can take a look at how HotSpot and Zero implement it. It would, of course, be perfectly possible to implement a stack exactly as described, but in practice that’s not how it’s done.

The first difference is pretty straightforward. Remember I said that when a method is called the interpreter pops its arguments from the stack and copies them to the callee’s local variables? Well, all that copying would be pretty inefficient, so HotSpot simply doesn’t bother. Imagine you’re executing some method. You’ve just pushed three values onto the stack, value_a, value_b and value_c, and you’re about to invoke a method that takes two arguments:

 
value_c
value_b
value_a

When HotSpot enters the callee, rather than popping the arguments, it leaves them where they are:

 
value_c local[1]
value_b local[0]
  value_a  

If the callee has more locals than it has arguments (say this method needs four) then extra locals (set to zero) will be pushed onto the stack:

 
0        local[3]
0        local[2]
value_c  local[1]
value_b  local[0]
value_a

Execution continues as normal after that, with values being pushed onto the stack after the locals. As methods call methods call methods call methods, the stack becomes split into layers, with individual methods’ stacks interleaved with blocks of local variables. When a method returns, everything up to and including local[0] is popped. If a method is to return a value, then that will be popped before everything else and pushed back onto the stack afterwards. We’ve exchanged copying the arguments for copying the result, a good tradeoff given that methods can have many arguments but only one result.
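The call and return sequence above can be sketched with a toy model. This is only an illustration in Python, not how HotSpot is written: the JavaStack class and its enter and leave methods are names I’ve made up, and the end of the list stands in for the top of the stack.

```python
class JavaStack:
    """Toy expression stack; the end of the list is the top of the stack."""

    def __init__(self):
        self.slots = []

    def push(self, value):
        self.slots.append(value)

    def pop(self):
        return self.slots.pop()

    def enter(self, num_args, num_locals):
        """Enter a callee: the arguments stay where they are and become locals."""
        locals_base = len(self.slots) - num_args  # list index of local[0]
        for _ in range(num_locals - num_args):
            self.slots.append(0)                  # extra locals start at zero
        return locals_base

    def local(self, locals_base, n):
        """Read local[n] of the frame whose local[0] is at locals_base."""
        return self.slots[locals_base + n]

    def leave(self, locals_base, has_result):
        """Return: pop everything up to and including local[0]."""
        result = self.pop() if has_result else None  # save the result first
        del self.slots[locals_base:]
        if has_result:
            self.push(result)                        # push it back for the caller


stack = JavaStack()
for value in ("value_a", "value_b", "value_c"):
    stack.push(value)

# Invoke a method with two arguments and four locals.
base = stack.enter(num_args=2, num_locals=4)
snapshot_on_entry = list(stack.slots)

stack.push("result")  # the callee leaves its return value on top
stack.leave(base, has_result=True)
snapshot_after_return = list(stack.slots)
```

On entry the callee sees value_b as local[0] and value_c as local[1], followed by two zeroed locals. After the return, only value_a and the result are left for the caller.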

So far so good, but there’s another difference. The stack I’ve been talking about until now is the stack of the Java language. In HotSpot this is variously referred to as the Java expression stack, the Java stack or the expression stack. But HotSpot is a program in itself, and it has a stack of its own, the ABI stack or native stack. This is the stack that C and C++ functions use to store their own bits and pieces. HotSpot was originally written for i386, a platform notoriously starved of registers, and rather than maintaining two separate stack pointers the HotSpot engineers decided to store the Java stack on the ABI stack and save a register. Each time a Java method is invoked, the native code that executes it needs to store some state on the ABI stack, so chunks of Java stuff end up interleaved with chunks of ABI stuff between each method’s local variables and its part of the expression stack.

This tinkering with the ABI stack is one of the two reasons the C++ interpreter in HotSpot required a layer written in assembler: you don’t have that kind of access to the ABI stack in C++. Zero, of course, is written in C++, and doesn’t have that access; Zero maintains a separate stack, an instance of the ZeroStack class from stack_zero.hpp. That could have consigned this interleaving to a side-note from history, but sadly the C++ interpreter expects to find its state information stored between its local variables and its expression stack. Rather than rewriting the C++ interpreter, Zero interleaves too. It’s the path of least resistance.

You can see what I mean in this crash dump that — congratulations! — you are now qualified to understand. The trace is split into frames, with each frame representing one method invocation. The deepest frame is at the top, so here:

java.lang.Thread.run
called java.util.concurrent.ThreadPoolExecutor$Worker.run
called java.util.concurrent.ThreadPoolExecutor.runWorker
called sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run
called sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0
called java.net.Socket.getInputStream — which crashed.

The top frame’s expression stack is at 0xd0ffe7e4-0xd0ffe7ec, and its local variables are at the start of the next frame down, at 0xd0ffe83c-0xd0ffe848. Between them is the C++ interpreter’s state (at 0xd0ffe7f0-0xd0ffe830) and two words which the stack walker uses to figure out where everything is.
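If you want to check that those ranges really do tile the frame, a little arithmetic does it. This sketch assumes 4-byte stack words (the addresses are 32-bit) and treats each range as inclusive word addresses. The addresses of the two stack walker words aren’t quoted above; I’ve inferred them from the gap between the interpreter state and the locals.

```python
WORD = 4  # bytes per stack word; assumes the dump came from a 32-bit machine

# Inclusive address ranges for the top frame, lowest address first.
regions = [
    ("expression stack", 0xd0ffe7e4, 0xd0ffe7ec),
    ("interpreter state", 0xd0ffe7f0, 0xd0ffe830),
    ("stack walker words", 0xd0ffe834, 0xd0ffe838),  # inferred from the gap
    ("local variables", 0xd0ffe83c, 0xd0ffe848),
]


def words(first, last):
    """Number of words in an inclusive range of word addresses."""
    return (last - first) // WORD + 1


sizes = {name: words(first, last) for name, first, last in regions}

# Each region should begin one word after the previous region ends.
contiguous = all(
    regions[i + 1][1] == regions[i][2] + WORD
    for i in range(len(regions) - 1)
)
```

Under those assumptions the frame comes out as three words of expression stack, seventeen words of interpreter state, the two stack walker words and four words of locals, with no gaps between them.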

I’m nearly finished, but there’s one final thing. The Java Virtual Machine Specification specifies the sizes of the various types in terms of the number of stack slots or local variable slots they occupy: long and double values take two slots, and everything else takes one. If a method’s arguments are an int, an Object, a long and an int, its local variables on entry will look like this:

int      local[4]
long     local[3]
         local[2]
Object   local[1]
int      local[0]
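The slot numbering in that table follows mechanically from the two-slot rule. Here’s a minimal sketch; assign_slots is a made-up helper, and it assumes a static method, so there’s no receiver taking up local[0].

```python
TWO_SLOT_TYPES = {"long", "double"}


def assign_slots(arg_types):
    """Return (type, first_slot, slot_count) for each argument, in order."""
    layout = []
    next_slot = 0
    for arg_type in arg_types:
        count = 2 if arg_type in TWO_SLOT_TYPES else 1
        layout.append((arg_type, next_slot, count))
        next_slot += count
    return layout


layout = assign_slots(["int", "Object", "long", "int"])
for arg_type, first, count in layout:
    last = first + count - 1
    slots = f"local[{first}]" if count == 1 else f"local[{first}]..local[{last}]"
    print(f"{arg_type:<7}{slots}")
```

The long starts at local[2] and spills into local[3], which pushes the final int up to local[4].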

This is pretty straightforward, aside from the fact that the long is officially in local[2] but its address is actually the address of local[3]. The problem arises when you’re using a 64-bit machine: the 64-bit Object pointer has been allocated the same number of slots as a 32-bit int. On 64-bit platforms, therefore, stack slots need to be 64 bits wide, which wastes space, and leaves us the choice of where in the slot to put non-Object types. The various classic HotSpot ports do this in different ways, but on Zero everything is accessed by slot number, so values are positioned such that they start at the address of the start of the slot. This means the calculation is the same regardless of whether the machine is 32- or 64-bit, and makes the majority of this stuff transparent. The same local variable array on 64-bit Zero looks like this:

int      local[4]
long     local[3]
         local[2]
Object   local[1]
int      local[0]
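To make “the calculation is the same” a bit more concrete, here’s a rough sketch of the address arithmetic. It assumes, as the diagrams suggest, that local[0] sits at the highest address and higher-numbered locals sit below it. The function names and the base address are invented for the example, not taken from Zero’s source.

```python
def slot_address(locals_base, n, word_size):
    """Address of local[n]; higher-numbered locals sit at lower addresses."""
    return locals_base - n * word_size


def value_address(locals_base, first_slot, slot_count, word_size):
    """Start address of a value occupying slot_count slots from first_slot."""
    last_slot = first_slot + slot_count - 1
    return slot_address(locals_base, last_slot, word_size)


BASE = 0x1000  # hypothetical address of local[0]

# The long from the tables above: officially local[2], two slots wide.
addr32 = value_address(BASE, first_slot=2, slot_count=2, word_size=4)
addr64 = value_address(BASE, first_slot=2, slot_count=2, word_size=8)
```

In both cases the long starts at the address of local[3]; only the word size changes between the 32-bit and 64-bit layouts.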

I’ll shut up about stacks now!