By last Thursday I’d given up on my idea of using alloca() to fake growable blocks of memory in the stack. I had it executing bytecodes, 8 or 9 instructions, but the more I played with it the more I realised how hacky and fragile it was becoming. The extra add is definitely a bug, but weird stuff starts to happen if you do alloca(size - 15) to circumvent it and I’m not entirely convinced this isn’t because the extra 16 bytes you allocate mask another, PPC-specific issue. It’s difficult to tell. I decided I’d much rather fix OpenJDK to use non-stack locks once than fix alloca() on every single platform.

So, on Friday I junked the bits that used alloca() and reverted the stack class back to big-block-o’-ram mode. The orignal pre-alloca() version allocated the block in Thread::pd_initialize() but I wanted to make the Java stack match the runtime stack in terms of size and the initial thread’s stack isn’t set up when pd_initialize() is called so I moved it into the call stub. I ended Friday trying to figure out how I could make locking work.

Sometime over the weekend, though, I realised that call stub is in the stack whenever you’re in Java code: I could grab my big block of ram there, with alloca(), but in a non-hacky way that ought to work, and leave the locking code untouched!

Suddenly I’m at the System.currentTimeMillis() milestone, on the verge of massive success :)

My generic port is shaping up. I keep wanting to call it the portable interpreter except that’s what people used to call the C++ interpreter so I keep ending up calling it the really portable interpreter or some such thing. All the “CPU-specific” files live in hotspot/src/cpu/zero and hotspot/src/os_cpu/linux_zero so I guess it’s called zero (it was snappier than “nothing”). It’s come to mean zero-assembler in my head although there is some assembler in there, a spinpause and a 64-bit atomic copy, but if I can’t write those for a new platform within a day or two then I probably need sacking!

Anyway, whatever it’s called, it’s shaping up. I have a funky stack class with a slightly odd allocation strategy to allow it to live in alloca allocated memory, I’ve written my call stub using it and am now working on the normal entry. I plan to use the same calling convention throughout, the stack-based one that the bytecode interpreter uses. I have no specific native calling convention I need to match, and doing it that way saves writing a load of result convertors.

Ok, back to coding…