I started tearing Shark apart yesterday. I had a chat with Tom Tromey on Saturday and he mentioned some properties of Java bytecode of which I was unaware — like, you can compile it without using a stack pointer. I was trying to figure out how to modify my first pass to generate some of this stuff when a reply to a seemingly unrelated question made me realise that not only does the server JIT have a pass that does all this and more, but it’s been abstracted out for easy reuse! So stay tuned for the Shark that keeps its stack and locals in registers :)
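As a rough illustration of why no run-time stack pointer is needed: the depth of the Java operand stack is fixed at every bytecode, so a compiler can track it while translating and give each stack slot a fixed name. The sketch below is not the server JIT's pass and not Shark's code. The class name, the string-based instruction format and the `s0`, `s1` register names are all made up for the example, and it handles only a handful of integer bytecodes in straight-line code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turn stack-based bytecodes into register-style
// assignments by tracking the stack depth at compile time.
final class StackToRegisters {
    static List<String> translate(String[] bytecodes) {
        List<String> out = new ArrayList<>();
        int depth = 0; // known while compiling, so never stored at run time
        for (String bc : bytecodes) {
            String[] parts = bc.split(" ");
            switch (parts[0]) {
                case "iload": // push a local: the new top slot is s<depth>
                    out.add("s" + depth + " = local" + parts[1]);
                    depth++;
                    break;
                case "istore": // pop the top slot into a local
                    depth--;
                    out.add("local" + parts[1] + " = s" + depth);
                    break;
                case "iadd":
                case "imul": { // pop two slots, push the result
                    String op = parts[0].equals("iadd") ? "+" : "*";
                    depth--;
                    String dst = "s" + (depth - 1);
                    out.add(dst + " = " + dst + " " + op + " s" + depth);
                    break;
                }
                case "ireturn": // pop the top slot and return it
                    depth--;
                    out.add("return s" + depth);
                    break;
                default:
                    throw new IllegalArgumentException("unsupported: " + bc);
            }
        }
        return out;
    }
}
```

Feeding it `iload 0`, `iload 1`, `iadd`, `ireturn` yields four assignments over `s0` and `s1`, with no stack pointer anywhere. A real pass also has to deal with branches, where the depth must agree on every incoming path.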
Today I got my second method working, String.hashCode(). Now I have conditional and unconditional branching, field access, array loads, a whole bunch of integer operators, and returning with a result implemented and (somewhat) tested. The bytecode coverage chart says I’m 50% done, but I don’t believe it.
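For context, here is a sketch of why a method like this exercises that particular list of features. It follows the formula documented for String.hashCode(), s[0]*31^(n-1) + ... + s[n-1], but it is not the JDK source. The class and the field names `value` and `hash` are assumptions for the example, and the real class of that era may be laid out differently.

```java
// Hypothetical stand-in for String, just enough to show the shape of the method.
final class HashSketch {
    private final char[] value; // assumed field name
    private int hash;           // cached result; 0 means "not computed yet"

    HashSketch(String s) {
        this.value = s.toCharArray();
    }

    int hashCodeSketch() {
        int h = hash;                                // field access
        if (h == 0) {                                // conditional branch
            for (int i = 0; i < value.length; i++) { // loop: conditional test plus a jump back
                h = 31 * h + value[i];               // array load, integer multiply and add
            }
            hash = h;                                // field store
        }
        return h;                                    // return with a result
    }
}
```

Under those assumptions, `new HashSketch("hello").hashCodeSketch()` gives the same value as `"hello".hashCode()`, since both compute the documented formula with 32-bit wraparound.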
Well, it’s taken a month and a half — and over 2000 lines of code — but I finally got a method out of Shark.
I made a chart showing which bytecodes are implemented, which I’ll keep updated as I progress. The estimated total coverage of 18% is slightly fanciful as it treats all bytecodes as equally complex, with nop having the same weight as new, for example. Some codes are marked as complete but untested too. The way the compiler is structured means that in simple cases I can copy and paste whole blocks of bytecodes from the server compiler, so where I was doing one bytecode in a block I’ve copied the lot across. Most of them ought to be fine, but a couple are dubious. I’m still shuffling things around to try and make things less so.
Onwards…
Shark just JITted its first bytecode:
load i32* %local0 ; <i32>:15 [#uses=1]
inttoptr i32 %sp to i32* ; <i32*>:16 [#uses=1]
store i32 %15, i32* %16
%sp1 = sub i32 %sp, 4 ; <i32> [#uses=2]
Ladies and gentlemen, it is aload_0!
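Read literally, the generated code loads local variable 0, stores it at the address held in %sp, and then moves %sp down by four bytes. A small Java model of that sequence might look like the following. The class, the 16-word `memory` array and the starting value of `sp` are invented for the sketch and say nothing about how Shark or HotSpot actually lay out the Java stack.

```java
// Hypothetical model of the push performed by the generated code.
final class AloadSketch {
    final int[] memory = new int[16]; // stand-in for stack memory, one int per 4-byte word
    int sp;                           // byte offset of the next free slot

    AloadSketch() {
        sp = (memory.length - 1) * 4; // start at the highest word (assumed)
    }

    void aload0(int local0) {
        memory[sp / 4] = local0; // store i32 %15, i32* %16
        sp -= 4;                 // %sp1 = sub i32 %sp, 4
    }
}
```

After one call, the word at the old `sp` holds the local and `sp` is four bytes lower, which matches the store-then-subtract order in the output above.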
Yesterday I had a mini-milestone in that I got enough bits and pieces written to have HotSpot call my JIT’s compile_method() method. The next trick will be to get a piece of code back into HotSpot without it being relocated on the way.
Of course, the zero interpreter doesn’t have profiling, so it only works if you force it with -Xcomp at the moment.
Also, I disabled InlineIntrinsics, which reminded me that there was some kind of fast accessor thing for JNI that I disabled in the original ppc port. I should fix that at some point.
Towards the end of last week I started trying to figure out how to slot an LLVM-based JIT into IcedTea. I had a vague idea that I could simply port C1 to it, but I’m not sure that would work so well. Both existing JITs expect to manage register allocation, for example, which LLVM will have to do. So I started hacking at the build system to have it build an entirely new JIT. I called it “shark” — I was going to call it C0 what with the other JITs being called C1 and C2, but I didn’t like the mild confusion that having them named and numbered causes, and shark was too good a name to waste on a one-off dummy architecture.
And speaking of one-off dummy architectures, my “if I can build on a platform that doesn’t even exist then I can build on anything” theory proved short-lived: it doesn’t help on architectures that do exist but have really stupid unames that need truncating. I’m talking about ARM, of course, uname freak of the cosmos. So I’m going to try and get that working this week too.
As of about an hour ago it should be possible to build icedtea6 on any Linux system with no more effort than ./configure and make.
I’m in the final stages of a build system for zero that will work on any architecture. For kicks, I hacked my /bin/uname to say “shark” instead of “ppc”. I figure if I can build on a platform that doesn’t even exist then I can pretty much build on anything.
I got two emails with zero patches yesterday. One I was expecting, from Yi Zhan of Intel who has it working on ia64; the other was totally out of the blue, from Ruediger Oertel of SuSE who has it building on s390 and s390x. I always hoped it would Just Work™ but I never quite believed it! Now all I need is the time to integrate it all!
Which I may just have. While I was poking around trying to fix the IcedTea 7 stack overflow I discovered that a) we mark a lot of stack unusable with 64kb pages, and b) the stack region locator doesn’t seem to work, at least for the cases with guard pages, or for ia64, or possibly with 64kb pages when the Java thread is an initial thread. I spent some time looking into it and rewriting it, and I think I have it sussed. (If you ever wanted to know where ia64 puts its guard pages then you need this!) Now that it all seems to work, I’m going to spend the afternoon writing some emails to the HotSpot list. Next week I intend to hole up and implement my holy grail: a zero build system that does not require any patching to build on a new architecture.
So don’t expect any action from me until that’s done :)