IcedTea on ARM!
Xerxes Rånby sent me this photo today.

15 thoughts on “IcedTea on ARM!”

  1. I take it that’s zero? What’s the expected runtime performance penalty (if any) compared to a traditional hotspot port once zero is “finished”?

    Thanks everyone for your efforts on this!

    Rene

  2. Yeah, that’s zero. Performance is pretty bad at present, as it’s interpreter-only, but once I’ve made the build-system generic I hope to spend some time looking at libjit or llvm to see if they can be used to speed it up.

  3. Some expected pros for icedtea with zero compared to the current J2SE-embedded for ARM obtainable in binary form from sun are:

    1. The zero port will run on the latest fedora-8 arm release since it works with the new glibc.
    2. The zero port will have full graphics support! Sun’s J2SE-embedded version can only be run headless.

    Regarding speed, it is unknown until tested. Right now I compile with the -mthumb -mthumb-interwork gcc options in order to take advantage of the ARM CPU's fast Thumb instruction set mixed with regular ARM instructions.

  4. gcc has several machine-specific options for ARM that might generate better code targeted directly at a specific ARM CPU version, enabling support for NEON, Thumb-2 and VFP instructions.

    If the specific CPU type can be determined, then a quick way to achieve better performance might be to add -march and -mcpu compile options for the specific arch and CPU type.

    Thumb-2 support by setting correct -march
    For example, on an armv6t2 machine (that is, one with Thumb-2 support), -march=armv6t2 -mthumb -mthumb-interwork will make gcc assume it can use Thumb-2 instructions, since it knows the architecture supports them.

    NEON fpu support by setting -mfpu=neon -mfloat-abi=softfp
    This should only be used on NEON-capable architectures like ARMv7-A.

    VFP fpu support by setting -mfpu=vfp -mfloat-abi=softfp
    This should only be used on armv5 hardware with a VFP9-S coprocessor. How this can be determined at compile time is unknown. Some armv7 systems have a built-in VFPv3 that can be enabled with -mfpu=vfp3 -mfloat-abi=softfp

    NO FPU? use -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
    Most ARM CPUs do not have any built-in FPU capabilities. In this generic case, found on most ARM systems, some care is needed or floating-point performance will be slow. Without it, each floating-point operation makes the CPU hit an illegal instruction, since there is no FPU hardware. Fortunately the Linux kernel catches the exception, performs the floating-point operation itself and resumes the application automagically. The penalty is a context switch. By letting gcc generate soft-float code on these architectures the context switch can be avoided, and a 10x performance boost might be observed. The “cure”: use -mabi=aapcs-linux -mfloat-abi=soft -meabi=4 on hardware with at least “e” in the architecture name (I might be wrong on the “e”).
    reference:
    http://www.linuxdevices.com/articles/AT5920399313.html
    http://wiki.debian.org/ArmEabiPort

    In my case I might add the -mcpu=arm926ej-s option, since the devboard I am using runs on an ARM926EJ-S-based processor. gcc can't make use of the Jazelle instructions (the “j”), so I have to use armv5te for -march instead of armv5tejl; the “l” for little endian is also omitted in the option. Since I have an “e” CPU I think I might benefit from the “NO FPU” options above as well!
    -march=armv5te -mcpu=arm926ej-s -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
    I will try a new build using these options after the one I am compiling now completes, to see if it makes things faster.

    The -mtune option looks promising too, but other projects have reported that it adds little or no benefit.

    Phew! what a jungle!
    I think the best approach is “Demo or die”: before we make the build system a mess, we should at least demo the options on real hardware, rather than defaulting to strange compilation options that might break the compile.

    references:
    http://www.imodulo.com/gnu/gcc/ARM-Options.html
    https://support.codesourcery.com/GNUToolchain/target_arch1?@template=faq

    Have a great day!
    Xerxes
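
The flag selection walked through in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up, the sample /proc/cpuinfo text is an invented example rather than output from the devboard in question, and the mapping from kernel feature tokens to flags simply mirrors the comment, so it is untested on real hardware. It only makes sense for native builds, and -mcpu cannot be derived this way, so it is passed in by hand.

```python
def march_from_machine(machine):
    """Turn a `uname -m` string such as 'armv5tejl' into a -march value."""
    name = machine.strip().lower()
    if name.endswith(("l", "b")):   # endianness suffix reported by the kernel
        name = name[:-1]
    if name.endswith("j"):          # Jazelle marker, which gcc cannot target
        name = name[:-1]
    return name


def cpu_features(cpuinfo_text):
    """Return the tokens of the 'Features' line of an ARM /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features":
            return set(value.split())
    return set()


def fpu_flags(features):
    """Pick the FPU-related flags from the comment; a heuristic, not a rule."""
    if "neon" in features:
        return ["-mfpu=neon", "-mfloat-abi=softfp"]
    if "vfpv3" in features:
        return ["-mfpu=vfp3", "-mfloat-abi=softfp"]
    if "vfp" in features:
        return ["-mfpu=vfp", "-mfloat-abi=softfp"]
    # No FPU reported: fall back to the soft-float set quoted in the comment.
    return ["-mabi=aapcs-linux", "-mfloat-abi=soft", "-meabi=4"]


def candidate_flags(machine, cpuinfo_text, mcpu=None):
    """Assemble a candidate flag list to try out, not a recommended default."""
    flags = ["-march=" + march_from_machine(machine)]
    if mcpu:
        flags.append("-mcpu=" + mcpu)
    return flags + fpu_flags(cpu_features(cpuinfo_text))


# Invented sample input for illustration only.
SAMPLE_CPUINFO = """Processor : ARM926EJ-S rev 5 (v5l)
Features : swp half thumb fastmult edsp java
"""

print(" ".join(candidate_flags("armv5tejl", SAMPLE_CPUINFO, mcpu="arm926ej-s")))
```

With the sample input this prints the same flag line as in the comment: -march=armv5te -mcpu=arm926ej-s -mabi=aapcs-linux -mfloat-abi=soft -meabi=4.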

  5. For the daredevils who just have to squeeze out every bit of performance they can get: try tweaking some gcc options for code generation conventions as well, by adding -fomit-frame-pointer -ffast-math to the compile.

    -fomit-frame-pointer might free up a CPU register, producing denser and faster code, but the result is harder to debug.

    -ffast-math is make or break; some people report that v4 gcc compilers generate both faster and more accurate code with this option.
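
Why -ffast-math can be "make or break": among other things it allows gcc to reorder floating-point arithmetic as if it were associative, which it is not. The sketch below does the reordering by hand in Python (whose floats are IEEE doubles) to show how the same expression can give two answers. It illustrates the kind of transformation involved, not what gcc will actually do to any particular piece of code.

```python
big, small = 1e16, 1.0

# Evaluated as written: 1e16 + 1.0 is not representable as a double and
# rounds back to 1e16, so the small term is lost.
as_written = (big + small) - big

# Algebraically identical, but reordered the way a compiler might under
# relaxed floating-point rules: the large terms cancel first.
reordered = (big - big) + small

print(as_written, reordered)  # prints: 0.0 1.0
```

Whether a reordering like this makes results more or less accurate depends on the code, which fits the mixed reports about this option.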

  6. I had a discussion this morning with the Fedora 8 ARM people.

    The recommendation is:
    For all default builds, do not add any machine-dependent gcc options or other options for improved code generation conventions. The default gcc flags should be trusted.

    If one wants an optimised build then this should be a user and distribution choice. Distributions can make their gcc compilers add the appropriate optimization flags for the specific distribution; Fedora should automatically compile with -mabi=aapcs-linux -mfloat-abi=soft -meabi=4, as those are the default gcc flags for that distribution.

    So the good news is: less complicated build scripts!
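
One way to see which defaults a distribution baked into its compiler is to look at the "Configured with:" line that `gcc -v` prints. The sketch below pulls the ARM-related `--with-...` options out of that line. The function name is hypothetical and the sample text is invented for illustration; it is not the actual Fedora configuration. A distribution may also set defaults by other means, so an empty result does not prove there are none.

```python
import re


def configured_defaults(gcc_v_output):
    """Extract --with-arch/cpu/tune/fpu/float/abi/mode values from `gcc -v` text."""
    defaults = {}
    for line in gcc_v_output.splitlines():
        if line.startswith("Configured with:"):
            pattern = r"--with-(arch|cpu|tune|fpu|float|abi|mode)=(\S+)"
            for key, value in re.findall(pattern, line):
                defaults[key] = value
    return defaults


# Invented sample, shaped like real `gcc -v` output.
SAMPLE_GCC_V = (
    "Target: arm-linux-gnueabi\n"
    "Configured with: ../configure --target=arm-linux-gnueabi "
    "--with-arch=armv5te --with-float=soft\n"
    "Thread model: posix\n"
)

print(configured_defaults(SAMPLE_GCC_V))  # prints: {'arch': 'armv5te', 'float': 'soft'}
```

On a real system the input would come from the compiler itself. Note that `gcc -v` writes to stderr, so something like `subprocess.run(["gcc", "-v"], capture_output=True, text=True).stderr` is needed to capture it.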

  7. Platform independent JIT sounds like an exciting plan. I’m looking forward to the results of your experiments!

    As for compiler flags, I’ll second the recommendation to rely on the defaults used by a standards-compliant EABI toolchain.

    I haven’t checked how exactly Fedora configures the toolchain, but I would expect the goal to be to cover a wide variety of targets. Debian’s EABI toolchain (armel port) generates code that will run on chips as old as ARM v4t. It is possible to cover v4 as well if you emulate the v4t BX instruction (http://lists.debian.org/debian-arm/2008/02/msg00095.html).

    Thumb being faster is a bit of a myth. Binaries will be smaller because the instructions are shorter (and therefore admittedly load faster) but runtime performance suffers because on average you need more instructions to get the job done.

    Thumb-2 is an altogether different ISA. I don’t know how its performance compares to ARM or Thumb, but I wouldn’t worry about it because few chips support it and even kernel support is still rather experimental at present (AFAIK anyway).

    As far as generic compiler flags are concerned, some people like to use “-fomit-frame-pointer -frename-registers -Os”. In my experience -fomit-frame-pointer and -frename-registers are safe although they make debugging harder. -Os optimizes for code size and is reported to improve performance (I’ve never measured this myself). It comes at the price of occasionally miscompiling code that works fine with -O2. I ran into that recently somewhere in the JNI code of JamVM/GNU Classpath on i486 (“somewhere” because gdb didn’t provide any useful info at all).

    So, unless you absolutely need the last bit of optimization possible, it’s probably not worth bothering with any exotic compiler flags.

    Regards,

    Rene
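
Since several of the claims in this thread are reported rather than measured, a small timing harness is one way to compare builds before changing any defaults. The sketch below is a minimal example with hypothetical names: it runs each command a few times and keeps the best wall-clock time. The commands in the demo are placeholders that just start the Python interpreter; in practice they would be the same benchmark run against builds compiled with different flags.

```python
import subprocess
import sys
import time


def best_time(argv, repeats=5):
    """Run a command several times and return the fastest wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(builds, repeats=5):
    """Map each label to the best time of its command."""
    return {label: best_time(argv, repeats) for label, argv in builds.items()}


if __name__ == "__main__":
    # Placeholder commands; substitute the binaries you actually want to compare.
    builds = {
        "baseline": [sys.executable, "-c", "pass"],
        "candidate": [sys.executable, "-c", "pass"],
    }
    for label, seconds in compare(builds, repeats=3).items():
        print(f"{label}: {seconds:.3f}s")
```

Taking the best of several runs reduces noise from other processes, but on a small devboard it is still worth repeating the comparison a few times before trusting a small difference.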

  8. Impressive … What about a BSD port of zero to an ARM1176JZF? Any chance the JIT would work? I’ve got one such platform in mind ;-)

  9. At present zero is Linux-specific. To make it work you’d have to a) patch the build system to add BSD as an OS, b) write all of hotspot/src/os/bsd, and c) write all of hotspot/src/os_cpu/bsd_zero. There is a BSD port of some form of the JDK somewhere, I hear, which would take care of a and b, assuming the licenses meshed.
