gbenson.net

gbenson, Uncategorized, Thursday 13th March 2008 / Friday 14th March 2008

Xerxes Rånby sent me this photo today.


15 thoughts on “”

axet says:

Thursday 13th March 2008 at 11:38

can you provide more details?
gbenson says:

Thursday 13th March 2008 at 12:02

It is a 200MHz armv5tejl with 64MB of RAM that has been compiling IcedTea for the past few days.
Rene Wagner says:

Thursday 13th March 2008 at 13:42

I take it that’s zero? What’s the expected runtime performance penalty (if any) compared to a traditional hotspot port once zero is “finished”?

Thanks everyone for your efforts on this!

Rene
gbenson says:

Thursday 13th March 2008 at 15:04

Yeah, that’s zero. Performance is pretty bad at present, as it’s interpreter-only, but once I’ve made the build-system generic I hope to spend some time looking at libjit or llvm to see if they can be used to speed it up.
xerxesr says:

Thursday 13th March 2008 at 15:06

Some expected pros of IcedTea with zero, compared to the current J2SE-embedded for ARM available in binary form from Sun, are:

1. The zero port will run on the latest Fedora 8 ARM release, since it works with the new glibc.
2. The zero port will have full graphics support! Sun's J2SE-embedded version can only be run headless.

Regarding speed, that is unknown until tested. Right now I compile with the -mthumb -mthumb-interwork gcc options in order to take advantage of the ARM CPU's fast Thumb instruction set mixed with regular ARM instructions.
xerxesr says:

Thursday 13th March 2008 at 15:10

3. The zero port will support Java v5.0 bytecode, compared to v1.4.2 with J2SE-embedded.
gbenson says:

Thursday 13th March 2008 at 15:11

Is -mthumb -mthumb-interwork something I should be building with, or does it not work on all ARMs?
xerxesr says:

Thursday 13th March 2008 at 15:15

-mthumb -mthumb-interwork should work with all v5 ARM CPUs and later, but only on models with a “t” in uname -m.

Partial Thumb support is present in some v4 CPUs, but they lack interworking, so those models might support -mthumb.

reference: http://www.arm.com/products/CPUs/architecture.html
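
The rule of thumb above can be sketched as a small check on the `uname -m` string. This is only an illustration: the helper name `thumb_flags` is hypothetical, and the “t” test is the heuristic from this comment, not something verified against every machine string a kernel may report.

```python
import re


def thumb_flags(machine):
    """Return Thumb-related gcc flags for a `uname -m` style string.

    Hypothetical helper: it encodes only the heuristic from the comment
    above (a "t" among the feature letters, interworking from v5 on).
    """
    match = re.fullmatch(r"armv(\d+)([a-z]*)", machine)
    if match is None:
        return []  # not an ARM machine string this sketch recognises
    version = int(match.group(1))
    features = match.group(2)
    if "t" not in features:
        return []  # no "t": assume no Thumb support, per the comment
    if version >= 5:
        return ["-mthumb", "-mthumb-interwork"]
    return ["-mthumb"]  # v4 with "t": Thumb only, no interworking flag


if __name__ == "__main__":
    import platform

    # platform.machine() reports the same string as `uname -m` on Linux.
    print(thumb_flags(platform.machine()))
```

On a non-ARM build host the function simply returns an empty list, so no flags would be added.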
xerxesr says:

Thursday 13th March 2008 at 21:35

The gcc compiler has several machine-specific options for ARM that might generate better code targeted directly at a specific ARM CPU version, enabling support for NEON, Thumb-2 and VFP instructions.

If the specific CPU type can be determined, then a quick way to achieve better performance might be to add -march and -mcpu compile options for the specific arch and CPU type.

Thumb-2 support by setting the correct -march
For example, on an armv6t machine with Thumb-2 support, the options -march=armv6t2 -mthumb -mthumb-interwork will make gcc assume it can use Thumb-2 instructions, since it knows the architecture supports them.

NEON FPU support by setting -mfpu=neon -mfloat-abi=softfp
This should only be used on NEON-capable architectures like ARMv7-A.

VFP FPU support by setting -mfpu=vfp -mfloat-abi=softfp
This should only be used on armv5 hardware with a VFP9-S coprocessor. How this can be determined at compile time is unknown. Some armv7 systems have a built-in VFPv3 that can be enabled with -mfpu=vfp3 -mfloat-abi=softfp

NO FPU? Use -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
Most ARM CPUs do not have any built-in FPU capabilities. In this generic case, which applies to most ARM systems, some care might be needed or floating-point performance will be slow. Without it, when a floating-point operation takes place the CPU runs into an illegal instruction because of the lack of FPU hardware. Fortunately the Linux kernel catches the exception, performs the floating-point operation and continues the application automagically. The penalty is a context switch. By letting gcc generate soft-float instructions on these architectures this context switch can be avoided, and a 10x performance boost might be observed. The “cure”: use -mabi=aapcs-linux -mfloat-abi=soft -meabi=4 on hardware with at least an “e” in the architecture name (I might be wrong on the “e”).
reference:
http://www.linuxdevices.com/articles/AT5920399313.html
http://wiki.debian.org/ArmEabiPort
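
The FPU cases listed in this comment could be kept in one lookup table, roughly as sketched here. The names `FPU_FLAGS` and `fpu_flags` are hypothetical, and the flag strings are transcribed from the comment as written; they have not been checked against any particular gcc release.

```python
# Flag sets exactly as given in the comment above, keyed by FPU kind.
FPU_FLAGS = {
    "neon": ["-mfpu=neon", "-mfloat-abi=softfp"],
    "vfp": ["-mfpu=vfp", "-mfloat-abi=softfp"],
    "vfp3": ["-mfpu=vfp3", "-mfloat-abi=softfp"],
    "none": ["-mabi=aapcs-linux", "-mfloat-abi=soft", "-meabi=4"],
}


def fpu_flags(fpu="none"):
    """Return a copy of the flag list for the given FPU kind.

    Hypothetical helper: the caller has to know which FPU the target
    has, since the comment notes that detecting it at compile time is
    an open question.
    """
    try:
        return list(FPU_FLAGS[fpu])
    except KeyError:
        raise ValueError("unknown FPU kind: %r" % (fpu,)) from None


if __name__ == "__main__":
    print(" ".join(fpu_flags("none")))
```

Defaulting to "none" mirrors the comment's point that most ARM CPUs have no FPU at all.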

In my case I might add the arm926ej-s -mcpu option, since the devboard I am using runs on an ARM926EJ-S-based processor. gcc can't make use of the Jazelle instructions (“j”), so I have to use armv5te for -march instead of armv5tejl; the “l” for little endian is omitted in the option. Since I have an “e” CPU I think I might benefit from the “NO FPU” options above as well!
-march=armv5te -mcpu=arm926ej-s -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
I will try a new build using these options after the one I am compiling now completes, to see if it makes things faster.
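
The step from armv5tejl to -march=armv5te described above (drop the endianness letter, drop the “j”) can be sketched as a small string transformation. The function name `march_from_machine` is hypothetical, and the sketch assumes the machine string has the shape "armv" + version digits + feature letters + an optional trailing "l" or "b" for endianness.

```python
import re


def march_from_machine(machine):
    """Derive a gcc -march option from a `uname -m` style ARM string.

    Hypothetical helper following the comment above: the trailing
    endianness letter is omitted and the Jazelle "j" is dropped,
    because gcc cannot make use of it.
    """
    match = re.fullmatch(r"(armv\d+)([a-z]*?)([lb]?)", machine)
    if match is None:
        raise ValueError("not an ARM machine string: %r" % (machine,))
    base, features, _endian = match.groups()
    return "-march=" + base + features.replace("j", "")


if __name__ == "__main__":
    print(march_from_machine("armv5tejl"))
```

Whether the resulting name is accepted depends on the gcc in use, so the output is best treated as a starting point rather than a guaranteed valid option.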

The -mtune option looks promising too, but other projects have reported that it adds little or no benefit.

Phew! What a jungle!
I think the best approach is “Demo or die”: before we make the build system a mess, we should at least demo the options on real hardware before making strange compilation options the default that might break the compile.

references:
http://www.imodulo.com/gnu/gcc/ARM-Options.html
https://support.codesourcery.com/GNUToolchain/target_arch1?@template=faq

Have a great day!
Xerxes
xerxesr says:

Thursday 13th March 2008 at 22:00

For the daredevils who just have to squeeze out every bit of performance they can get: try tweaking some gcc options for code generation conventions as well, by adding -fomit-frame-pointer -ffast-math to the compile.

-fomit-frame-pointer might save some CPU registers, making code denser and faster, but will generate less debuggable code.

-ffast-math is make or break; some people report that v4 gcc compilers generate both faster and more accurate code with this option.
xerxesr says:

Friday 14th March 2008 at 09:46

I had a discussion this morning with the Fedora 8 ARM people.

The recommendation is:
For all default builds, do not add any gcc machine-dependent options or other options for improved code generation conventions. The default gcc flags should be trusted.

If one wants an optimised build then this should be a user and distribution choice. Distributions can make their gcc compilers add the appropriate optimization flags for the specific distribution; Fedora should automatically compile with -mabi=aapcs-linux -mfloat-abi=soft -meabi=4, as those are the default gcc flags for that distribution.

So the good news is: less complicated build scripts!
Rene Wagner says:

Friday 14th March 2008 at 16:47

Platform independent JIT sounds like an exciting plan. I’m looking forward to the results of your experiments!

As for compiler flags, I’ll second the recommendation to rely on the defaults used by a standards-compliant EABI toolchain.

I haven’t checked how exactly Fedora configures the toolchain, but I would expect the goal to be to cover a wide variety of targets. Debian’s EABI toolchain (armel port) generates code that will run on chips as old as ARM v4t. It is possible to cover v4 as well if you emulate the v4t BX instruction (http://lists.debian.org/debian-arm/2008/02/msg00095.html).

Thumb being faster is a bit of a myth. Binaries will be smaller because the instructions are shorter (and therefore admittedly load faster) but runtime performance suffers because on average you need more instructions to get the job done.

Thumb-2 is an altogether different ISA. I don’t know how its performance compares to ARM or Thumb, but I wouldn’t worry about it because few chips support it and even kernel support is still rather experimental at present (AFAIK anyway).

As far as generic compiler flags are concerned, some people like to use “-fomit-frame-pointer -frename-registers -Os”. In my experience -fomit-frame-pointer and -frename-registers are safe although they make debugging harder. -Os optimizes for code size and is reported to improve performance (I’ve never measured this myself). It comes at the price of occasionally miscompiling code that works fine with -O2. I ran into that recently somewhere in the JNI code of JamVM/GNU Classpath on i486 (“somewhere” because gdb didn’t provide any useful info at all).

So, unless you absolutely need the last bit of optimization possible, it’s probably not worth bothering with any exotic compiler flags.

Regards,

Rene
testman says:

Monday 17th March 2008 at 14:00

Impressive… What about a BSD port of zero to an ARM1176JZF? Any chance the JIT would work? I have one such platform in mind ;-)
gbenson says:

Monday 17th March 2008 at 14:11

At present zero is Linux-specific. To make it work you’d have to a) patch the build system to add BSD as an OS, b) write all of hotspot/src/os/bsd, and c) write all of hotspot/src/os_cpu/bsd_zero. There is a BSD port of some form of the JDK somewhere, I hear, which would take care of a and b, assuming the licenses meshed.
testman says:

Thursday 27th March 2008 at 10:44

Tnx for the clarification :)
