August 2008

Just a quick post to mention some changes I’ve been working on for IcedTea (the JDK 7 tree) and finally committed last night:

  • The build is now based around OpenJDK b33 (just as b34 is posted…). Such an update has been delayed, due firstly to CORBA build issues with b32 and then issues with javah and the new Java-based NIO generator in b33.
  • You can now build against something other than the JDK tree by using --enable-hg and --with-project. Current values for --with-project are caciocavallo, closures, cvmi and bsd.

Happy hacking :)

While working out where the class library makes calls to the VM can be tricky at times, such points are usually well-delimited (e.g. all in jvm.h in OpenJDK as we saw last time) and there are a variety of clues to help find them. Firstly, if a non-Java VM is in use, then the set of possible VM-utilising methods is already limited to those that are declared native. Secondly, failures in this area usually produce clear runtime errors such as linking errors (e.g. can’t resolve the symbol JVM_IHashCode on loading). Thus, the main issues to deal with are usually ‘what should this method do?’ and ‘what state(s) can I expect this method to be executed from?’.
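
As a rough illustration of the first clue, reflection can list the methods a class declares native, which gives a set of candidates for VM entry points. This is only a sketch: NativeMethodLister and nativeMethodNames are names made up for this example, and a native method may just as well be bound to ordinary library code as to the VM.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of any class library.
class NativeMethodLister {

    // Returns the sorted names of the methods the given class declares native.
    static List<String> nativeMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isNative(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Print the candidates declared directly by java.lang.Object.
        for (String name : nativeMethodNames(Object.class)) {
            System.out.println("java.lang.Object." + name);
        }
    }
}
```

The list only narrows the search; which VM function each method ends up bound to still has to be read from the native sources.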

Library calls made from the VM are much more subtle. First of all, they of course vary from VM to VM. Some VMs may choose to call up to the library much more often than others. Certainly, this will be the case with a Java-based VM where the boundaries between the two are less clear, being delimited by packages rather than language/linking boundaries (though this applies a little in the opposite direction too). Over the past few weeks, I’ve been attempting to get JamVM to run with the OpenJDK class library, simply by swapping GNU Classpath for the rt.jar from OpenJDK*. This has revealed a number of cases where both JamVM and HotSpot call methods in their corresponding class libraries and thus depend on their presence.

As mentioned briefly last time, one of the first things I noticed was that it was assumed that the VM boot process would link As CACAO has also done, I had to add this to JamVM’s boot process in initialiseNatives.

Another area where an unspoken relationship between the VM and the class library exists is in the 1.4 JNI NIO support. The VM needs to be able to create a java.nio.DirectByteBuffer which maps onto a native buffer underneath. Both OpenJDK/HotSpot and GNU Classpath/JamVM do this in a similar way, but there are two notable differences between them:

  1. JamVM relies on GNU Classpath’s java.nio.DirectByteBufferImpl$ReadWrite class, which doesn’t exist on OpenJDK. Sun’s implementation is simply java.nio.DirectByteBuffer.
  2. The pointer to the buffer is passed from HotSpot to the OpenJDK class library as a jlong (a Java long, not to be confused with a C long, which may vary in size; Java longs are always 64-bit). GNU Classpath encapsulates pointers in a wrapper class called either Pointer32 or Pointer64, both of which have the common superclass of gnu.classpath.Pointer. This has the advantage of making it clearer that the number being stored is a pointer, with the disadvantage of having to create an object instance.
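
To make the trade-off in the second point concrete, here is a minimal sketch of the two styles side by side. Everything in it is hypothetical: NativePointer, NativePointer32, NativePointer64 and BufferHandles are stand-ins invented for this example, not the real gnu.classpath classes or the real DirectByteBuffer constructors.

```java
// Hypothetical stand-ins for the two styles; not the real gnu.classpath.Pointer
// classes and not the real java.nio.DirectByteBuffer.

// Wrapper style: the type itself says "this is a native address".
abstract class NativePointer {
    abstract long address();
}

final class NativePointer32 extends NativePointer {
    private final int data;

    NativePointer32(int data) {
        this.data = data;
    }

    // Widen the 32-bit value as unsigned so high addresses stay positive.
    @Override
    long address() {
        return data & 0xFFFFFFFFL;
    }
}

final class NativePointer64 extends NativePointer {
    private final long data;

    NativePointer64(long data) {
        this.data = data;
    }

    @Override
    long address() {
        return data;
    }
}

class BufferHandles {

    // Raw style: no allocation, but any long at all is accepted as an address.
    static String describeRaw(long address, int capacity) {
        return "buffer@0x" + Long.toHexString(address) + "[" + capacity + "]";
    }

    // Wrapper style: callers must hand over something that is explicitly a pointer.
    static String describeWrapped(NativePointer pointer, int capacity) {
        return describeRaw(pointer.address(), capacity);
    }

    public static void main(String[] args) {
        System.out.println(describeRaw(0x7f00L, 16));
        System.out.println(describeWrapped(new NativePointer64(0x7f00L), 16));
    }
}
```

The wrapper costs one object per buffer, but a capacity or some other stray number can no longer be passed where an address is expected.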

For now, I’ve altered JamVM to use the OpenJDK/HotSpot way of doing things, but it may be worth providing a Pointer class in OpenJDK, as it does have safety advantages. The only other minor difference is that the GNU Classpath constructor seems to take a few more arguments, but the same values were being provided by both JamVM and the superconstructor call in the OpenJDK class library.

Both issues highlighted so far have fairly trivial solutions and involve minimal interaction between the VM and class library. However, the biggest interaction point between the two is at boot-time. It is the VM that is started by the user (or in the case of OpenJDK, the launcher which then starts the VM via JNI). However, the actual code needs to be executing with direct reference to the class library, not the VM. This, again, differs between native and non-native VMs, where the former have to also make the switch from native code to Java code before the user code can be executed.

The HotSpot team’s runtime overview gives a good account of the boot process from HotSpot’s perspective, although it doesn’t go into full detail on the interaction with the class library. Here, we concern ourselves mainly with the call to JNI_CreateJavaVM by the launcher, which takes us into src/share/vm/prims/jni.cpp in the HotSpot VM, and the class and thread initialisation that follows.

Most of the work (2-12) actually goes on in src/share/vm/runtime/thread.cpp and the create_vm method. The main part of this which is of interest for VM->library interaction is the process by which the native OS thread is linked to a java.lang.Thread object. HotSpot makes three calls to the class library:

  1. (create_initial_thread_group) It creates an instance of java.lang.ThreadGroup using its no-arg constructor. This constructor is private and so can only be accessed by HotSpot for this purpose. This default constructor creates the root thread group, “system”.
  2. (create_initial_thread_group) It creates another java.lang.ThreadGroup instance using the public constructor which takes a parent group and a name. It uses this to create a child of “system” called “main”.
  3. (create_initial_thread) It calls the public constructor of java.lang.Thread to create a thread in the “main” group with the name “main”.
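
The shape of these three calls can be mimicked from ordinary Java code, with one caveat: the private no-arg ThreadGroup constructor is not reachable this way, so the sketch below substitutes a normal group for the root. BootThreadSketch and the “demo-” group names are invented for this example, and the code is not a transcription of what HotSpot does in C++.

```java
// Hypothetical demonstration class; it mirrors the hierarchy described above
// using only public constructors.
class BootThreadSketch {

    // Builds root group -> child group -> thread, following steps 1 to 3.
    static Thread createInitialThread() {
        // Step 1 stand-in: the real root comes from a private no-arg
        // constructor, so an ordinary named group plays its part here.
        ThreadGroup root = new ThreadGroup("demo-system");
        // Step 2: the public constructor taking a parent group and a name.
        ThreadGroup main = new ThreadGroup(root, "demo-main");
        // Step 3: a public Thread constructor taking a group and a name.
        return new Thread(main, "main");
    }

    public static void main(String[] args) {
        Thread thread = createInitialThread();
        ThreadGroup group = thread.getThreadGroup();
        System.out.println(thread.getName() + " in " + group.getName()
                + " in " + group.getParent().getName());
    }
}
```

Because the stand-in root is created with the public single-argument constructor, it is itself a child of whichever group the calling thread belongs to; only the VM gets to create a group with no parent.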

Apart from this, it initialises several core classes including java.lang.String, java.lang.reflect.Method, java.lang.ref.Finalizer and java.lang.Class, along with a number of exception and error classes such as java.lang.OutOfMemoryError. Additional classes are initialised if some options are enabled, such as java.util.HashMap if aggressive optimisations are turned on.

Although parts of this are specific to HotSpot, there is an implicit assumption in the class library that, for example, certain classes will have been initialised before the VM is fully operational. Thus another implementing VM has to do the same, calling the same internal methods such as the private thread group constructor described above. GNU Classpath has similar requirements, and in both cases, it is important we make such requirements explicit.

Note that GNU Classpath’s handling of threads differs considerably and I think there is some room for improvement here. JamVM does not pass or create any thread groups. An internal constructor expects a java.lang.VMThread which is then stored in java.lang.Thread and used for later calls such as start. Notably, the VM has to remember to set the group after the constructor concludes, and the hierarchy is different; GNU Classpath provides a root group called “main” which is created on java.lang.Thread class initialisation and JamVM places the main thread in that. There is no “system” group.

Both solutions leave the management of the thread itself to the VM, and it’s hard to see what the advantage is of GNU Classpath storing the VMThread. Calling general VMThread methods with the Thread instance may be a better solution. Where possible, general work should be moved from the VM into the class library, and so it would be better if the group addition were handled by the constructor of Thread rather than relying on the VM to remember to do it.

I’d welcome comments on how different VMs handle this bootup process and what is the best solution for ensuring that the method contract between VM and class library is well documented and adhered to.

* Note that this means we still use JamVM as the launcher for now. CACAO’s OpenJDK implementation takes a different approach with CACAO merely providing a replacement into a JDK tree and thus using the same launcher as HotSpot.