I’ve been meaning to write this blog for a while, but other things have cropped up in the meantime. The topic is something that came up when trying to lower the number of warnings produced by a build of GNU Classpath.

While GNU Classpath has required a 1.5-capable compiler since 0.95, so that we could implement things like java.lang.Enum, the use of generics and the like has largely been applied only to the creation of new classes (like java.util.ServiceLoader) and the suppression of JAPI differences. The internal code, such as that in the gnu.* packages and the private variables within the java.* and javax.* classes, has remained in 1.4 form. As a result, the compiler generates a lot of ‘unchecked’ warnings, mainly due to the use of ‘raw’ collections (I’ll explain what these are shortly). ecj is still our preferred compiler (being the most tested and the one we had available as Free Software first) and it generates far more warnings by default than Sun’s javac. At present count, Classpath CVS generates just over 10,000 warnings with ecj 3.3. We clearly need to cut this down so we can spot real problems. In the past, we’ve simply turned them off, but it’s better for the codebase in general if these are properly cleaned up, and in some cases doing so leads to bugs being discovered.

So what’s the problem? Well, mainly it’s a case of most of the GNU Classpath code still looking like:

List list = new ArrayList();
list.add("Potato");

or:

Map map = new HashMap();
map.put("key", new Value());

Map, List, ArrayList and HashMap are now referred to as raw types because the versions in Java 1.5 and above have one or more type parameters. These type parameters can be used to tell us what is stored inside the collection. We should be using Map<K,V>, List<T>, ArrayList<T> and HashMap<K,V>, the parameterized types. The type parameters (K, V and T in this case) can be used in methods to specify that the type of an argument or return value depends on the type given when the collection is created. Thus, Map<K,V> has:

V put(K key, V value)

as opposed to:

Object put(Object key, Object value)

K is the type of the keys used for the map, and V is the type of the values. In our original example, our Map should be replaced with Map<String,Value> because it maps keys of type String to values of type Value. The main advantage of using these is that you no longer need to cast elements when they are returned from the collection. So a get call on a List<T> or Map<K,V> returns a value of type T or V respectively, not simply an Object which has to be cast manually by the user. In reality, backwards compatibility means these casts are still inserted by the compiler, but it only does so where it has determined the cast to be safe.
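To make the difference concrete, here is a minimal sketch of the same lookup written both ways. The class and method names are made up for illustration, and a String stands in for the Value class from the earlier snippet so that the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class RawVersusParameterized
{
  // 1.4 style: the raw Map stores Objects, so the caller must cast.
  // This compiles, but the put call draws an 'unchecked' warning.
  static String rawLookup()
  {
    Map map = new HashMap();
    map.put("key", "Potato");
    return (String) map.get("key");
  }

  // 1.5 style: the compiler knows both types, so no cast is written.
  static String typedLookup()
  {
    Map<String,String> map = new HashMap<String,String>();
    map.put("key", "Potato");
    return map.get("key");
  }

  public static void main(String[] args)
  {
    System.out.println(rawLookup());
    System.out.println(typedLookup());
  }
}
```

Both methods return the same value; only the second lets the compiler check what goes into the map.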

So how do we clean up these warnings? It means going through the code and adding appropriate type parameters to our collections. This isn’t always as easy as it sounds. In some cases, the collection will only be used with one type so it’s simply a matter of determining what that is (usually by looking for the casts when objects are retrieved from the collection — these casts can soon be removed). However, because raw collections take Objects as input, there can be a mix of types so a common supertype has to be found.

In some cases, this has to be Object. So, you may think, what’s the point of turning Map into Map<Object,Object>? Surely they are the same thing. No, they’re not, and this is one of the more interesting aspects of collections, and one you’ll come across especially when you have to use a raw collection coming from legacy code without generating unchecked warnings. The closest parameterized equivalent of the raw type Map is actually Map<?,?>, where ? represents a wildcard. Wildcards allow the use of existential types; instead of having a strict instantiation of a type parameter, we can refer to any type within certain bounds. By default, a wildcard has an upper bound of Object and a lower bound of null, i.e. conceptually Map<?,?> is the same as Map<? extends Object super null, ? extends Object super null>. (That combined notation is only for illustration; in real Java source a wildcard may be given an extends bound or a super bound, but not both.) What does this say in English? It says that the keys and values of the map are of some type that extends Object but we don’t know exactly what. In contrast, Map<Object,Object> says that the key and value types are exactly Object, so anything may be stored in it.
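The practical difference shows up in what each type will accept. The following is only a sketch with hypothetical names (WildcardVersusObject, countEntries, addMarker): a Map<?,?> parameter takes any parameterization but can only be read from, while a Map<Object,Object> parameter can be written to but takes nothing else.

```java
import java.util.HashMap;
import java.util.Map;

class WildcardVersusObject
{
  // Accepts a map with any key and value types. Entries can only be
  // read back as Objects; nothing (other than null) can be put in.
  static int countEntries(Map<?,?> map)
  {
    int count = 0;
    for (Object key : map.keySet())
      count++;
    return count;
  }

  // Accepts only a Map<Object,Object>, but may add anything to it.
  static void addMarker(Map<Object,Object> map)
  {
    map.put("marker", Integer.valueOf(1));
  }

  public static void main(String[] args)
  {
    Map<String,Integer> typed = new HashMap<String,Integer>();
    typed.put("one", Integer.valueOf(1));
    Map<Object,Object> general = new HashMap<Object,Object>();

    // Any parameterization fits Map<?,?>.
    System.out.println(countEntries(typed));
    addMarker(general);
    // addMarker(typed) would be rejected by the compiler:
    // a Map<String,Integer> is not a Map<Object,Object>.
    System.out.println(countEntries(general));
  }
}
```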

We don’t want to work with wildcard types practically because we can’t have variables of type ?. Instead, we use them when importing and exporting from parameterized types. For example, to import objects from a collection, addAll is not defined as:

void addAll(Collection<T> t)

because doing so would mean that we can only take objects from collections of exactly the same type. Instead, we want to also allow collections of some subtype. For example, a collection of Objects should be allowed to be filled from a Collection of Strings. The above signature doesn’t allow this, but:

void addAll(Collection<? extends T> t)

does. This says that the collection from which the elements are to be taken must contain objects of some type which is a subtype of T. A similar example for super is the use of Comparator<? super T>. When we want something that can compare two objects of type T, we can of course use something that compares two elements of exactly type T, but we can also use a more general comparison method that compares some supertype. For example, a Comparator<Object> can be used to compare Strings. In this case, we can’t go the other way; it would be inappropriate to try using a Comparator for Integers to compare general Number instances.
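As a rough sketch of both bounds in use: addAll and Collections.sort are the real library methods, while the class name, the helper methods and the BY_TEXT comparator below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class BoundedWildcards
{
  // Hypothetical general-purpose comparator: orders any two objects
  // by their string form.
  static final Comparator<Object> BY_TEXT = new Comparator<Object>()
  {
    public int compare(Object a, Object b)
    {
      return a.toString().compareTo(b.toString());
    }
  };

  static List<Object> gather(List<String> source)
  {
    List<Object> target = new ArrayList<Object>();
    // On a List<Object>, addAll takes Collection<? extends Object>,
    // so a List<String> is accepted.
    target.addAll(source);
    return target;
  }

  static List<String> sorted(List<String> source)
  {
    List<String> copy = new ArrayList<String>(source);
    // For a List<String>, sort takes Comparator<? super String>,
    // so a Comparator<Object> is accepted.
    Collections.sort(copy, BY_TEXT);
    return copy;
  }

  public static void main(String[] args)
  {
    List<String> fruits = Arrays.asList("Strawberry", "Banana", "Pineapple");
    System.out.println(gather(fruits));
    System.out.println(sorted(fruits));
  }
}
```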

The most confusing aspect of dealing with generics is how to handle legacy code where you can’t make the incoming type a parameterized type. Take the following legacy method:

  public List createFruits()
  {
    List x = new ArrayList();
    x.add("Strawberry");
    x.add("Banana");
    x.add("Pineapple");
    return x;
  }

This returns a raw type, List. Now imagine we can’t see the body of the method. All we know is that the method returns a List; we don’t know what is in that list. This is how the compiler sees the method.

We of course know that it contains String objects. So we try and pass it to a method that takes a List<String>:

  public void printList(List<String> l)
  {
    for (String s : l)
      System.out.println(s);
  }

The obvious way to do this is:

printList(createFruits());

and this will compile, but it produces an unchecked warning:

warning: [unchecked] unchecked conversion
found   : java.util.List
required: java.util.List<java.lang.String>
    printList(createFruits());

So we take the obvious solution to this from the old 1.4 days and cast it:

printList((List<String>) createFruits());

Again, this compiles but we get a different unchecked warning:

warning: [unchecked] unchecked cast
found   : java.util.List
required: java.util.List<java.lang.String>
    printList((List<String>) createFruits());

So we get a warning with the cast, and one without. What do we do?

The answer is to take a step back and think about what this incoming List really is. As we noted before, the equivalent of List in the new 1.5 world is List<?> so:

List<?> l = (List<?>) createFruits();

No warnings, so far so good. This is deemed a safe cast because we are merely telling the compiler to move from the 1.4 to 1.5 version of the same thing. But how do we change this into a List<String>?

Remember that the ? wildcard without explicit bounds is telling us that the most we know about the contents of the list is that they are some subtype of Object. As such, the only thing we can safely retrieve them as is Objects:

    List<Object> newList = new ArrayList<Object>();
    for (Object o : l)
      newList.add(o);
    printList(newList);

This takes each Object from the list and puts it in a new list which holds Objects. By doing so we have removed the doubt about what is in the collection and told the compiler to simply treat them all as Objects. What we have now is equivalent to what we thought we had to start with. However, this still won’t work with our printList method:

printList(java.util.List<java.lang.String>) in Test cannot be applied to (java.util.List<java.lang.Object>)
    test.printList(newList);

This is an error, so the code will now not even compile. This is good; we don’t want the compiler to allow us to pass collections of mere Objects to methods requiring collections of Strings. This would take us straight back to the Java 1.4 days. The solution is to create a List<String> instead of a List<Object> and check that each object is a String in the body of the loop which adds them to the collection:

    List<String> newList = new ArrayList<String>();
    for (Object o : l)
      if (o instanceof String)
        newList.add((String) o);
    printList(newList);
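If the same conversion is needed in several places, the loop can be wrapped up once. The following is only a sketch under stated assumptions: LegacyBridge, copyOf and fruitsAsStrings are hypothetical names, not Classpath code, and createFruits is repeated as a stand-in for legacy code we cannot change (so it still draws its own unchecked warnings). Class.isInstance and Class.cast are the standard 1.5 methods.

```java
import java.util.ArrayList;
import java.util.List;

class LegacyBridge
{
  // Stand-in for the legacy method; we pretend we cannot edit it.
  static List createFruits()
  {
    List x = new ArrayList();
    x.add("Strawberry");
    x.add("Banana");
    x.add("Pineapple");
    return x;
  }

  // Hypothetical helper: copies those elements of source that are
  // instances of type into a new, properly typed list.
  static <T> List<T> copyOf(List<?> source, Class<T> type)
  {
    List<T> result = new ArrayList<T>();
    for (Object o : source)
      if (type.isInstance(o))
        result.add(type.cast(o));
    return result;
  }

  static List<String> fruitsAsStrings()
  {
    List<?> l = (List<?>) createFruits();
    return copyOf(l, String.class);
  }

  public static void main(String[] args)
  {
    for (String s : fruitsAsStrings())
      System.out.println(s);
  }
}
```

Like the loop above, this silently skips elements of the wrong type; throwing an exception instead may be the better choice where a mismatch indicates a bug.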

Finally we have working code which has no warnings, while still using the legacy code. This, however, can be quite inefficient; in some cases, we want to avoid iterating over the entire collection just to check it when we know a check is going to happen anyway. A common place where this happens is using the addAll method of collections. The elements will be cast anyway when they are later retrieved from the typed list (the compiler inserts such a cast at each retrieval), so an element of the wrong type would still be caught. In these cases, we can use the annotation @SuppressWarnings("unchecked") to turn off the warning we know is superfluous.

This should be used with care. It should also cover the minimum area possible to avoid suppressing other warnings. Annotations can go on individual local variable declarations, so there is no need to suppress warnings for the entire method. For example, here is Classpath’s getAnnotation method:

  public <T extends Annotation> T getAnnotation(Class<T> annotationClass)
  {
    // Inescapable as the VM layer is 1.4 based.
    @SuppressWarnings("unchecked")
      T ann = (T) cons.getAnnotation(annotationClass);
    return ann;
  }

cons.getAnnotation will return something of type Annotation as our VM layer is strictly 1.4 only. As we know from the input class that the annotation will be of type T, we can forcibly apply a cast and disable the warning. Note that the suppression applies only to the one line, and the explicit assignment of ann is used to allow this (an annotation can’t be attached to a return statement). We also add a comment to explain the reasoning behind adding this annotation. This should be used sparingly and where possible generics should be used properly. In some cases, we don’t even need to convert the collection; retrieving the size can be achieved simply from a List<?> or similar.
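The same pattern applies to the addAll case mentioned earlier. This is a sketch with made-up names (NarrowSuppression, importFruits, countFruits), again using a stand-in for the legacy createFruits method; it assumes we really do know that the legacy list holds only Strings.

```java
import java.util.ArrayList;
import java.util.List;

class NarrowSuppression
{
  // Stand-in for the legacy method; we pretend we cannot edit it.
  static List createFruits()
  {
    List x = new ArrayList();
    x.add("Strawberry");
    x.add("Banana");
    x.add("Pineapple");
    return x;
  }

  static List<String> importFruits()
  {
    List<String> fruits = new ArrayList<String>();
    // Assumption: createFruits is known to add Strings only.
    // The suppression covers this one declaration, nothing else.
    @SuppressWarnings("unchecked")
      List<String> legacy = (List<String>) createFruits();
    fruits.addAll(legacy);
    return fruits;
  }

  // No conversion is needed at all just to ask for the size.
  static int countFruits()
  {
    List<?> legacy = createFruits();
    return legacy.size();
  }

  public static void main(String[] args)
  {
    System.out.println(importFruits());
    System.out.println(countFruits());
  }
}
```

If the assumption about the contents is ever wrong, the failure shows up later as a ClassCastException where an element is retrieved as a String, which is why the comment next to the annotation matters.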

My thanks to Joshua Bloch and his ‘Effective Java’ book for finally explaining some of the solutions documented here. I didn’t realise until reading this that annotations could be applied to such a narrow scope or that it was safe to cast to a wildcard type from a raw type. This has enabled me to clean up a lot of the Classpath code.

Congratulations also to the IcedTea team, especially the OpenJDK Debian Team, for getting openjdk-6 into sid!

And finally, congratulations to Mark and Petri on the birth of their son, Jonas :)