Don’t use class literals as type-tokens

Generics were added to the Java language in J2SE 5.0, and there was much rejoicing. It was finally possible to deal with containers in a type-safe manner. Before generics were available, Java developers had to do things like this:

List people = new ArrayList();
people.add(new Person("Donkey Kong"));
people.add(new Person("Guybrush Threepwood"));

Person pirate = (Person) people.get(1);

This kind of code is very fragile, since it is not easy to keep track of what is inside a container. If the object you retrieve at runtime is not of the type you're expecting, you get a ClassCastException. It is also remarkably easy to pollute a container by shoving objects of different types in there, which makes it even more difficult to keep track of what it holds. Workarounds included littering code with instanceof checks, or creating a wrapper class around the container (for example, a class called PeopleList that delegates to an internal List instance) so that you had control over the types of objects being inserted.
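As a rough illustration of the wrapper workaround, here is a minimal sketch of what such a PeopleList might have looked like. The Person class and its getName method are hypothetical stand-ins, and the raw List is deliberate, since this mimics pre-5.0 code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Person class used in the snippets above.
class Person {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Pre-generics style wrapper: only Person objects can get in,
// so the single cast on the way out cannot fail.
@SuppressWarnings({"rawtypes", "unchecked"})
class PeopleList {
    private final List people = new ArrayList(); // raw type, as in pre-5.0 code

    void add(Person person) {
        people.add(person);
    }

    Person get(int index) {
        return (Person) people.get(index);
    }

    int size() {
        return people.size();
    }
}

class PeopleListDemo {
    public static void main(String[] args) {
        PeopleList people = new PeopleList();
        people.add(new Person("Donkey Kong"));
        people.add(new Person("Guybrush Threepwood"));
        System.out.println(people.get(1).getName()); // prints "Guybrush Threepwood"
    }
}
```

The cost is obvious: you need one such wrapper per element type, which is exactly the boilerplate generics were meant to remove.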

When generics finally arrived, people were ecstatic because now you could do things like this:

List<Person> people = new ArrayList<Person>();
people.add(new Person("Donkey Kong"));
people.add(new Person("Guybrush Threepwood"));

Person pirate = people.get(1); //It just works!

This meant no more ugly workarounds, which means that things are awesome! Right?

Unfortunately not. When Sun introduced generics to Java, they had to do so in a manner that maintained backwards compatibility. Specifically, they had to maintain binary compatibility with existing Java code, and also had to make sure that existing libraries could work with generic types. Sun's solution was to implement generics through type erasure. What this means is that at runtime, the type arguments of an object are not retained (the exceptions are declarations such as fields, method signatures, and superclasses, whose generic signatures are kept in the class file and can be read through reflection). Instead, the Java compiler generates bytecode that performs explicit casts, essentially making it seem (at the bytecode level) as if you hadn't used generics at all. In most cases this was fine, since people still got the compile-time type safety they desired. However, type erasure also caused other problems; people soon discovered that the following was impossible:
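A quick way to see erasure for yourself is to compare the runtime classes of two differently parameterized lists. The class and method names in this sketch are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when both lists share a single runtime class,
    // i.e. when their type arguments have been erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
        System.out.println(new ArrayList<String>().getClass().getName()); // prints "java.util.ArrayList"
    }
}
```

Neither object remembers whether it was created as a list of String or a list of Integer; both are just ArrayList.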

public class Foo<T> {
    private T data;
    public Foo() {
        data = new T(); //doesn't compile!
    }
}

This came as a surprise mainly because people weren't (and some still aren't) aware that generics in Java are type-erased, especially since in most other cases generics "just work". This is no fault of their own: Java is often advertised as supporting generic programming without any mention of how generics are implemented. In my opinion, this is a perfect example of abstraction leakage, since the details of how generics are implemented leak out and programmers need to know those details to understand why some usages of generics aren't valid.

So how do you write code that can find out what T is at runtime? The recommended solution was to use class literals as runtime type-tokens. One of the changes made with the introduction of generics was to make the Class class generic as well. Where previously the language just had Class, it now has Class<T>, which lets you do things like this:

public class Foo<T> {
    private T data;
    public Foo(Class<T> type) { //public, so that callers can pass the token in
        data = type.newInstance(); //exception handling omitted for brevity
    }
}
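For a plain, non-generic type the class-literal token does work. The sketch below is a hypothetical variant of Foo, named Holder here, with the exception handling filled in. It uses getDeclaredConstructor().newInstance() rather than Class.newInstance(), since the latter has been deprecated since Java 9.

```java
// Hypothetical variant of Foo that creates its payload from a class-literal token.
class Holder<T> {
    private final T data;

    Holder(Class<T> type) {
        try {
            // Requires an accessible no-argument constructor on the token's class.
            data = type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + type.getName(), e);
        }
    }

    T get() {
        return data;
    }
}

class HolderDemo {
    public static void main(String[] args) {
        Holder<StringBuilder> holder = new Holder<StringBuilder>(StringBuilder.class);
        holder.get().append("ok");
        System.out.println(holder.get()); // prints "ok"
    }
}
```

Note that the token only helps when the class can actually be instantiated: passing an interface such as Runnable.class compiles but fails at runtime, which this sketch reports as an IllegalArgumentException.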

But this is a limited solution that is also semantically incorrect. This becomes evident when you try to do something like this:

Foo<Set<String>> fooSetString = new Foo(Set<String>.class); // halp

This code doesn't compile, because a class literal cannot carry type arguments. The next thing to try might be to just pass in Set.class which, while it is not exactly what you want, should hopefully be good enough. But no dice:

error: incompatible types: Class<Set> cannot be converted to Class<Set<String>>
    Foo<Set<String>> fooStr = new Foo<Set<String>>(Set.class);

WTF? We're back to square one! The compiler is effectively asking for a Class<Set<String>>, which we can't produce with a class literal at all! I think this is another great example of abstraction leakage, and it leads to a strange dichotomy between compile-time and runtime semantics. The non-obvious "solution" is to perform some extremely ugly typecasts:

//my eyes!!! (compiles, but only with an unchecked-cast warning)
Foo<Set<String>> fooSetString = new Foo<Set<String>>((Class<Set<String>>) (Class<?>) Set.class);

Why does this happen? Because a Class instance is a poor substitute for what you actually want to communicate: the type. Due to type erasure, there is no runtime difference between, say, a Set<String> and a Set<Integer>; both are represented by Set.class. This is just as well, since there is only one Set class anyway. There are ways to communicate the full type, though. A popular one is to create an ad-hoc anonymous subclass of a generic type; the type arguments named in the subclass declaration are recorded in the class file and can be read back through reflection. This is how the Guava library does it (with its TypeToken class), and it makes it very easy to deal with issues like this.
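To make the anonymous-class trick concrete, here is a minimal sketch of such a type token. TypeRef is a made-up name for illustration, and this is not Guava's implementation; Guava's TypeToken is far more complete. The sketch relies only on the standard reflection call Class.getGenericSuperclass().

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Set;

// Hypothetical minimal type token, meant to be instantiated as an anonymous subclass.
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        // For "new TypeRef<Set<String>>() {}", the generic superclass of the
        // anonymous class is TypeRef<Set<String>>, which the class file records.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            throw new IllegalStateException("TypeRef needs an explicit type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

class TypeRefDemo {
    static Type setOfString() {
        // The trailing {} creates the anonymous subclass that captures Set<String>.
        return new TypeRef<Set<String>>() {}.getType();
    }

    public static void main(String[] args) {
        System.out.println(setOfString().getTypeName()); // prints "java.util.Set<java.lang.String>"
    }
}
```

Unlike Set.class, the captured Type still knows that the element type is String, so a constructor that accepts a token like this can tell Set<String> and Set<Integer> apart.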

Even so, this kind of approach is a band-aid at best. I think at some point Java should implement fully reified generics. I understand the need for backwards compatibility, but we're talking about compatibility with code that was written more than a decade ago. Large organizations are already pretty slow to shift to newer versions anyway, so at some point Oracle should just put a hard stop at some version and say that future versions will have fully reified generics. I don't think preserving backwards compatibility is a good argument when it makes writing code with good type safety this painful (Disclaimer: I am aware Java's type system has a bunch of other issues, but just reifying generics will give us so much!).
