Understanding Generics with Collections

In Java (prior to 5.0), a lot of times you are compelled to downcast your object to a more specific one. For example, when you add a String to a List, and when you want to retrieve your String back then you need to downcast.

Downcast is inevitable. Moreover, adding objects of any type to the list is allowed and the developer is responsible to remember the type of each object and perform the appropriate downcast while retrieving them. This gives a way to the type safety problems in Java, as every downcast in the code is a potential case for the wicked ClassCastException. Generics have been introduced to rescue us from these situations, they let you mark the collection to contain elements of a particular data type only, say a List of Strings. The syntax to specify a List of Strings would look like this. (Note: ‘type’ refers to any class definition in Java, e.g. String, Integer, Collection, MyClass etc. are all denoted as types.)

The syntax is fairly simple, you need to specify the ‘type’ of the Collection in the angular brackets following the Collection type. Such types are known as Generic types or Parameterized types. We’ll get to know more about defining collection types and substitution in the following sections.

With the above syntax, it is not allowed to add objects or retrieve objects of any other type other than String to the above List and doing that would result in a compile-time error. This is much better than the ClassCastException at runtime, and would definitely save a lot of your development time, isn’t it? And more importantly the downcast should be made extinct now, as the type of the elements within the collection is explicity informed to the compiler through the Generic syntax.

Purpose of Generics
Generics make your program well formed enabling the compiler to perform enough type checks based on the static type information provided and avoids unexpected type errors that could occur at runtime. Let us get into more practical matters.

Collections and Substitution rules
Collections are the primary motivation for Generics in Java.

Let us take a look at some substitution rules. The following are some legal assignments:

Here is the substitution principle for collections, (Rule 1) RHS of the assignment should contain a Collection implementation compatible with that of LHS and generified with the same type as that of LHS. The last assignment two assignments are valid, these are allowed to provide compatibility of non-generic (prior to Java 5) code with the new generic approach and vice versa. But, if you compile your code with -Xlint:unchecked option, the last assignment results in a unchecked conversion warning. (Rule 2) Do not ignore such compilation warnings, as they indicate your code to be unsafe (could break at runtime with ClassCastExceptions).

Does this compile? No, Iterator is not a generic type and hence the assignment of iterator’s element to the String ‘s’ (line 6) fails with a compilation error. Basically, in line 4 we lost the type information while obtaining the iterator and so we need an explicit cast here. Hence, you need to make the Iterator parameterized with String type to avoid the explicit cast.

(Rule 3) When you get an iterator, keySet, entrySet or values from a collection, assign to an appropriate parameterized type as shown above. This is because, these methods are modified to return their corresponding generic types to benifit no-cast code. Most of the Java 5 aware IDEs can do this job for you automatically, rely on them.

The following assignments are invalid:

Though String is a subtype of Object, the second assignment is not allowed. Collection of Objects is a bigger set comprising of elements of various types (Strings, Integers, Cats, Dogs etc.), but a Collection of Strings strictly contains Strings and both of these cannot be equated (Rule 4). In programmatic sense if this were allowed, we would end up adding objects of any type to a List of Strings, defying the purpose of generics and hence this is not allowed.

Well, with the above restriction, how would you implement a method that accepts a collection of any type, iterate over it and print the elements? For such purposes, Wildcards are introduced for generic types to represent unknown collections.

Wildcards
We know that Object[] is the supertype of all arrays, similarly Collection<?> is the supertype of all generic collections which is pronounced as “Collection of unknown”. (Note: Collection<?> represents List<?>, ArrayList<?>, HashSet<?> etc. And Collection<?> is only a reference type and you cannot instantiate it, i.e. new ArrayList<?>() or new HashSet<?>() is not allowed.) (Rule 5) Collections parameterized with wildcards cannot be instantiated.

Using Collection<?> we can implement the iterate and print method as shown below.

Is Collection<?> same as plain old Collection? No, there are lot of differences between the plain old Collection, Collection<?> and Collection<Object>.

The following are the differences between them:

  • Collection<?> is a homogenous collection that represents a family of generic instantiations of Collection (i.e. Collection<String>, Collection<Integer> etc.)
  • Collection<Object> is a heterogenous collection or a mixed bag that contains elements of all types, close to the plain old Collection but not same
  • Collection<?> ensures that you don’t add aribtrary objects, as we do not know the type of the collection (Rule 6)
  • Collection<?> cannot be treated as a read-only collection, as it allows remove() and clear() operations
  • You can assign Collection<String> or Collection<Number> to a Collection<?> reference type, but not to a Collection<Object> (Refer to Rule 4)

Collection<?> appears very restrictive as you do not known the type information. When you obtain your elements from this collection you need to work with objects and would sometimes end up in explicit cast. So, strictly encourage Collection<?> when you need no type specific operations (Rule 7). But, there would be very few such use-cases in practice, where as more frequently you may need to perform operations on a base interface and you do not bother about the implementation type. In such cases you can benifit with the ‘bounded wildcards’.

Bounded wildcards
List<? extends Number> is an example of a bounded wildcard. This represents a homogenous List that contains elements that are subtypes of Number. Bounded wildcards only indicate an unknown type which is a subtype of Number.

So, we can obtain elements from the collection assuming the type to be a Number. But, you are not allowed to add anything to the collection as we do not know which subtype of Number the collection contains.

Differences between List<Number> and List<? extends Number>:

  • List<Number> is a heterogenous collection of Number objects (i.e. it can contain instances of Integer, Float, Long, etc.)
  • List<? extends Number> represents a homogenous collection of Number or its subtypes. It is instantiated with any of List<Integer>, List<Float> etc.

“? extends Type” is known as the upper bound, and we also have “? super Type” which is the lower bound where the unknown type denotes a super type of the specified Type. This is rarely useful with collections but could come handy when we define our own generic types.

We’ll see more about defining generic types, generic methods and type erasure semantics in my next posts.

SocialTwist Tell-a-Friend
Trackback

7 comments untill now

  1. Nice Article… waiting for ur next posts 🙂

  2. This is absolutely superb

  3. good work! I will also wait for your next article and possibly ask you to write on a particular thing if you dont mind. I appreciate your work. one more thing how do i keep track of your articles, somebody reply to this in your comments. thanks anyway.

  4. Rajendra @ 2008-04-04 10:19

    Good one.. Nice explaination abt the wildcards in Collections. Thanks allot

  5. Ori K Muthuswami @ 2008-04-11 04:37

    Very nice presentation for the easy understanding of Generics.

  6. hey nice article…

  7. Very Good Article.