Skeleton Coder

Saturday, September 30, 2006

Java Tutorials: Overloading is compile-time binding

Many Java beginners confuse overloading with overriding. The key difference is that an overloaded method is bound at compile time, whereas an overridden method is bound at run time.

Have a look at the following example. There are three classes: Base, Derived and Test. As the names indicate, class Derived extends class Base. The class Test has two overloaded methods named methodA, taking a Base and a Derived parameter respectively.





class Base {

}

class Derived extends Base {

}

class Test {
    public void methodA(Base b) {
        System.out.println("Test.methodA(Base)");
    }

    public void methodA(Derived b) {
        System.out.println("Test.methodA(Derived)");
    }

    public static void main(String[] args) {
        Test t = new Test();
        Base b = new Base();
        Base d = new Derived();

        t.methodA(b);
        t.methodA(d);
    }
}


What is the output?

If your answer is


Test.methodA(Base)
Test.methodA(Derived)

then, perhaps to your surprise, it is wrong. The actual output is



Test.methodA(Base)
Test.methodA(Base)


Surprised?

This is because overloaded methods are bound at compile time. When the compiler sees the line t.methodA(d);
it checks the declared type of 'd', which is 'Base', and not the type of the object that 'd' refers to at run time. So it looks for the method methodA(Base) and binds the call to it, hence the result.
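The contrast between the two kinds of binding can be sketched in one small program. This is only an illustration: the class name OverloadVsOverride and the name() method are made up for the example and are not part of the code above. The overload is chosen from the declared type of the argument, while the overridden name() is chosen from the actual object.

```java
public class OverloadVsOverride {

    static class Base {
        String name() { return "Base"; }
    }

    static class Derived extends Base {
        @Override
        String name() { return "Derived"; }   // overriding: selected at run time
    }

    // Overloading: the compiler selects one of these from the declared argument type.
    static String methodA(Base b) {
        return "methodA(Base), object is a " + b.name();
    }

    static String methodA(Derived d) {
        return "methodA(Derived), object is a " + d.name();
    }

    public static void main(String[] args) {
        Base d = new Derived();

        // Declared type is Base, so methodA(Base) is bound,
        // but name() still dispatches to Derived at run time.
        System.out.println(methodA(d));            // methodA(Base), object is a Derived

        // An explicit cast changes the compile-time type, and with it the overload.
        System.out.println(methodA((Derived) d));  // methodA(Derived), object is a Derived
    }
}
```

The cast in the second call is the usual way to force a particular overload when the declared type is too general.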

Let us look at another common problem: programmers intend to override 'equals' but end up overloading it instead, thereby creating some unforeseen problems.

Have a look at the following code. Are the equals and hashCode methods implemented correctly?





public class EqualsOverloadTest {

    String id;

    public EqualsOverloadTest(String id) {
        this.id = id;
    }

    public boolean equals(EqualsOverloadTest other) {
        return (other != null) && this.id.equals(other.id);
    }

    public int hashCode() {
        return id.hashCode();
    }

}



At first glance, most people would say the equals method is implemented correctly.
It appears to follow all the constraints for the 'equals' method, and 'hashCode()' is implemented consistently with it. But if you look closer, you will notice that this 'equals' method overloads Object.equals(Object) instead of overriding it, because its parameter type is EqualsOverloadTest rather than Object.

To prove that this won't work, here is a simple program. In the main method we create two EqualsOverloadTest objects with the same id, add both objects to a Set, and print the size of the set.





public static void main(String[] args) {
    EqualsOverloadTest first = new EqualsOverloadTest("123");
    EqualsOverloadTest second =
            new EqualsOverloadTest(new String("123"));

    System.out.println(first.equals(second));

    Set set = new HashSet();
    set.add(first);
    set.add(second);
    System.out.println(set.size());
}






We would expect the size of the Set to be '1', since the two objects are equal. But it prints '2'.
This is because we did not override the 'equals' method. The first check returned true because the argument was declared as EqualsOverloadTest, so the compiler bound the call to equals(EqualsOverloadTest). But within the set, the call is to equals(Object), which our class does not implement, so Object.equals(Object) is used, and that only checks whether both references point to the same instance. Hence we get the unexpected behaviour.
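For comparison, here is a minimal sketch of the same class with 'equals' actually overridden. The class name EqualsOverrideTest and the helper distinctCount() are made up for this example. The @Override annotation (available since Java 5) is worth adding: if it were placed on an equals(EqualsOverrideTest) method, the compiler would reject it, which catches the mistake early.

```java
import java.util.HashSet;
import java.util.Set;

public class EqualsOverrideTest {

    private final String id;

    public EqualsOverrideTest(String id) {
        this.id = id;
    }

    // The parameter type is Object, so this overrides Object.equals(Object).
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof EqualsOverrideTest)) {
            return false;   // also covers other == null
        }
        return id.equals(((EqualsOverrideTest) other).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }

    // Adds two objects with the same id to a set and returns the set size.
    static int distinctCount() {
        Set<EqualsOverrideTest> set = new HashSet<EqualsOverrideTest>();
        set.add(new EqualsOverrideTest("123"));
        set.add(new EqualsOverrideTest(new String("123")));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount());   // prints 1
    }
}
```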



Summary:
Overloading is a static or compile-time binding and Overriding is dynamic or run-time binding.

Sunday, September 17, 2006

Java Tutorials: i = i++

This is one of the frequently asked interview questions. It is usually asked about 'C', but sometimes about 'Java' as well.

What is the output of the program?


int i = 0;
i = i++;
System.out.println(i);


In C, the behaviour of this statement is undefined and left to the implementation, as Dennis Ritchie mentions in the book "The C Programming Language". Some implementations produce '0', but the result cannot be relied upon.

Let us see what the output is in 'Java'.
When this question was put to many people, most of them immediately answered '1', with the following explanation.

The line

i=i++;

is equivalent to

i=i;
i++;


That is, the value of 'i' (0) is first assigned to the left-hand side, which is 'i' itself. Then 'i' is incremented to '1', hence the result '1'.

Unfortunately, this answer is wrong. Let us see why.

i++ is the post-increment operator. The expression evaluates to the value the variable had before the increment. It does not mean that the increment is postponed until the rest of the statement has completed. That is, the value of 'i' is first saved in a temporary location, then 'i' is incremented, and then the actual operation (in this case the assignment) is performed using the value in the temporary location.

Hence,

i = i++;

is equivalent to,

int temp = i; // temp = 0
i++; // i=1
i = temp; // i = 0


Hence we get the result as '0'.
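The equivalence can be checked directly. The following is a small sketch (the class and method names are made up for the example) that runs the original statement, the three-step expansion, and, for contrast, the pre-increment form.

```java
public class PostIncrementDemo {

    // The statement from the question.
    static int assignPostIncrement() {
        int i = 0;
        i = i++;
        return i;
    }

    // The three-step expansion described above.
    static int explicitTemp() {
        int i = 0;
        int temp = i;   // temp = 0
        i++;            // i = 1
        i = temp;       // i = 0
        return i;
    }

    // Pre-increment: the expression evaluates to the value after the increment.
    static int assignPreIncrement() {
        int i = 0;
        i = ++i;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(assignPostIncrement());  // prints 0
        System.out.println(explicitTemp());         // prints 0
        System.out.println(assignPreIncrement());   // prints 1
    }
}
```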

To get a clear understanding, let us look at the byte code for this operation.


public class Increment {

    public static void main(String[] args) {
        int i = 0;
        i = i++;
        // System.out.println(i); // commented out to avoid unnecessary lines
    }

}


Compile the file 'Increment.java'.

Java Byte Code:

In the command line, type,
> javap -c Increment

This will display the methods and the byte code for the Increment class. The byte code will be displayed in the form of mnemonics - human readable form.

The following lines will be printed:

C:\Projects\test\src>javap -c Increment
Compiled from "Increment.java"
public class Increment extends java.lang.Object{
public Increment();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   iconst_0
   1:   istore_1
   2:   iload_1
   3:   iinc    1, 1
   6:   istore_1
   7:   return
}


Let us concentrate on the byte code for the main method. To follow it, you need a basic understanding of the Java Virtual Machine architecture.
So let us start with a brief introduction to the Java Virtual Machine.

Java Virtual Machine
The JVM is stack based. That is, for each operation, the operands are pushed onto a stack and then popped off the stack to perform the operation. There is another data structure, typically an array, that stores the local variables.
The local variables are identified by ids, which are simply indexes into this array. For a non-static method, the 'this' reference is at index '0', followed by the method parameters and then the other local variables. For static methods, the parameters start at index '0', followed by the local variables.

Program Illustration

Let us look at the mnemonics in main() method line by line.

Stack and Local Variables Array before the start of main() method.


|     |     args    i
|-----|    -------------
|     |    |     |     |
|-----|    -------------
|_____|
 Stack     Local Variables


0: iconst_0
The constant value '0' is pushed to the stack.


|     |     args    i
|-----|    -------------
|     |    |     |     |
|-----|    -------------
|  0  |
 -----
 Stack     Local Variables


1: istore_1
The top element of the stack is popped out and stored in the local variable with index '1'. That is 'i'.


|     |     args    i
|-----|    -------------
|     |    |     |  0  |
|-----|    -------------
|     |
 -----
 Stack     Local Variables


2: iload_1
The value of the local variable at index '1' (that is, 'i') is pushed onto the stack.



|     |     args    i
|-----|    -------------
|     |    |     |  0  |
|-----|    -------------
|  0  |
 -----
 Stack     Local Variables

3: iinc 1, 1
The local variable at index '1' is incremented by '1'. The stack is not touched.


|     |     args    i
|-----|    -------------
|     |    |     |  1  |
|-----|    -------------
|  0  |
 -----
 Stack     Local Variables

6: istore_1
The value at the top of the stack is popped and stored in the local variable at index '1'. That is, '0' is assigned to 'i'.


|     |     args    i
|-----|    -------------
|     |    |     |  0  |
|-----|    -------------
|     |
 -----
 Stack     Local Variables


Hence, we get the result as '0', and not '1'.
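The walkthrough above can be replayed with a toy model of the operand stack and the local variable array. This is only a sketch of the five instructions, not a real JVM: the class name StackMachineSketch is made up, and slot 0 (which holds 'args' in the real method) is modelled as an unused int.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackMachineSketch {

    // Replays the byte code of main() by hand and returns the final value of i.
    static int run() {
        Deque<Integer> stack = new ArrayDeque<Integer>();  // operand stack
        int[] locals = new int[2];                         // slot 0: args (unused here), slot 1: i

        stack.push(0);            // 0: iconst_0   push the constant 0
        locals[1] = stack.pop();  // 1: istore_1   i = 0
        stack.push(locals[1]);    // 2: iload_1    push the current value of i (0)
        locals[1] += 1;           // 3: iinc 1, 1  i = 1, the stack still holds 0
        locals[1] = stack.pop();  // 6: istore_1   i = 0, the increment is overwritten

        return locals[1];
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 0
    }
}
```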

Unlike C/C++, this behaviour is guaranteed in Java.

I hope this article gives a good insight into the Java language and the Java VM.

Friday, September 15, 2006

Java Tutorials: ArrayList or Vector?

This is one of the classic questions that every Java beginner has in mind, and it is also a popular interview question. Following are the differences between ArrayList and Vector.

1. The Vector and Hashtable classes have been available since the initial JDK 1.0, whereas ArrayList and HashMap were added as part of the new Collections API in JDK 1.2.

2. Vector and Hashtable are synchronized, whereas ArrayList and HashMap are unsynchronized.

When to use Vector? When to use ArrayList?

1. ArrayList is faster than Vector because it is unsynchronized. So, if the List will be modified by only one thread, use ArrayList. If the list is a local variable that is not handed to another thread, you can always use ArrayList.
2. If the List will be accessed by multiple threads, use Vector; otherwise you have to take care of synchronization yourself.
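As one possible way of taking care of synchronization yourself, the Collections API offers Collections.synchronizedList, which wraps any List so that each individual call is synchronized. The sketch below is an assumption about how one might use it, not part of the original program: the class name SynchronizedListSketch and the helper fillFromThreads are made up, and it uses a lambda, so it needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListSketch {

    // Starts the given number of threads, each adding perThread elements
    // to one shared list, and returns the final size of the list.
    static int fillFromThreads(int threads, final int perThread) {
        // Every method call on the wrapper is synchronized on the wrapper itself.
        final List<String> list = Collections.synchronizedList(new ArrayList<String>());

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(Integer.toString(i));
                }
            });
            workers[t].start();
        }

        // Wait for all workers instead of polling Thread.activeCount().
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromThreads(10, 5000));   // prints 50000
    }
}
```

Note that, just as with Vector, only single calls are protected. A sequence such as "check the size, then remove" still needs an explicit synchronized block around it.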

To visualize the problem with synchronization, try the following code.

There is a Producer class that adds 5000 elements to the List (ArrayList/Vector). Another class, the Consumer class, removes 5000 elements from the same list. There are 10 producer threads and 10 consumer threads.



import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class Producer implements Runnable {

    private List list;

    public Producer(List pList) {
        list = pList;
    }

    public void run() {
        System.out.println("Producer started");
        for (int i = 0; i < 5000; i++) {
            list.add(Integer.toString(i));
        }
        System.out.println("Producer completed");
    }

}




class Consumer implements Runnable {
    private List list;

    public Consumer(List pList) {
        list = pList;
    }

    public void run() {
        System.out.println("Consumer started");
        for (int i = 0; i < 5000; i++) {
            while (!list.remove(Integer.toString(i))) {
                // Just iterating till an element is removed
            }
        }
        System.out.println("Consumer completed");
    }
}



public class ListTest {

    public static void main(String[] args) throws InterruptedException {
        // List list = new Vector();
        List list = new ArrayList();

        for (int i = 0; i < 10; i++) {
            Thread p1 = new Thread(new Producer(list));
            p1.start();
        }

        for (int i = 0; i < 10; i++) {
            Thread c1 = new Thread(new Consumer(list));
            c1.start();
        }
        Thread.yield();

        while (Thread.activeCount() > 1) {
            Thread.sleep(100);
        }

        System.out.println(list.size());
    }

}



Try running the program with ArrayList. You are likely to see a number of ArrayIndexOutOfBoundsExceptions. The Consumer threads then keep waiting for elements that will never be added, because the Producer that should have added them terminated when it threw the exception.

Now, change the line,
List list = new ArrayList();

to
List list = new Vector();

and run the program.

Now you can see a proper result.

This clearly explains why you should use Vector class when there are multiple threads in the system.

In this program, even if you remove the Consumer class and the Consumer threads, you can see that the Producers themselves throw exceptions.

This is because, when adding an element, ArrayList checks the capacity of its backing array. If the capacity is not sufficient, a new array is created and the elements are copied into it. If a thread context switch happens in the middle of this, we may get an ArrayIndexOutOfBoundsException. At other times there may be no exception at all, but some elements will be missing, along with other unexpected behaviour.

So always use Vector if there are multiple threads. The same rule applies to HashMap vs Hashtable and StringBuilder vs StringBuffer.

Summary:
1. Use Vector if there are multiple threads and ArrayList if there is only a single thread.
2. Use Hashtable if there are multiple threads and HashMap if there is only a single thread.
3. Use StringBuffer if there are multiple threads and StringBuilder if there is only a single thread.

Wednesday, September 13, 2006

Java Tutorials: Why no 'unsigned'?

In the Java programming language, there is no concept of 'unsigned' integers. One question that arises in the minds of most beginners is this: why are there no 'unsigned' integers in Java?

In languages like C and C++, you might have used unsigned integers extensively for variables like 'age' that can never be negative. Programmers coming from those languages prefer the same style of declaring non-negative values as unsigned. But Java does not allow that.

Before answering why unsigned is not supported in Java, let us see the problems that can happen because of 'unsigned' type.

Let us say we store the age of a person.

unsigned int age;

Now we have a problem: what if the age of the person is not known? We come up with the great plan of using the value '-1' to mean 'not known'.

The requirement is that users may perform some action only if their age is above 18. If they did not specify their age, they should be blocked. Let us see how to write the code...


if (age < 18) {
    printf("You should be above 18 years to access this feature");
    return;
}


On seeing the code, it is common to assume that, if the user did not specify the age, '-1' is less than '18', so the user will automatically be blocked.

But if you check the C specification, you will find that when a 'signed' and an 'unsigned' operand of the same size appear in the same expression, the signed one is automatically converted to 'unsigned'.

This means the value '-1' is converted into its unsigned counterpart, i.e. 0xFFFF for a 16-bit int (0xFFFFFFFF for a 32-bit int). This is definitely greater than '18', so the user is allowed to continue with the activity that should have been blocked.
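The effect can be reproduced in Java, which since Java 8 has helper methods on Integer that treat the bits of an int as an unsigned value. The sketch below is an illustration under that assumption (the class and method names are made up). It compares the same bit pattern, '-1', against 18 first as a signed and then as an unsigned number.

```java
public class UnsignedCompareSketch {

    // The check as most people read it: signed comparison.
    static boolean blockedWhenSigned(int age) {
        return age < 18;
    }

    // What the C code effectively does: the same bits compared as unsigned.
    static boolean blockedWhenUnsigned(int ageBits) {
        return Integer.compareUnsigned(ageBits, 18) < 0;
    }

    public static void main(String[] args) {
        int unknownAge = -1;   // the "not known" marker

        System.out.println(blockedWhenSigned(unknownAge));          // prints true
        System.out.println(blockedWhenUnsigned(unknownAge));        // prints false
        System.out.println(Integer.toUnsignedString(unknownAge));   // prints 4294967295
    }
}
```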

This is just a trivial example, but problems like this are very difficult to trace in real-life projects. To avoid such potential problems, Java left out the 'unsigned' keyword.

This reason may look weird, but there is not much to gain from introducing 'unsigned' when compared to the drawbacks. Still, the omission of unsigned data types is controversial, and there are cases where unsigned could help. For example, an unsigned type doubles the maximum value that fits in the same number of bits. Most other arguments revolve around this one. But this can be solved by using a wider type: instead of unsigned int we can use 'long', and instead of unsigned short we can use 'int'. Only 'long' has no wider primitive type to fall back on; the alternative there is the 'BigInteger' class.
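The "use a wider type" workaround usually comes down to a bit mask. The following sketch (class and method names are made up for the example) reads 16-bit, 32-bit and 64-bit patterns as unsigned values by widening them to int, long and BigInteger respectively.

```java
import java.math.BigInteger;

public class WiderTypeSketch {

    // unsigned short -> int: keep the low 16 bits.
    static int unsignedShort(short bits) {
        return bits & 0xFFFF;
    }

    // unsigned int -> long: keep the low 32 bits.
    static long unsignedInt(int bits) {
        return bits & 0xFFFFFFFFL;
    }

    // unsigned long -> BigInteger: add 2^64 when the sign bit is set.
    static BigInteger unsignedLong(long bits) {
        BigInteger value = BigInteger.valueOf(bits);
        return bits >= 0 ? value : value.add(BigInteger.ONE.shiftLeft(64));
    }

    public static void main(String[] args) {
        System.out.println(unsignedShort((short) -1));  // prints 65535
        System.out.println(unsignedInt(-1));            // prints 4294967295
        System.out.println(unsignedLong(-1L));          // prints 18446744073709551615
    }
}
```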

There are lots of 'Requests For Enhancement (RFE)' to add unsigned. Here is one: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4504839. It lists a lot of reasons for having unsigned integers, but almost all of them revolve around the 'size'.

Let us look at whether it is possible to add the new feature without the drawback mentioned above.

It looks possible to add the new keyword without losing the type safety of Java, that is, without losing the title 'strongly typed language'. Use the same level of type checking, and strictly require explicit casts to convert between unsigned and signed.

Let us say for 'unsigned int', java adds the keyword 'uint'. Then,


uint a = 10;
int b = 20;

uint c = a + b; // should throw compilation error.


We should instead use,


uint a = 10;
int b = 20;

uint c = a + (uint)b; // should work


This would preserve the strong type checking of Java while satisfying the needs of programmers: a win-win situation for both the Java architects and Java programmers.