Welcome Message

Hi, welcome to my website. This is a place where you can find the questions, puzzles and algorithms asked in interviews, together with their solutions. Feel free to contact me if you have any queries or suggestions, and please leave your valuable comments. Thanks for visiting. -Pragya.

November 20, 2009

Very Good Java Interview Questions

What if the main method is declared as private?

The program compiles properly, but at runtime it fails with a message such as "Main method not public."

What is meant by pass by reference and pass by value in Java?

Pass by reference means, passing the address itself rather than passing the value. Pass by value means passing a copy of the value.

If you're overriding the equals() method of an object, which other method should you also consider overriding?

hashCode()
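
A minimal sketch of why the two methods go together, assuming a hypothetical Point class that is not part of the original question. Hash-based collections such as HashSet find an equal object only when equals() and hashCode() agree:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class, used only for illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    // Equal objects must return equal hash codes,
    // so hash exactly the fields that equals() compares.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        // The lookup succeeds only because equals() and hashCode() agree.
        System.out.println(points.contains(new Point(1, 2))); // prints true
    }
}
```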

What is Byte Code?

Or

What gives Java its "write once and run anywhere" nature?

All Java programs are compiled into class files that contain bytecode. This bytecode can run on any platform that has a Java Virtual Machine, and hence Java is said to be platform independent.

Explain the reason for each keyword of public static void main(String args[])?

public: main(..) is the first method called by the Java environment when a program is executed, so it has to be accessible from the Java environment. Hence the access specifier has to be public.

static: The Java environment should be able to call this method without creating an instance of the class, so this method must be declared static.

void: main does not return anything, so the return type must be void.

String args[]: String is the type of the arguments given on the command line, and args is the array that holds those strings.

What are the differences between == and .equals() ?

Or

what is difference between == and equals

Or

Difference between == and equals method

Or

What would you use to compare two String variables - the operator == or the method equals()?

Or

How is it possible for two String objects with identical values not to be equal under the == operator?

The == operator compares two objects to determine if they are the same object in memory i.e. present in the same memory location. It is possible for two String objects to have the same value, but located in different areas of memory.

== compares references while .equals compares contents. The method public boolean equals(Object obj) is provided by the Object class and can be overridden. The default implementation returns true only if the object is compared with itself, which is equivalent to the equality operator == being used to compare aliases to the object. String, BitSet, Date, and File override the equals() method. For two String objects, value equality means that they contain the same character sequence. For the Wrapper classes, value equality means that the primitive values are equal.

public class EqualsTest {

    public static void main(String[] args) {
        String s1 = "abc";
        String s2 = s1;                // alias of s1
        String s5 = "abc";             // same interned literal as s1
        String s3 = new String("abc"); // a distinct object
        String s4 = new String("abc"); // another distinct object
        System.out.println("== comparison : " + (s1 == s5));
        System.out.println("== comparison : " + (s1 == s2));
        System.out.println("Using equals method : " + s1.equals(s2));
        // Parentheses are needed here: without them the string concatenation
        // is evaluated first and only "false" is printed.
        System.out.println("== comparison : " + (s3 == s4));
        System.out.println("Using equals method : " + s3.equals(s4));
    }
}

Output
== comparison : true
== comparison : true
Using equals method : true
== comparison : false
Using equals method : true

What if the static modifier is removed from the signature of the main method?

Or

What if I do not provide the String array as the argument to the method?

The program compiles, but at runtime it fails with a "NoSuchMethodError" because no main method with the expected signature can be found.

Why oracle Type 4 driver is named as oracle thin driver?

Oracle provides a Type 4 JDBC driver, referred to as the Oracle “thin” driver. This driver includes its own implementation of a TCP/IP version of Oracle’s Net8 written entirely in Java, so it is platform independent, can be downloaded to a browser at runtime, and does not require any Oracle software on the client side. This driver requires a TCP/IP listener on the server side, and the client connection string uses the TCP/IP port address, not the TNSNAMES entry for the database name.

What is the difference between final, finally and finalize? What do you understand by the java final keyword?

Or

What is final, finalize() and finally?

Or

What is finalize() method?

Or

What is the difference between final, finally and finalize?

Or

What does it mean that a class or member is final?

o final - declares a constant, or a method or class that cannot be overridden or extended
o finally - block that runs after try/catch whether or not an exception occurred
o finalize - method called on an object before it is garbage collected

Variables defined in an interface are implicitly final. A final class can't be extended, i.e. it may not be subclassed. This is done for security reasons with basic classes like String and Integer. It also allows the compiler to make some optimizations and makes thread safety a little easier to achieve. A final method can't be overridden in a subclass. The value of a final variable can't be changed once it has been assigned (it is a constant). The finalize() method is called by the garbage collector on an object before its memory is reclaimed. finally is a keyword used in exception handling; the finally block is executed whether or not an exception is thrown. For example, open connections are typically closed in a finally block.
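
As a small illustration of final and finally, here is a sketch with a hypothetical FinalFinallyDemo class. The names MAX_RETRIES and run are made up for this example. The method records the order in which the blocks run, with and without an exception:

```java
class FinalFinallyDemo {
    // A final variable cannot be reassigned after it has been initialized.
    static final int MAX_RETRIES = 3;

    // Returns a log of the blocks that ran, in order.
    static String run(boolean fail) {
        StringBuilder log = new StringBuilder();
        try {
            log.append("try ");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException ex) {
            log.append("catch ");
        } finally {
            // Runs whether or not an exception was thrown; a typical place for cleanup.
            log.append("finally");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints: try finally
        System.out.println(run(true));  // prints: try catch finally
    }
}
```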

What is the Java API?

The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets.

What is the GregorianCalendar class?

The GregorianCalendar provides support for traditional Western calendars.

What is the ResourceBundle class?

The ResourceBundle class is used to store locale-specific resources that can be loaded by a program to tailor the program’s appearance to the particular locale in which it is being run.

Why there are no global variables in Java?

Global variables are globally accessible. Java does not support globally accessible variables due to following reasons:

* Global variables break referential transparency.
* Global variables create collisions in the namespace.

How to convert String to Number in java program?

The valueOf() method of the Integer class is used to convert a String to a number. Here is the code example:
String numString = "1000";
int id = Integer.valueOf(numString).intValue();
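
Integer.parseInt is a closely related standard method that returns a primitive int directly. The following is only a sketch: ParseDemo and parseOrDefault are hypothetical names, and returning a fallback value on bad input is an assumption about what a caller might want.

```java
class ParseDemo {
    // Returns the parsed value, or the fallback when the text is not a valid integer.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException ex) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("1000", -1)); // prints 1000
        System.out.println(parseOrDefault("12x", -1));  // prints -1
    }
}
```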

What is the SimpleTimeZone class?

The SimpleTimeZone class is a concrete TimeZone subclass that represents a time zone for use with a Gregorian calendar.

What is the difference between a while statement and a do statement?

A while statement (pre test) checks at the beginning of a loop to see whether the next loop iteration should occur. A do while statement (post test) checks at the end of a loop to see whether the next iteration of a loop should occur. The do statement will always execute the loop body at least once.
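
The difference shows up when the condition is false from the start. A minimal sketch, using a hypothetical LoopDemo class that counts how often each loop body runs:

```java
class LoopDemo {
    // Pre-test loop: the body may run zero times.
    static int whileRuns(int start, int limit) {
        int runs = 0;
        int i = start;
        while (i < limit) {
            runs++;
            i++;
        }
        return runs;
    }

    // Post-test loop: the body always runs at least once.
    static int doWhileRuns(int start, int limit) {
        int runs = 0;
        int i = start;
        do {
            runs++;
            i++;
        } while (i < limit);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(whileRuns(5, 5));   // prints 0
        System.out.println(doWhileRuns(5, 5)); // prints 1
    }
}
```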

What is the Locale class?

The Locale class is used to tailor a program output to the conventions of a particular geographic, political, or cultural region.

Describe the principles of OOPS.

There are three main principles of OOP: Polymorphism, Inheritance and Encapsulation.

Explain the Inheritance principle.

Inheritance is the process by which one object acquires the properties of another object. Inheritance allows well-tested procedures to be reused and enables a change to be made once and take effect in all relevant places.

What is implicit casting?

Implicit casting is the process of simply assigning one entity to another without any transformation guidance to the compiler. This type of casting is not permitted in all kinds of transformations and may not work for all scenarios.

Example

int i = 1000;

long j = i; //Implicit casting

Is sizeof a keyword in java?

No. Java has no sizeof operator, and sizeof is not a keyword.

What is a native method?

A native method is a method that is implemented in a language other than Java.

In System.out.println(), what is System, out and println?

System is a predefined final class, out is a static PrintStream field of System, and println is an overloaded method of PrintStream.

What are Encapsulation, Inheritance and Polymorphism

Or

Explain the Polymorphism principle. Explain the different forms of Polymorphism.

Polymorphism in simple terms means one name many forms. Polymorphism enables one entity to be used as a general category for different types of actions. The specific action is determined by the exact nature of the situation.

Polymorphism exists in three distinct forms in Java:
• Method overloading
• Method overriding through inheritance
• Method overriding through the Java interface

What is explicit casting?

Explicit casting is the process in which the compiler is specifically told to transform a value from one type to another.

Example

double i = 700.20;

int j = (int) i; //Explicit casting, j becomes 700

What is the Java Virtual Machine (JVM)?

The Java Virtual Machine is software that executes Java bytecode and can be ported onto various hardware-based platforms.

What do you understand by downcasting?

The process of Downcasting refers to the casting from a general to a more specific type, i.e. casting down the hierarchy
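
A short sketch of a guarded downcast, using a hypothetical DowncastDemo class. The instanceof check is there because a downcast to the wrong type throws a ClassCastException at runtime:

```java
class DowncastDemo {
    // Returns the length when the object is a String, or -1 otherwise.
    static int lengthIfString(Object o) {
        if (o instanceof String) {
            String s = (String) o; // downcast from Object to String
            return s.length();
        }
        return -1;
    }

    public static void main(String[] args) {
        Object text = "hello";
        Object number = Integer.valueOf(42);
        System.out.println(lengthIfString(text));   // prints 5
        System.out.println(lengthIfString(number)); // prints -1
    }
}
```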

What are Java Access Specifiers?

Or

What is the difference between public, private, protected and default Access Specifiers?

Or

What are different types of access modifiers?

Access specifiers are keywords that determine which code may access a member of a class, such as a method or a variable. These are:
• Public : accessible to all classes
• Protected : accessible to classes within the same package and to subclasses in any package
• Private : accessible only within the class to which they belong
• Default (no keyword) : accessible only to classes within the same package

Which class is the superclass of every class?

Object.

Name primitive Java types.

The 8 primitive types are byte, char, short, int, long, float, double, and boolean.

What is the difference between static and non-static variables?

Or

What are class variables?

Or

What is static in java?

Or

What is a static method?

A static variable is associated with the class as a whole rather than with specific instances of the class. All objects share a single copy of each static variable, i.e. there is only one copy per class, no matter how many objects are created from it. Class variables (static variables) are declared with the static keyword inside a class but outside any method. They are often used for constants and are normally accessed through the class name. A static variable is created when its class is loaded and lives until the class is unloaded, which usually means until the program stops. Like an instance variable, it receives the default value for its data type when it is not explicitly initialized. Similarly, a static method belongs to the class rather than to any object of the class; it does not operate on an object and does not require that any object of the class has been instantiated.
Static methods cannot be overridden, because overriding is resolved from the runtime type of the object and static methods are attached to a class, not an object. A static method in a superclass can be hidden by a static method with the same signature in a subclass, as long as the original method was not declared final. However, you can't replace a static method with a non-static method. In other words, you can't change a static method into an instance method in a subclass.

Non-static variables take on unique values with each object instance.

What is the difference between the boolean & operator and the && operator?

If an expression involving the boolean & operator is evaluated, both operands are always evaluated, whereas the && operator is a short-circuit operator. When an expression involving the && operator is evaluated, the first operand is evaluated. If the first operand evaluates to true, the second operand is evaluated. If the first operand evaluates to false, the evaluation of the second operand is skipped.
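
To make the difference visible, the sketch below counts how often the second operand is evaluated. ShortCircuitDemo, touch and the two counting methods are hypothetical names introduced for this example.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Second operand with a visible side effect.
    static boolean touch() {
        calls++;
        return true;
    }

    // Number of times touch() runs when combined with &.
    static int countNonShortCircuit(boolean first) {
        calls = 0;
        boolean result = first & touch(); // touch() is always evaluated
        return calls;
    }

    // Number of times touch() runs when combined with &&.
    static int countShortCircuit(boolean first) {
        calls = 0;
        boolean result = first && touch(); // touch() is skipped when first is false
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(countNonShortCircuit(false)); // prints 1
        System.out.println(countShortCircuit(false));    // prints 0
        System.out.println(countShortCircuit(true));     // prints 1
    }
}
```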

How does Java handle integer overflows and underflows?

It uses those low order bytes of the result that can fit into the size of the type allowed by the operation.
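
In practice this means that int arithmetic silently wraps around. The sketch below uses a hypothetical OverflowDemo class to show the wrap-around, next to the standard Math.addExact method (Java 8 and later), which throws an ArithmeticException instead of wrapping:

```java
class OverflowDemo {
    // Plain int addition keeps only the low-order 32 bits of the result.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Reports whether a + b would overflow, using Math.addExact.
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        int wrapped = wrappingAdd(Integer.MAX_VALUE, 1);
        System.out.println(wrapped == Integer.MIN_VALUE);  // prints true
        System.out.println(overflows(Integer.MAX_VALUE, 1)); // prints true
        System.out.println(overflows(1, 2));                 // prints false
    }
}
```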

What if I write static public void instead of public static void?

Program compiles and runs properly.

What is the difference between declaring a variable and defining a variable?

In a declaration we only mention the type of the variable and its name without initializing it. Defining means declaration + initialization. E.g. String s; is just a declaration, while String s = new String("bob"); or String s = "bob"; are both definitions.

What type of parameter passing does Java support?

In Java the arguments (primitives and objects) are always passed by value. With objects, the object reference itself is passed by value, so the original reference and the parameter copy both refer to the same object.
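
A small sketch of what this means in practice, with a hypothetical PassByValueDemo class. Changing the object through the copied reference is visible to the caller, but reassigning the parameter is not:

```java
class PassByValueDemo {
    // Mutates the object that both the caller's reference and the parameter point to.
    static void mutate(StringBuilder sb) {
        sb.append(" world");
    }

    // Reassigns only the local copy of the reference; the caller is unaffected.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    static String demo() {
        StringBuilder sb = new StringBuilder("hello");
        mutate(sb);
        reassign(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: hello world
    }
}
```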

Explain the Encapsulation principle.

Encapsulation is the process of binding or wrapping the data and the code that operates on the data into a single entity. This keeps the data safe from outside interference and misuse. Objects allow procedures to be encapsulated with their data to reduce potential interference. One way to think about encapsulation is as a protective wrapper that prevents code and data from being arbitrarily accessed by other code defined outside the wrapper.

What do you understand by a variable?

Variable is a named memory location that can be easily referred in the program. The variable is used to hold the data and it can be changed during the course of the execution of the program.

What do you understand by numeric promotion?

The Numeric promotion is the conversion of a smaller numeric type to a larger numeric type, so that integral and floating-point operations may take place. In the numerical promotion process the byte, char, and short values are converted to int values. The int values are also converted to long values, if necessary. The long and float values are converted to double values, as required.
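
The promoted type can be observed by boxing the result of an expression. This is only a sketch: PromotionDemo and typeOf are hypothetical names, and boxing is used purely as a way to print the result type.

```java
class PromotionDemo {
    // Reports the boxed runtime type of an expression result.
    static String typeOf(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        byte a = 10;
        byte b = 20;
        System.out.println(typeOf(a + b));    // byte + byte is promoted to int: prints Integer
        System.out.println(typeOf('a' + 1));  // char + int is promoted to int: prints Integer
        System.out.println(typeOf(a + 5L));   // int + long is promoted to long: prints Long
        System.out.println(typeOf(5L + 1.5)); // long + double is promoted to double: prints Double
    }
}
```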

What do you understand by casting in java language? What are the types of casting?

The process of converting one data type to another is called Casting. There are two types of casting in Java; these are implicit casting and explicit casting.

What is the first argument of the String array in main method?

The String array is empty. It does not have any element. This is unlike C/C++ where the first element by default is the program name. If we do not provide any arguments on the command line, then the String array of main method will be empty but not null.

How can one prove that the array is not null but empty?

Print array.length. It will print 0. That means it is empty. But if it would have been null then it would have thrown a NullPointerException on attempting to print array.length.

Can an application have multiple classes having main method?

Yes. While starting the application we mention the name of the class to be run. The JVM will look for the main method only in the class whose name you have mentioned. Hence there is no conflict amongst the multiple classes having a main method.

When is static variable loaded? Is it at compile time or runtime? When exactly a static block is loaded in Java?

Static variables are loaded at runtime, when the classloader brings the class into the JVM. It is not necessary that an object has to be created. Static variables are allocated memory space when the class is loaded. The code in a static block is executed only once, when the class is first initialized. A class can have any number of static blocks. A static block is not a member of the class; it has no return statement, cannot be called directly, and cannot contain this or super. Static blocks are primarily used to initialize static fields.

Can I have multiple main methods in the same class?

We can have multiple overloaded main methods, but there can be only one main method with the following signature:

public static void main(String[] args) {}

If a second method with this exact signature is declared, the program fails to compile. The compiler says that the main method is already defined in the class.

Explain working of Java Virtual Machine (JVM)?

The JVM is an abstract computing machine that behaves like a real computing machine. The compiler first converts a .java file into a .class file (a .class file is nothing but a bytecode file); the JVM then loads the class file and its interpreter reads and executes the bytecode.

How can I swap two variables without using a third variable?

Add the two variables and assign the sum to the first variable. Subtract the second variable from that sum and assign the result to the second variable. Then subtract the new second variable from the first variable and assign the result to the first variable. Example:

int a=5,b=10;a=a+b; b=a-b; a=a-b;

Another approach to the same question

You use an XOR swap.

for example:

int a = 5; int b = 10;
a = a ^ b;
b = a ^ b;
a = a ^ b;

What is data encapsulation?

Wrapping data and functions into a single unit is called data encapsulation: the data and its associated methods are combined in such a way that the data can be accessed only with the help of those methods. Encapsulation is typically implemented by creating 'get' and 'set' methods in a class (JavaBean) which are used to access the fields of the object. Typically the fields are made private while the get and set methods are public. Encapsulation can be used to validate the data that is to be stored, to do calculations on data that is stored in a field or fields, or for use in introspection (often the case when using JavaBeans in Struts, for instance). Encapsulation provides data security through data hiding.
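
A minimal sketch of the get/set pattern described above, assuming a hypothetical Person class with one private field. The validation rule in the setter is made up for illustration:

```java
class Person {
    private int age; // hidden from code outside this class

    public int getAge() {
        return age;
    }

    // The setter validates the value before it is stored.
    public void setAge(int age) {
        if (age < 0) {
            throw new IllegalArgumentException("age must not be negative");
        }
        this.age = age;
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.setAge(30);
        System.out.println(p.getAge()); // prints 30
        try {
            p.setAge(-1);
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
        System.out.println(p.getAge()); // prints 30, the invalid value was not stored
    }
}
```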

What is reflection API? How are they implemented?

Reflection is the process of introspecting the features and state of a class at runtime and manipulating them dynamically. This is supported by the Reflection API with built-in classes like Class, Method, Field, Constructor etc. Example: using the Java Reflection API we can get the class name by calling the getName method.
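
A short sketch of the getName example, plus a lookup of public methods by name. ReflectionDemo and its two helper methods are hypothetical; Class.getName, Class.getMethods and Method.getName are standard API calls.

```java
import java.lang.reflect.Method;

class ReflectionDemo {
    // Returns the fully qualified class name of any object.
    static String className(Object o) {
        return o.getClass().getName();
    }

    // Checks whether a type exposes a public method with the given name.
    static boolean hasPublicMethod(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(className("abc"));                      // prints java.lang.String
        System.out.println(hasPublicMethod(String.class, "length")); // prints true
    }
}
```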

Does JVM maintain a cache by itself? Does the JVM allocate objects in heap? Is this the OS heap or the heap maintained by the JVM? Why

Yes, the JVM maintains a cache by itself. It creates objects on a heap that the JVM itself manages, while references held in local variables live on the stack.

What is phantom memory?

Phantom memory is false memory. Memory that does not exist in reality.

Can a method be static and synchronized?

A static method can be synchronized. If you do so, the JVM will obtain a lock on the java.lang.Class instance associated with the class. It is similar to saying:

synchronized(XYZ.class) {

}
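
The equivalence can be sketched with a hypothetical Counter class. One thread uses a static synchronized method and the other an explicit synchronized (Counter.class) block. Both take the same lock, so no increments are lost:

```java
class Counter {
    private static int count = 0;

    // Implicitly locks on Counter.class.
    static synchronized void increment() {
        count++;
    }

    // Explicitly locks on the same Counter.class object.
    static void incrementExplicit() {
        synchronized (Counter.class) {
            count++;
        }
    }

    // Runs both variants concurrently and returns the final count.
    static int runTwoThreads(int perThread) {
        synchronized (Counter.class) {
            count = 0;
        }
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) increment();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) incrementExplicit();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        synchronized (Counter.class) {
            return count;
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1000)); // prints 2000
    }
}
```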

What is difference between String and StringTokenizer?

A String is a sequence of characters, whereas a StringTokenizer is a utility class used to break a string up into tokens.

Example:

StringTokenizer st = new StringTokenizer("Hello World");

while (st.hasMoreTokens()) {

System.out.println(st.nextToken());

}

Output:

Hello

World

November 19, 2009

SQL questions

How do you implement one-to-one, one-to-many and many-to-many
relationships while designing tables?

One-to-One relationship can be implemented as a single table and
rarely as two tables with primary and foreign key relationships.
One-to-Many relationships are implemented by splitting the data into
two tables with primary key and foreign key relationships.
Many-to-Many relationships are implemented using a junction table with
the keys from both the tables forming the composite primary key of the
junction table.

SQL Interview Ques

Define candidate key, alternate key, composite key.

A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys.

A key formed by combining at least two or more columns is called composite key.

What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them. See CREATE DEFAULT in Books Online.

What is an index? What are the types of indexes? How many clustered indexes can be created on a table? I create a separate index on each column of a table. What are the advantages and disadvantages of this approach?

Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker.

Indexes are of two types: clustered indexes and non-clustered indexes. When you create a clustered index on a table, all the rows in the table are stored in the order of the clustered index key, so there can be only one clustered index per table. Non-clustered indexes have their own storage, separate from the table data storage. Non-clustered indexes are stored as B-tree structures (as are clustered indexes), with the leaf-level nodes holding the index key and its row locator. The row locator is either the RID or the clustered index key, depending on the absence or presence of a clustered index on the table.

If you create an index on each column of a table, it can improve query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, because every time data changes in the table, all the indexes need to be updated. Another disadvantage is that indexes need disk space: the more indexes you have, the more disk space is used.

What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?

Cursors allow row-by-row processing of result sets.

Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, whereas a normal SELECT query makes only one roundtrip, however large the result set is. Cursors are also costly because they require more resources and temporary storage (which results in more IO operations). Further, there are restrictions on the SELECT statements that can be used with some types of cursors.

Most of the times, set based operations can be used instead of cursors. Here is an example:

If you have to give a flat hike to your employees using the following criteria:

Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike

In this situation many developers tend to use a cursor, determine each employee's salary and update his salary according to the above formula. But the same can be achieved by multiple update statements or can be combined in a single UPDATE statement as shown below:
UPDATE tbl_emp SET salary =
    CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
         WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
         WHEN salary BETWEEN 55000 AND 65000 THEN salary + 9000
         ELSE salary -- leave salaries outside these ranges unchanged
    END



Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row meets a certain condition. You don't have to use cursors for this. This can be achieved using a WHILE loop, as long as there is a unique key to identify each row.

What is a join and explain different types of joins?

Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.

Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

What is a Stored Procedure?

It's nothing but a set of T-SQL statements combined to perform a single task or several tasks. It's basically like a macro, so when you invoke the stored procedure, you actually run a set of statements.

What is the basic difference between clustered and a non-clustered index?

The difference is that a table can have only one clustered index. The leaf level of a clustered index is the actual data, and the data rows are physically ordered by the clustered index key. In a non-clustered index the leaf level holds pointers to the data rows, so a table can have many non-clustered indexes.

What are cursors?

Cursors help us to operate on a set of data that we retrieve with commands such as SELECT columns FROM table. For example: if we have duplicate records in a table, we can remove them by declaring a cursor which checks the records one by one during retrieval and removes rows which have duplicate values.

Which TCP/IP port does SQL Server run on?

SQL Server runs on port 1433 by default, but we can also change it for better security.

Can we use Truncate command on a table which is referenced by FOREIGN KEY?

No. We cannot use Truncate command on a table with Foreign Key because of referential integrity.

What is the use of DBCC commands?

DBCC stands for database consistency checker. We use these commands to check the consistency of the databases, i.e., maintenance, validation task and status checks.

What is the difference between a HAVING CLAUSE and a WHERE CLAUSE?

The HAVING clause is used with the GROUP BY clause in a query and filters groups after they are formed. The WHERE clause is applied to each row before the rows become part of the GROUP BY grouping.

What is a Linked Server?

Linked Servers is a concept in SQL Server by which we can add other SQL Server to a Group and query both the SQL Server dbs using T-SQL Statements.

Can you link only other SQL Servers or any database servers such as Oracle?

We can link any server provided we have an OLE-DB provider that allows the link. For Oracle, Microsoft provides an OLE-DB provider so that an Oracle server can be added as a linked server to the SQL Server group.

What is BCP? When do we use it?

BulkCopy is a tool used to copy huge amount of data from tables and views. But it won’t copy the structures of the same.

SQL Interview Questions

What is ON DELETE CASCADE?

When ON DELETE CASCADE is specified Oracle maintains referential integrity by automatically removing dependent foreign key values if a referenced primary or unique key value is removed

What is the difference between TRUNCATE and DELETE commands?

Both can delete all the rows in a table. TRUNCATE is a DDL command: in Oracle it cannot be rolled back, and all storage space for the table is released back to the server. TRUNCATE is also much faster. DELETE, in contrast, is a DML command and can be rolled back.

What is the difference between UNION and UNION ALL?

UNION removes the duplicate rows from the result set, while UNION ALL doesn't.

Which system table contains information on constraints on all the tables created ?
The USER_CONSTRAINTS system table contains information on the constraints on all the tables created.

What is the difference between oracle,sql and sql server ?

* Oracle is an RDBMS.
* SQL is Structured Query Language.
* SQL Server is another RDBMS, provided by Microsoft.

What is the difference between SQL and SQL Server ?

SQL Server is an RDBMS from Microsoft, just like Oracle or DB2, whereas Structured Query Language (SQL), pronounced "sequel", is a language that provides an interface to relational database systems. It was developed by IBM in the 1970s for use in System R. SQL is a de facto standard, as well as an ISO and ANSI standard. SQL is used to perform various operations on an RDBMS.

What is the difference between a correlated subquery and a nested subquery?

Correlated subquery runs once for each row selected by the outer query. It contains a reference to a value from the row selected by the outer query.

Nested subquery runs only once for the entire nesting (outer) query. It does not contain any reference to the outer query row.

For example,

Correlated Subquery:

select e1.empname, e1.basicsal, e1.deptno from emp e1 where e1.basicsal = (select max(basicsal) from emp e2 where e2.deptno = e1.deptno)

Nested Subquery:

select empname, basicsal, deptno from emp where (deptno, basicsal) in (select deptno, max(basicsal) from emp group by deptno)

What is database?
A database is a collection of data that is organized so that its contents can easily be accessed, managed and updated. See http://www.webopedia.com/TERM/d/database.html

How can I hide a particular table name of our schema?
You can hide the table name by creating a synonym.

E.g. you can create a synonym y for table x:

create synonym y for x;

What is difference between DBMS and RDBMS?
The main difference between a DBMS and an RDBMS is normalization.

An RDBMS supports normalization, which means removing redundant data and keeping the data consistent.
A plain DBMS has no concept of normalization.

What are the advantages and disadvantages of primary key and foreign key in SQL?

Primary key

Advantages

1) It is a unique key on which all the other candidate keys are functionally dependent

Disadvantage

1) There can be more than one key on which all the other attributes are dependent.

Foreign Key

Advantage

1) It allows referencing another table using the primary key of the other table

Which date function is used to find the difference between two dates?
datediff

For example: select datediff(dd, '2-06-2007', '7-06-2007')

The output is 5 when the strings are read as day-month-year dates (2 June 2007 and 7 June 2007).


What is denormalization and when would you go for it?

As the name indicates, denormalization is the reverse process of normalization. It's the controlled introduction of redundancy into the database design. It helps improve query performance, as the number of joins can be reduced.

What's the difference between a primary key and a unique key?

Both primary key and unique key enforce uniqueness of the column on which they are defined. But by default a primary key creates a clustered index on the column, whereas a unique key creates a nonclustered index by default. Another major difference is that a primary key doesn't allow NULLs, but a unique key allows one NULL only.

November 14, 2009

Depth First Polymorphism

Consider the following class:

public class Polyseme {
    public static class Top {
        public void f(Object o) {
            System.out.println("Top.f(Object)");
        }
        public void f(String s) {
            System.out.println("Top.f(String)");
        }
    }
    public static void main(String[] args) {
        Top top = new Top();
        top.f(new java.util.Vector());
        top.f("hello");
        top.f((Object)"bye");
    }
}

Java looks for the method with the "narrowest" matching class for the declared (compile-time) types of the parameters. Therefore, the output from running this class is:

Top.f(Object)
Top.f(String)
Top.f(Object)

In Java, overloaded methods are matched against the declared type of the reference at compile time, so the search for a matching method effectively starts at the top of the hierarchy and moves down. Say we have the following classes:

public class BreadthFirst {
    public static class Top {
        public void f(Object o) {
            System.out.println("Top.f(Object)");
        }
    }
    public static class Middle extends Top {
        public void f(String s) {
            System.out.println("Middle.f(String)");
        }
    }
    public static void main(String[] args) {
        Top top = new Middle();
        top.f(new java.util.Vector());
        top.f("hello");
        top.f((Object)"bye");
    }
}

Because top is declared as Top, the search starts at Top and checks whether there are any methods which would accept String.class or Object.class, and indeed, Top.f(Object) would handle all those parameters. Middle.f(String) is an overload, not an override, so it is never considered. The output is therefore the following:

Top.f(Object)
Top.f(Object)
Top.f(Object)

We could "fix" this by overriding f(Object) and using instanceof to call the correct f() method (brrr - I'd rather get stuck on the N2 than do that [for those not living in Cape Town, the N2 is notoriously dangerous, you either get shot at or in or with if your car breaks down])

public class BreadthFirstFix {
    public static class Top {
        public void f(Object o) {
            System.out.println("Top.f(Object)");
        }
    }
    public static class Middle extends Top {
        public void f(Object o) {
            if (o instanceof String)
                f((String)o);
            else
                super.f(o);
        }
        public void f(String s) {
            System.out.println("Middle.f(String)");
        }
    }
    public static void main(String[] args) {
        Top top = new Middle();
        top.f(new java.util.Vector());
        top.f("hello");
        top.f((Object)"bye");
    }
}

The output would now look as we would expect:

Top.f(Object)
Middle.f(String)
Middle.f(String)

This might have the correct effect, but it does mean that we need such a silly "instanceof" in all the subclasses. If we are designing an OO framework, we want our clients to subclass our classes without having to do acrobatics to achieve this.

Christoph Jung mentioned this problem with Java to me a few weeks ago and we thought of some code you could put at the highest level class that uses reflection to start at the lowest class and then tries to match the method to the type before moving up the hierarchy. I call this "depth-first-polymorphism".

import java.lang.reflect.*;
public class DepthFirst {
public static class Top {
private Method getPolymorphicMethod(Object param) {
try {
Class cl = getClass(); // the bottom-most class
// we start at the bottom and work our way up
Class[] paramTypes = {param.getClass()};
while(!cl.equals(Top.class)) {
try {
// this way we find the actual method
return cl.getDeclaredMethod("f", paramTypes);
} catch(NoSuchMethodException ex) {}
cl = cl.getSuperclass();
}
return null;
}
catch(RuntimeException ex) { throw ex; }
catch(Exception ex) { return null; }
}
public void f(Object object) {
Method downPolymorphic = getPolymorphicMethod(object);
if (downPolymorphic == null) {
System.out.println("Top.f(Object)");
} else {
try {
downPolymorphic.invoke(this, new Object[] {object});
}
catch(RuntimeException ex) { throw ex; }
catch(Exception ex) {
throw new RuntimeException(ex.toString());
}
}
}
}
public static class Middle extends Top {
public void f(String s) {
System.out.println("Middle.f(String)");
}
}
public static class Bottom extends Middle {
public void f(Integer i) {
System.out.println("Bottom.f(Integer)");
}
}
public static class RockBottom extends Bottom {
public void f(String s) {
System.out.println("RockBottom.f(String)");
}
}
public static void main(String[] args) {
Top top = new RockBottom();
top.f(new java.util.Vector());
top.f("hello");
top.f(new Integer(42));
top = new Bottom();
top.f(new java.util.Vector());
top.f("hello");
top.f(new Integer(42));
}
}

The answer is this time:

Top.f(Object)
RockBottom.f(String)
Bottom.f(Integer)
Top.f(Object)
Middle.f(String)
Bottom.f(Integer)

When should you use this technique? Only if you have a lot of specific type handlers as subclasses of a common superclass where it would make sense to add such a depth-first invoker. You can probably extract this functionality and put it in a separate class. If you use this commercially please do the exception handling correctly, I didn't bother in my example, in preparation for when I change my logo to "The C# Specialists"
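The suggestion to extract the functionality into a separate class could look roughly like the sketch below. The names DepthFirstInvoker, Handler and StringHandler are made up for this illustration and are not part of the original example; the lookup rule (exact match on the runtime class of the argument, climbing from the bottom-most class upwards) is the same one used in DepthFirst above.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of the original example.
final class DepthFirstInvoker {
    private DepthFirstInvoker() {}

    // Walks from the target's runtime class up to (but excluding) stopAt and
    // invokes the first method called 'name' whose single parameter type is
    // exactly the runtime class of 'arg'. Returns false if none was found.
    static boolean invoke(Object target, Class<?> stopAt, String name, Object arg) {
        for (Class<?> cl = target.getClass(); cl != null && cl != stopAt; cl = cl.getSuperclass()) {
            Method method;
            try {
                method = cl.getDeclaredMethod(name, arg.getClass());
            } catch (NoSuchMethodException notHere) {
                continue; // nothing at this level, try the superclass
            }
            try {
                method.setAccessible(true);
                method.invoke(target, arg);
                return true;
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException("Could not invoke " + method, ex);
            }
        }
        return false;
    }
}

// Hypothetical superclass that delegates to the helper.
class Handler {
    final StringBuilder log = new StringBuilder();

    public void f(Object o) {
        if (!DepthFirstInvoker.invoke(this, Handler.class, "f", o)) {
            log.append("Handler.f(Object);");
        }
    }
}

class StringHandler extends Handler {
    public void f(String s) {
        log.append("StringHandler.f(String);");
    }
}

class DepthFirstInvokerDemo {
    public static void main(String[] args) {
        Handler h = new StringHandler();
        h.f("hello");              // dispatched to StringHandler.f(String)
        h.f(Integer.valueOf(42));  // no f(Integer) anywhere, falls back
        System.out.println(h.log);
    }
}
```

The helper wraps reflection failures in an unchecked exception only to keep the sketch short; real code would want to unwrap and report them properly.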

November 13, 2009

Fail fast and fail safe iterator

Fail Fast :
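When we iterate over a collection and modify it structurally during the iteration (other than through the iterator itself), the iteration halts and we get a ConcurrentModificationException (e.g. the iterators of ArrayList, HashMap, and the collection views of Hashtable).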

Fail Safe :
The iteration works on a separate copy (snapshot) of the collection, so modifying the collection during the iteration won't throw an exception (e.g. CopyOnWriteArrayList). Note that the plain HashMap is not an example of this: its iterators are fail-fast.
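Here is a small sketch of the difference. The choice of ArrayList as the fail-fast collection and CopyOnWriteArrayList as the snapshot-based one is mine, and IteratorDemo and failsFast are made-up names for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class IteratorDemo {
    // Removes an element while iterating over the list.
    // Returns true if the iterator noticed the change and failed fast.
    static boolean failsFast(List<String> list) {
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        // ArrayList iterators are fail-fast
        System.out.println(failsFast(new ArrayList<>(Arrays.asList("a", "b", "c"))));
        // CopyOnWriteArrayList iterates over a snapshot taken when the iterator was created
        System.out.println(failsFast(new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"))));
    }
}
```

The first call prints true and the second prints false. In both cases the element "a" is gone from the list afterwards; the only difference is how the iterator reacts.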

Best Explanation (with example) : http://www.certpal.com/blogs/2009/09/iterators-fail-fast-vs-fail-safe/

How to Make a Java Class Immutable

Making a class immutable

Immutability should be familiar to everyone from the String and StringBuffer classes in Java. Strings are considered immutable because the contents of a String object cannot be changed once it has been created; operations that appear to modify a String return a new object. StringBuffer, by contrast, is considered mutable because its contents can be changed in place.

However, I always wondered how to make our own user-defined classes immutable, even though I was unsure why anyone would need this.

The reason perhaps might be clear once we have a look at the code.

Now, in order to make a class immutable, we must prevent the state of an object from changing by any means once it has been constructed. This in turn means preventing assignment to its variables, which we can achieve with the final modifier. To further restrict access, we can use the private access modifier. Above all, do not provide any method that modifies the instance variables.

Are we done? No. What if somebody creates a subclass of our so-far immutable class? Here lies the problem. The new subclass can contain methods that override our base class (immutable class) methods, and the overriding methods can return values that change over time, so the object no longer behaves as immutable.

Hence, make the methods in the class final as well. Or, a better approach: make the immutable class itself final. Then no subclasses can be created, so there is no question of overriding.
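To see why the subclass matters, here is a minimal sketch of the problem. Price and SneakyPrice are made-up classes for this illustration. The fields of Price are private and final, but because neither the class nor its getter is final, a subclass can make a Price reference appear to change.

```java
// Hypothetical example: the field is private and final, but the class is not final.
class Price {
    private final int cents;

    Price(int cents) {
        this.cents = cents;
    }

    public int getCents() {
        return cents;
    }
}

// The subclass cannot touch 'cents', but it can override the getter
// and answer from its own, mutable field instead.
class SneakyPrice extends Price {
    private int current;

    SneakyPrice(int cents) {
        super(cents);
        this.current = cents;
    }

    void setCents(int cents) {
        this.current = cents;
    }

    @Override
    public int getCents() {
        return current;
    }
}

class ImmutabilityDemo {
    public static void main(String[] args) {
        SneakyPrice sneaky = new SneakyPrice(100);
        Price price = sneaky;                 // looks like an immutable Price to its users
        System.out.println(price.getCents()); // 100
        sneaky.setCents(1);
        System.out.println(price.getCents()); // 1
    }
}
```

Declaring Price as final (or at least getCents() as final) makes SneakyPrice a compile-time error, which is exactly the point of the advice above.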

The following code gives a way to make the class immutable.

/*
This code demonstrates the way to make a class immutable
*/

// The immutable class which is made final
final class MyImmutableClass
{
// instance var are made private & final to restrict the access

private final int count;
private final double value;

// Constructor where we can provide the constant value
public MyImmutableClass(int paramCount,double paramValue)
{
count = paramCount;
value = paramValue;
}

// provide only methods which return the instance var
// & not change the values

public int getCount()
{
return count;
}

public double getValue()
{
return value;
}
}

// class TestImmutable
public class TestImmutable
{
public static void main(String[] args)
{
MyImmutableClass obj1 = new MyImmutableClass(3,5);

System.out.println(obj1.getCount());
System.out.println(obj1.getValue());

// there is no way to change the values of count & value-
// no method to call besides getXX, no subclassing, no public access to var -> Immutable
}
}

The possible use of immutable classes would be a class containing a price list represented for a set of products.
Otherwise also this represents a good design.

http://www.sap-img.com/java/how-to-make-a-java-class-immutable.htm

What is ant

What is Ant? A simple definition might state that Ant is a Java-based build tool. Of course, that definition may just raise the question in your mind: "What is a build tool?" To answer that question, consider what is required to build a software system. Typically, there is much more to building software than just typing in and then compiling the source code. There are a number of steps required to transform the source into a deployable and usable software solution. The following is a hypothetical build process you might use with a simple software system:

1. Get the source. You may need to download or fetch the source from a source code repository. For this, you might need to know the tag or version of the source code you want to build.
2. Prepare a build area. You will probably want to create a set of directories, perhaps according to some standardized directory layout.
3. Configure the build. In this step, you will determine what optional components can be built based on the current environment. You might want to set build numbers and version numbers to be included in the build.
4. Validate the source code. You may have a standard style guide and you wish to ensure all code conforms to this before you build a release.
5. Compile the source code
6. Build the compiled code into libraries potentially including non-code resources such as properties, images and sound files.
7. Run the system's tests to validate the build.
8. Build the documentation for the software. This may range from something as simple as collecting text files up to processing content through some form of publishing system to produce the documentation in its final form
9. Package up all of the components of the software – code, resources, images, documentation, etc. – into a deployable package. You might need to produce several packages in different formats for different target users
10. Deploy the software to some standard location for use or distribution

This is a high-level view of a software build process. A real-life build process may of course require many more and varied steps. Each of these steps may involve many individual operations.

If you try to use a manual process for building your software you would find it to be tedious, error prone and, in general, not very repeatable. You might forget to set the version number or to provide a tar file for Unix users. You might change the directory structure, confusing users who upgrade from the previous version of the software. Even worse, you may forget to test the software and ship a version that may not even work. Such ad-hoc build processes are always a source of problems and the best solution is to automate the build process. The tools you use to automate the build process are known, unsurprisingly, as build tools. Ant is such a tool.

In general, a build tool allows developers and build managers to describe the build process. In a build, some steps may need to be executed before others or they may be independent of others. In the example above, while the steps are laid out as a linear sequence, there are certain dependencies evident. Obviously, the source cannot be compiled until it has been fetched and the directory structure has been built. The directory structure, however, can be created before, or even while the source is being fetched. Fetching the source may be an expensive operation – perhaps accessing a remote server – and you may not want to do that every time you build the software. A build tool helps by only performing the operations that are required.

In summary, a build tool provides a mechanism to describe the operations and structure of the build process. When the build tool is invoked it uses this description, examines the current state of affairs to determine the steps that need to be executed and in which order, and then manages the execution of those steps.
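To make the idea of "describe the steps and their dependencies, then let the tool work out the order" a bit more concrete, here is a toy sketch in plain Java. It is not how Ant is implemented and uses no Ant API; MiniBuild, target and plan are made-up names, and the sketch only works out the execution order. It does not check whether a step is already up to date.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of a build description: each target names the targets it depends on.
class MiniBuild {
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    void target(String name, String... dependencies) {
        dependsOn.put(name, Arrays.asList(dependencies));
    }

    // Returns the targets needed for 'goal', dependencies first, each at most once.
    List<String> plan(String goal) {
        Set<String> ordered = new LinkedHashSet<>();
        visit(goal, ordered, new LinkedHashSet<String>());
        return new ArrayList<>(ordered);
    }

    private void visit(String name, Set<String> done, Set<String> inProgress) {
        if (done.contains(name)) {
            return; // already scheduled
        }
        if (!dependsOn.containsKey(name)) {
            throw new IllegalArgumentException("Unknown target: " + name);
        }
        if (!inProgress.add(name)) {
            throw new IllegalStateException("Circular dependency at: " + name);
        }
        for (String dependency : dependsOn.get(name)) {
            visit(dependency, done, inProgress);
        }
        inProgress.remove(name);
        done.add(name);
    }
}

class MiniBuildDemo {
    static MiniBuild sampleBuild() {
        MiniBuild build = new MiniBuild();
        build.target("fetch");
        build.target("prepare");
        build.target("compile", "fetch", "prepare");
        build.target("test", "compile");
        build.target("package", "compile", "test");
        return build;
    }

    public static void main(String[] args) {
        System.out.println(sampleBuild().plan("package"));
        System.out.println(sampleBuild().plan("compile"));
    }
}
```

Asking for "package" yields fetch, prepare, compile, test, package, while asking for "compile" only yields fetch, prepare, compile. Steps that the goal does not depend on are simply never scheduled.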

Often developers will start the automation of their build process with scripting tools supplied by the operating system. Depending on your platform that might be a shell script, a batch file, a Perl script or, indeed, whatever scripting language is your particular preference. These scripts can be quite sophisticated but they usually fail to capture the structure of the build process and can often become a maintenance headache. Each time your code is changed, say adding a new source directory, you will need to update the build script. Since the build scripts are ad-hoc, new developers on your team may not understand them. Further, in many instances, such scripts perform all of the steps of a build even when only some are required. That is OK for build managers who will want clean builds but quickly becomes difficult for developers who may do many builds in a development session and need fast, incremental builds.

Java-Based

Those of you who are familiar with make will realize that the above description of a build tool is also satisfied by make. That isn't surprising since make is also a build tool. Makefiles provide the build description and make then organizes the execution of shell commands to build the software. One major difference between Ant and make is that Ant executes tasks implemented as Java classes. Make executes commands from the operating system's shell.

Being Java-based means a few different things for Ant. Ant largely inherits Java's platform independence. This means that Ant's build files can be easily moved from one platform to another. If you can ensure a homogeneous build, development and deployment environment for the life of your project, platform independence may not seem important. For many software projects, however, development and deployment environments may be quite different. Open source Java projects, for example, have to support a number of target platforms. An Ant build file allows such projects to have and maintain a single build description. Even in closed source projects, the development and deployment platforms may be different. It is not uncommon for the development environments to be PC-based while production may use a Sun or IBM high-end Unix server. If these environments were to use different build descriptions, there is always the possibility, even the inevitability, of these descriptions getting out of sync. You tend to encounter these problems when production deployment fails, perhaps subtly, and it is usually an unpleasant discovery. Using the same build description throughout the project reduces the opportunity for such problems to occur.

Platform independence can sometimes be limiting, so some Ant tasks give access to the facilities of the underlying operating system. That puts the choice in the hands of the build file developers. Even when this is necessary, Ant will allow you to manage the bulk of the build process in a platform independent way, augmented with platform dependent sections.

Another aspect of Ant that is Java-Based is that the primitive build steps, known as tasks in Ant, are built in Java. These tasks can be loaded at runtime, so developers may extend Ant by writing new Tasks. You will also find many of the tasks that come with Ant are written to deal with the typical structure of Java projects. For example, the Java compilation task understands the directory structures involved in the Java package concept. It is able to compile all code in a source tree in a single operation.

@ Annotations

Annotations provide data about a program that is not part of the program itself. They have no direct effect on the operation of the code they annotate.

Annotations have a number of uses, among them:

* Information for the compiler: Annotations can be used by the compiler to detect errors or suppress warnings.

* Compile-time and deployment-time processing: Software tools can process annotation information to generate code, XML files, and so forth.

* Runtime processing: Some annotations are available to be examined at runtime.

Annotations can be applied to a program's declarations of classes, fields, methods, and other program elements.

The annotation appears first, often (by convention) on its own line, and may include elements with named or unnamed values:

@Author(
name = "Pragya Rawal",
date = "3/3/2003"
)
class MyClass { }

or

@SuppressWarnings(value = "unchecked")
void myMethod() { }

If there is just one element named "value," then the name may be omitted, as in:

@SuppressWarnings("unchecked")
void myMethod() { }

Also, if an annotation has no elements, the parentheses may be omitted, as in:

@Override
void mySuperMethod() { }

Documentation
Many annotations replace what would otherwise have been comments in code.

Suppose that a software group has traditionally begun the body of every class with comments providing important information:

public class Generation3List extends Generation2List {

// Author: Pragya Rawal
// Date: 3/17/2002
// Current revision: 6
// Last modified: 4/12/2004
// By: Pragya Rawal
// Reviewers: A, B, C

// class code goes here

}

To add this same metadata with an annotation, you must first define the annotation type. The syntax for doing this is:

@interface ClassPreamble {
String author();
String date();
int currentRevision() default 1;
String lastModified() default "N/A";
String lastModifiedBy() default "N/A";
String[] reviewers(); // Note use of array
}

The annotation type definition looks somewhat like an interface definition where the keyword interface is preceded by the @ character (@ = "AT" as in Annotation Type). Annotation types are, in fact, a form of interface, which will be covered in a later lesson. For the moment, you do not need to understand interfaces.

The body of the annotation definition above contains annotation type element declarations, which look a lot like methods. Note that they may define optional default values.

Once the annotation type has been defined, you can use annotations of that type, with the values filled in, like this:

@ClassPreamble (
author = "Pragya Rawal",
date = "3/17/2002",
currentRevision = 6,
lastModified = "4/12/2004",
lastModifiedBy = "Pragya Rawal",
reviewers = {"A", "B", "C"} // Note array notation
)
public class Generation3List extends Generation2List {

// class code goes here

}

Note: To make the information in @ClassPreamble appear in Javadoc-generated documentation, you must annotate the @ClassPreamble definition itself with the @Documented annotation:

import java.lang.annotation.*; // import this to use @Documented

@Documented
@interface ClassPreamble {

// Annotation element definitions

}

Annotations Used by the Compiler
There are three annotation types that are predefined by the language specification itself: @Deprecated, @Override, and @SuppressWarnings.

@Deprecated—the @Deprecated annotation indicates that the marked element is deprecated and should no longer be used. The compiler generates a warning whenever a program uses a method, class, or field with the @Deprecated annotation. When an element is deprecated, it should also be documented using the Javadoc @deprecated tag, as shown in the following example. The use of the "@" symbol in both Javadoc comments and in annotations is not coincidental—they are related conceptually. Also, note that the Javadoc tag starts with a lowercase "d" and the annotation starts with an uppercase "D".

// Javadoc comment follows
/**
* @deprecated
* explanation of why it was deprecated
*/
@Deprecated
static void deprecatedMethod() { }

@Override: the @Override annotation informs the compiler that the element is meant to override an element declared in a superclass (overriding methods will be discussed in the lesson titled "Interfaces and Inheritance").

// mark method as a superclass method
// that has been overridden
@Override
int overriddenMethod() { }

While it's not required to use this annotation when overriding a method, it helps to prevent errors. If a method marked with @Override fails to correctly override a method in one of its superclasses, the compiler generates an error.

@SuppressWarnings—the @SuppressWarnings annotation tells the compiler to suppress specific warnings that it would otherwise generate. In the example below, a deprecated method is used and the compiler would normally generate a warning. In this case, however, the annotation causes the warning to be suppressed.

// use a deprecated method and tell
// compiler not to generate a warning
@SuppressWarnings("deprecation")
void useDeprecatedMethod() {
objectOne.deprecatedMethod(); //deprecation warning - suppressed
}

Every compiler warning belongs to a category. The Java Language Specification lists two categories: "deprecation" and "unchecked." The "unchecked" warning can occur when interfacing with legacy code written before the advent of generics (discussed in the lesson titled "Generics"). To suppress more than one category of warnings, use the following syntax:

@SuppressWarnings({"unchecked", "deprecation"})

Annotation Processing
The more advanced uses of annotations include writing an annotation processor that can read a Java program and take actions based on its annotations. It might, for example, generate auxiliary source code, relieving the programmer of having to create boilerplate code that always follows predictable patterns. To facilitate this task, release 5.0 of the JDK includes an annotation processing tool, called apt. In release 6 of the JDK, the functionality of apt is a standard part of the Java compiler.

To make annotation information available at runtime, the annotation type itself must be annotated with @Retention(RetentionPolicy.RUNTIME), as follows:

import java.lang.annotation.*;

@Retention(RetentionPolicy.RUNTIME)
@interface AnnotationForRuntime {

// Elements that give information
// for runtime processing

}
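Once an annotation type is retained at runtime, it can be read back through reflection. The sketch below is only an illustration: Preamble, AnnotatedList, UnannotatedList and RuntimeAnnotationDemo are made-up names, while Class.getAnnotation is the standard reflection call.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical annotation type, kept in the class file and visible at runtime.
@Retention(RetentionPolicy.RUNTIME)
@interface Preamble {
    String author();
    int currentRevision() default 1;
}

@Preamble(author = "Pragya Rawal", currentRevision = 6)
class AnnotatedList { }

class UnannotatedList { }

class RuntimeAnnotationDemo {
    // Reads the annotation from the class object, or reports that there is none.
    static String describe(Class<?> type) {
        Preamble preamble = type.getAnnotation(Preamble.class);
        if (preamble == null) {
            return "no preamble";
        }
        return preamble.author() + ", revision " + preamble.currentRevision();
    }

    public static void main(String[] args) {
        System.out.println(describe(AnnotatedList.class));   // Pragya Rawal, revision 6
        System.out.println(describe(UnannotatedList.class)); // no preamble
    }
}
```

Without the @Retention(RetentionPolicy.RUNTIME) line, getAnnotation would return null for both classes, because the default retention policy does not keep the annotation available to the running program.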

Question 1: What is wrong with the following interface:

public interface House {
@Deprecated
public void open();
public void openFrontDoor();
public void openBackDoor();
}

Answer 1: The documentation should reflect why open is deprecated and what to use instead. For example:

public interface House {
/**
* @deprecated use of open is discouraged, use
* openFrontDoor or openBackDoor instead.
*/
@Deprecated
public void open();
public void openFrontDoor();
public void openBackDoor();
}

Question 2: Consider this implementation of the House interface, shown in Question 1.

public class MyHouse implements House {
public void open() {}
public void openFrontDoor() {}
public void openBackDoor() {}
}

If you compile this program, the compiler complains that open has been deprecated (in the interface). What can you do to get rid of that warning?

Answer 2: You can deprecate the implementation of open:

public class MyHouse implements House {
//The documentation is inherited from the interface.
@Deprecated
public void open() {}
public void openFrontDoor() {}
public void openBackDoor() {}
}

Alternatively, you can suppress the warning:

public class MyHouse implements House {
@SuppressWarnings("deprecation")
public void open() {}
public void openFrontDoor() {}
public void openBackDoor() {}
}

http://java.sun.com/docs/books/tutorial/java/javaOO/QandE/annotations-answers.html

Overriding and Hiding Methods (From Sun Java tutorials)

Instance Methods
An instance method in a subclass with the same signature (name, plus the number and the type of its parameters) and return type as an instance method in the superclass overrides the superclass's method.

The ability of a subclass to override a method allows a class to inherit from a superclass whose behavior is "close enough" and then to modify behavior as needed. The overriding method has the same name, number and type of parameters, and return type as the method it overrides. An overriding method can also return a subtype of the type returned by the overridden method. This is called a covariant return type.
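A covariant return type is easiest to see in a small example. Shape, Circle and CovariantDemo below are made-up classes for this sketch; the point is only that Circle.copy() overrides Shape.copy() while declaring the narrower return type Circle.

```java
class Shape {
    Shape copy() {
        return new Shape();
    }

    String name() {
        return "shape";
    }
}

class Circle extends Shape {
    // Still an override: same name and parameters, return type is a subtype of Shape.
    @Override
    Circle copy() {
        return new Circle();
    }

    @Override
    String name() {
        return "circle";
    }
}

class CovariantDemo {
    public static void main(String[] args) {
        Circle circle = new Circle().copy();   // no cast needed
        Shape shape = circle;
        System.out.println(shape.copy().name()); // circle
    }
}
```

Callers that hold a Circle get a Circle back without casting, while callers that only know about Shape keep working unchanged.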

When overriding a method, you might want to use the @Override annotation that instructs the compiler that you intend to override a method in the superclass. If, for some reason, the compiler detects that the method does not exist in one of the superclasses, it will generate an error. For more information on @Override, see Annotations.

Class Methods
If a subclass defines a class method with the same signature as a class method in the superclass, the method in the subclass hides the one in the superclass.

The distinction between hiding and overriding has important implications. The version of the overridden method that gets invoked is the one in the subclass. The version of the hidden method that gets invoked depends on whether it is invoked from the superclass or the subclass. Let's look at an example that contains two classes. The first is Animal, which contains one instance method and one class method:

public class Animal {
public static void testClassMethod() {
System.out.println("The class method in Animal.");
}
public void testInstanceMethod() {
System.out.println("The instance method in Animal.");
}
}

The second class, a subclass of Animal, is called Cat:

public class Cat extends Animal {
public static void testClassMethod() {
System.out.println("The class method in Cat.");
}
public void testInstanceMethod() {
System.out.println("The instance method in Cat.");
}

public static void main(String[] args) {
Cat myCat = new Cat();
Animal myAnimal = myCat;
Animal.testClassMethod();
myAnimal.testInstanceMethod();
}
}

The Cat class overrides the instance method in Animal and hides the class method in Animal. The main method in this class creates an instance of Cat and calls testClassMethod() on the class and testInstanceMethod() on the instance.

The output from this program is as follows:

The class method in Animal.
The instance method in Cat.

As promised, the version of the hidden method that gets invoked is the one in the superclass, and the version of the overridden method that gets invoked is the one in the subclass.

Modifiers
The access specifier for an overriding method can allow more, but not less, access than the overridden method. For example, a protected instance method in the superclass can be made public, but not private, in the subclass.

You will get a compile-time error if you attempt to change an instance method in the superclass to a class method in the subclass, and vice versa.

http://java.sun.com/docs/books/tutorial/java/IandI/override.html

November 12, 2009

Struts :)

1.What is MVC?

Model-View-Controller (MVC) is a design pattern put together to help control change. MVC decouples interface from business logic and data.

* Model : The model contains the core of the application's functionality. The model encapsulates the state of the application. Sometimes the only functionality it contains is state. It knows nothing about the view or controller.

* View: The view provides the presentation of the model. It is the look of the application. The view can access the model getters, but it has no knowledge of the setters. In addition, it knows nothing about the controller. The view should be notified when changes to the model occur.

* Controller: The controller reacts to the user input. It creates and sets the model.
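The three roles can be sketched in a few lines of plain Java. This is not Struts code; CounterModel, CounterView and CounterController are made-up names, and the view is reduced to building a string so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Model: holds the state and notifies listeners; knows nothing about view or controller.
class CounterModel {
    interface Listener {
        void changed(CounterModel model);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private int count;

    int getCount() {
        return count;
    }

    void setCount(int count) {
        this.count = count;
        for (Listener listener : listeners) {
            listener.changed(this);
        }
    }

    void addListener(Listener listener) {
        listeners.add(listener);
    }
}

// View: uses only the model's getters and is notified when the model changes.
class CounterView implements CounterModel.Listener {
    String lastRendered = "";

    @Override
    public void changed(CounterModel model) {
        lastRendered = "Count: " + model.getCount();
        System.out.println(lastRendered);
    }
}

// Controller: reacts to user input and sets the model.
class CounterController {
    private final CounterModel model;

    CounterController(CounterModel model) {
        this.model = model;
    }

    void handle(String input) {
        if ("increment".equals(input)) {
            model.setCount(model.getCount() + 1);
        }
    }
}

class MvcDemo {
    public static void main(String[] args) {
        CounterModel model = new CounterModel();
        model.addListener(new CounterView());
        CounterController controller = new CounterController(model);
        controller.handle("increment"); // prints Count: 1
        controller.handle("increment"); // prints Count: 2
    }
}
```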


2.What is a framework?

A framework is a set of cooperating classes that provides a reusable skeleton for a particular kind of application, so that the underlying libraries are used in the best possible way for a specific requirement.

3.What is Struts framework?

The Struts framework is an open-source framework for developing web applications in Java EE, based on the MVC-2 architecture. It uses and extends the Java Servlet API. Struts is a robust architecture and can be used to develop applications of any size. The Struts framework makes it much easier to design scalable, reliable web applications with Java.

4.What are the components of Struts?

Struts components can be categorized into Model, View and Controller:

* Model: Components like business logic /business processes and data are the part of model.
* View: HTML, JSP are the view components.
* Controller: Action Servlet of Struts is part of Controller components which works as front controller to handle all the requests.

5.What are the core classes of the Struts Framework?

Struts is a set of cooperating classes, servlets, and JSP tags that make up a reusable MVC 2 design. Its core classes are:

* ActionServlet
* Action
* ActionForm
* ActionMapping
* ActionForward

6.What is ActionServlet?

ActionServlet is a simple servlet which is the backbone of all Struts applications. It is the main Controller component that handles client requests and determines which Action will process each received request. It serves as an Action factory – creating specific Action classes based on user’s request.

7.What is role of ActionServlet?

ActionServlet performs the role of Controller:

* Process user requests
* Determine what the user is trying to achieve according to the request
* Pull data from the model (if necessary) to be given to the appropriate view,
* Select the proper view to respond to the user
* Delegates most of this grunt work to Action classes
* Is responsible for initialization and clean-up of resources


8.What is the ActionForm?

ActionForm is a JavaBean that represents the form inputs, containing the request parameters from the View that references the Action bean.

9.What are the important methods of ActionForm?

The important methods of ActionForm are : validate() & reset().

10.Describe validate() and reset() methods ?

validate() : Used to validate properties after they have been populated; Called before FormBean is handed to Action. Returns a collection of ActionError as ActionErrors. Following is the method signature for the validate() method.

public ActionErrors validate(ActionMapping mapping,HttpServletRequest request)


reset(): reset() method is called by Struts Framework with each request that uses the defined ActionForm. The purpose of this method is to reset all of the ActionForm's data members prior to the new request values being set.

public void reset(ActionMapping mapping, HttpServletRequest request) {}


11.What is ActionMapping?

An ActionMapping contains all the deployment information for a particular Action bean. This class is used to determine where the results of the Action will be sent once its processing is complete.

12.How is the Action Mapping specified ?

We can specify the action mapping in the configuration file called struts-config.xml. The Struts framework creates an ActionMapping object from each <action> configuration element of the struts-config.xml file:

<action
type="submit.SubmitAction"
name="submitForm"
input="/submit.jsp"
scope="request"
validate="true">
</action>

13.What is role of Action Class?

An Action Class performs a role of an adapter between the contents of an incoming HTTP request and the corresponding business logic that should be executed to process this request.

14.In which method of Action class the business logic is executed ?

In the execute() method of Action class the business logic is executed.

public ActionForward execute(
ActionMapping mapping,
ActionForm form,
HttpServletRequest request,
HttpServletResponse response)
throws Exception ;


execute() method of Action class:

* Perform the processing required to deal with this request
* Update the server-side objects (Scope variables) that will be used to create the next page of the user interface
* Return an appropriate ActionForward object


Struts is based on the Model 2 MVC (Model-View-Controller) architecture. The Struts controller uses the command design pattern and the action classes use the adapter design pattern. The process() method of the RequestProcessor uses the template method design pattern. Struts also implements the following J2EE design patterns:

* Service to Worker
* Dispatcher View
* Composite View (Struts Tiles)
* Front Controller
* View Helper
* Synchronizer Token

November 11, 2009

Serialization and serialVersionUID

public interface Serializable

Serializability of a class is enabled by the class implementing the java.io.Serializable interface. Classes that do not implement this interface will not have any of their state serialized or deserialized. All subtypes of a serializable class are themselves serializable. The serialization interface has no methods or fields and serves only to identify the semantics of being serializable.

To allow subtypes of non-serializable classes to be serialized, the subtype may assume responsibility for saving and restoring the state of the supertype's public, protected, and (if accessible) package fields. The subtype may assume this responsibility only if the class it extends has an accessible no-arg constructor to initialize the class's state. It is an error to declare a class Serializable if this is not the case. The error will be detected at runtime.

During deserialization, the fields of non-serializable classes will be initialized using the public or protected no-arg constructor of the class. A no-arg constructor must be accessible to the subclass that is serializable. The fields of serializable subclasses will be restored from the stream.

When traversing a graph, an object may be encountered that does not support the Serializable interface. In this case the NotSerializableException will be thrown and will identify the class of the non-serializable object.

Classes that require special handling during the serialization and deserialization process must implement special methods with these exact signatures:

private void writeObject(java.io.ObjectOutputStream out)
throws IOException;
private void readObject(java.io.ObjectInputStream in)
throws IOException, ClassNotFoundException;


The writeObject method is responsible for writing the state of the object for its particular class so that the corresponding readObject method can restore it. The default mechanism for saving the Object's fields can be invoked by calling out.defaultWriteObject. The method does not need to concern itself with the state belonging to its superclasses or subclasses. State is saved by writing the individual fields to the ObjectOutputStream using the writeObject method or by using the methods for primitive data types supported by DataOutput.

The readObject method is responsible for reading from the stream and restoring the class's fields. It may call in.defaultReadObject to invoke the default mechanism for restoring the object's non-static and non-transient fields. The defaultReadObject method uses information in the stream to assign the fields of the object saved in the stream with the correspondingly named fields in the current object. This handles the case when the class has evolved to add new fields. The method does not need to concern itself with the state belonging to its superclasses or subclasses. State is restored by reading the individual fields from the ObjectInputStream using the readObject method or by using the methods for primitive data types supported by DataInput.
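A minimal sketch of such a pair of methods follows. The class Reading, its fields, and the roundTrip helper are hypothetical. It assumes one field that the default mechanism can handle and one transient array that is written and read by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, for illustration only.
final class CustomFormDemo {

    static class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String sensor;        // handled by the default mechanism
        private transient int[] samples;    // transient, so written by hand below

        Reading(String sensor, int[] samples) {
            this.sensor = sensor;
            this.samples = samples.clone();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();       // non-static, non-transient fields
            out.writeInt(samples.length);   // DataOutput method for a primitive
            for (int s : samples) {
                out.writeInt(s);
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // restores sensor
            samples = new int[in.readInt()]; // read back in the order it was written
            for (int i = 0; i < samples.length; i++) {
                samples[i] = in.readInt();
            }
        }

        String sensor() { return sensor; }
        int[] samples() { return samples.clone(); }
    }

    /** Serializes and deserializes a Reading entirely in memory. */
    static Reading roundTrip(Reading r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Reading) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Reading copy = roundTrip(new Reading("t1", new int[] {3, 1, 4}));
        System.out.println(copy.sensor() + " " + copy.samples().length); // prints "t1 3"
    }
}
```

The two methods must agree on the order and types of the extra data, since the stream carries no field names for values written by hand.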

Serializable classes that need to designate an alternative object to be used when writing an object to the stream should implement this special method with the exact signature:

ANY-ACCESS-MODIFIER Object writeReplace() throws ObjectStreamException;


This writeReplace method is invoked by serialization if the method exists and it would be accessible from a method defined within the class of the object being serialized. Thus, the method can have private, protected and package-private access. Subclass access to this method follows java accessibility rules.

Classes that need to designate a replacement when an instance of the class is read from the stream should implement this special method with the exact signature:

ANY-ACCESS-MODIFIER Object readResolve() throws ObjectStreamException;


This readResolve method follows the same invocation rules and accessibility rules as writeReplace.
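One common use of readResolve is keeping a singleton unique across deserialization. The sketch below is one possible arrangement, not the only one. The class Registry and the roundTrip helper are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical class, for illustration only.
final class ResolveDemo {

    static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Registry INSTANCE = new Registry();

        private Registry() { }

        // Invoked by the serialization runtime after the object has been read.
        // The freshly created copy is discarded in favor of the singleton.
        private Object readResolve() throws ObjectStreamException {
            return INSTANCE;
        }
    }

    /** Serializes and deserializes any object entirely in memory. */
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Registry.INSTANCE) == Registry.INSTANCE); // prints "true"
    }
}
```

Without readResolve, the round trip would return a second Registry object and the comparison would print false.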

The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException. A serializable class can declare its own serialVersionUID explicitly by declaring a field named "serialVersionUID" that must be static, final, and of type long:

ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;


If a serializable class does not explicitly declare a serialVersionUID, then the serialization runtime will calculate a default serialVersionUID value for that class based on various aspects of the class, as described in the Java(TM) Object Serialization Specification. However, it is strongly recommended that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations, and can thus result in unexpected InvalidClassExceptions during deserialization. Therefore, to guarantee a consistent serialVersionUID value across different java compiler implementations, a serializable class must declare an explicit serialVersionUID value. It is also strongly advised that explicit serialVersionUID declarations use the private modifier where possible, since such declarations apply only to the immediately declaring class--serialVersionUID fields are not useful as inherited members.
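To see which value the runtime will use for a class, java.io.ObjectStreamClass can be queried. In the sketch below the classes Explicit and Implicit and the helper uidOf are hypothetical. The value computed for Implicit is deliberately not shown, since it depends on class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes, for illustration only.
final class VersionDemo {

    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;   // declared, so stable
        int x;
    }

    static class Implicit implements Serializable {
        int x;                                               // no declaration: value is computed
    }

    /** Returns the serialVersionUID the serialization runtime associates with a serializable class. */
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Explicit.class));   // prints "42"
        System.out.println(uidOf(Implicit.class));   // a computed hash; varies with class details
    }
}
```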

November 10, 2009

Stored Procedure in DataBase

A stored procedure is a subroutine available to applications accessing a relational database system. Stored procedures (sometimes called a proc, sproc, StoPro, or SP) are actually stored in the database data dictionary.

Typical uses for stored procedures include data validation (integrated into the database) or access control mechanisms. Furthermore, stored procedures are used to consolidate and centralize logic that was originally implemented in applications. Large or complex processing that might require the execution of several SQL statements is moved into stored procedures, and all applications call the procedures only.

Stored procedures are similar to user-defined functions (UDFs). The major difference is that UDFs can be used like any other expression within SQL statements, whereas stored procedures must be invoked explicitly, using a statement such as

CALL procedure(…)

or

EXECUTE procedure(…)

Triggers in DataBase

A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. Triggers are mostly used to maintain the integrity of the information in the database. For example, when a new record (representing a new worker) is added to the employees table, new records should also be created in the taxes, vacations, and salaries tables.

The need and the usage

Triggers are commonly used to:

* prevent changes (e.g. prevent an invoice from being changed after it's been mailed out)
* log changes (e.g. keep a copy of the old data)
* audit changes (e.g. keep a log of the users and roles involved in changes)
* enhance changes (e.g. ensure that every change to a record is time-stamped by the server's clock, not the client's)
* enforce business rules (e.g. require that every invoice have at least one line item)
* execute business rules (e.g. notify a manager every time an employee's bank account number changes)
* replicate data (e.g. store a record of every change, to be shipped to another database later)
* enhance performance (e.g. update the account balance after every detail transaction, for faster queries)

Some systems also support non-data triggers, which fire in response to Data Definition Language (DDL) events such as creating tables, or runtime events such as logon, commit, and rollback, and may also be used for auditing purposes.

The major features of database triggers, and their effects, are:

* do not accept parameters or arguments (but may store affected-data in temporary tables)
* cannot perform commit or rollback operations, because they are part of the triggering SQL statement (except through autonomous transactions)
* can cancel a requested operation
* can cause mutating table errors, if they are poorly written.

DML Triggers

There are typically three triggering events that cause data triggers to 'fire':

* INSERT event (as a new record is being inserted into the database).
* UPDATE event (as a record is being changed).
* DELETE event (as a record is being deleted).

Structurally, triggers are either "row triggers" or "statement triggers". Row triggers define an action for every row of a table, while statement triggers occur only once per INSERT, UPDATE, or DELETE statement. DML triggers cannot be used to audit data retrieval via SELECT statements, because SELECT is not a triggering event.

Furthermore, there are "BEFORE triggers" and "AFTER triggers", which run in addition to any changes already being made to the database, and "INSTEAD OF triggers", which fully replace the database's normal activity.

Triggers do not accept parameters, but they do receive information in the form of implicit variables. For row-level triggers, these are generally OLD and NEW variables, each of which has fields corresponding to the columns of the affected table or view; for statement-level triggers, something like SQL Server's Inserted and Deleted tables may be provided so the trigger can see all the changes being made.

For data triggers, the general order of operations will be as follows:

1. a statement requests changes on a row: OLD represents the row as it was before the change (or is all-null for inserted rows), NEW represents the row after the changes (or is all-null for deleted rows)
2. each statement-level BEFORE trigger is fired
3. each row-level BEFORE trigger fires, and can modify NEW (but not OLD); each trigger can see NEW as modified by its predecessor, they are chained together
4. if an INSTEAD OF trigger is defined, it is run using OLD and NEW as available at this point
5. if no INSTEAD OF trigger is defined, the database modifies the row according to its normal logic; for updatable views, this may involve modifying one or more other tables to achieve the desired effect; if a view is not updatable, and no INSTEAD OF trigger is provided, an error is raised
6. each row-level AFTER trigger fires, and is given NEW and OLD, but its changes to NEW are either disallowed or disregarded
7. each statement-level AFTER trigger is fired
8. implied triggers are fired, such as referential actions in support of foreign key constraints: on-update or on-delete CASCADE, SET NULL, and SET DEFAULT rules

In ACID databases, an exception raised in a trigger will cause the entire stack of operations to be rolled back, including the original statement.

TOAD

What is Toad for Oracle?

Toad is an industry-standard tool for application development. Using Toad, developers can build, test, and debug PL/SQL packages, procedures, triggers, and functions. Toad users can create and edit database objects such as tables, views, indexes, constraints, and users. Toad's SQL Editor provides an easy and efficient way to write and test scripts and queries, and its powerful data grids provide an easy way to view and edit Oracle data. Use Toad to:

* Create, browse, or alter objects (tables, views, indexes, etc.) including Oracle8 TYPE objects
* Graphically build, execute, and tune queries
* Edit and debug PL/SQL and profile "stored procedures" including functions, packages, and triggers
* Search for objects
* Find and fix database problems with constraints, triggers, extents, indexes, and grants

Toad utilizes direct Oracle OCI calls for full access to the Oracle API.

http://asktoad.com/DWiki/doku.php/faq/answers/getting_started?DokuWiki=4753bcaa735abe33060cd751e786ba16#what_is_toad_for_oracle

All About JVM

What is a Java Virtual Machine?
To understand the Java virtual machine you must first be aware that you may be talking about any of three different things when you say "Java virtual machine." You may be speaking of:
• the abstract specification,
• a concrete implementation, or
• a runtime instance.
The abstract specification is a concept, described in detail in the book: The Java Virtual Machine Specification, by Tim Lindholm and Frank Yellin. Concrete implementations, which exist on many platforms and come from many vendors, are either all software or a combination of hardware and software. A runtime instance hosts a single running Java application.
Each Java application runs inside a runtime instance of some concrete implementation of the abstract specification of the Java virtual machine. In this book, the term "Java virtual machine" is used in all three of these senses. Where the intended sense is not clear from the context, one of the terms "specification," "implementation," or "instance" is added to the term "Java virtual machine".
The Lifetime of a Java Virtual Machine
A runtime instance of the Java virtual machine has a clear mission in life: to run one Java application. When a Java application starts, a runtime instance is born. When the application completes, the instance dies. If you start three Java applications at the same time, on the same computer, using the same concrete implementation, you'll get three Java virtual machine instances. Each Java application runs inside its own Java virtual machine.
A Java virtual machine instance starts running its solitary application by invoking the main() method of some initial class. The main() method must be public, static, return void, and accept one parameter: a String array. Any class with such a main() method can be used as the starting point for a Java application.
For example, consider an application that prints out its command line arguments:
// On CD-ROM in file jvm/ex1/Echo.java
class Echo {

    public static void main(String[] args) {
        int len = args.length;
        for (int i = 0; i < len; ++i) {
            System.out.print(args[i] + " ");
        }
        System.out.println();
    }
}
You must, in some implementation-dependent way, give a Java virtual machine the name of the initial class that has the main() method that will start the entire application. One real world example of a Java virtual machine implementation is the java program from Sun's Java 2 SDK. If you wanted to run the Echo application using Sun's java on Windows 98, for example, you would type in a command such as:
java Echo Greetings, Planet.
The first word in the command, "java," indicates that the Java virtual machine from Sun's Java 2 SDK should be run by the operating system. The second word, "Echo," is the name of the initial class. Echo must have a public static method named main() that returns void and takes a String array as its only parameter. The subsequent words, "Greetings, Planet.," are the command line arguments for the application. These are passed to the main() method in the String array in the order in which they appear on the command line. So, for the previous example, the contents of the String array passed to main in Echo are: args[0] is "Greetings," and args[1] is "Planet."
The main() method of an application's initial class serves as the starting point for that application's initial thread. The initial thread can in turn fire off other threads.
Inside the Java virtual machine, threads come in two flavors: daemon and non-daemon. A daemon thread is ordinarily a thread used by the virtual machine itself, such as a thread that performs garbage collection. The application, however, can mark any threads it creates as daemon threads. The initial thread of an application--the one that begins at main()--is a non-daemon thread.
A Java application continues to execute (the virtual machine instance continues to live) as long as any non-daemon threads are still running. When all non-daemon threads of a Java application terminate, the virtual machine instance will exit. If permitted by the security manager, the application can also cause its own demise by invoking the exit() method of class Runtime or System.
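The daemon flag can be observed and set through java.lang.Thread. The following is a small sketch with hypothetical names (DaemonDemo, newWorker). The worker simply sleeps so that it outlives main().

```java
// Hypothetical class, for illustration only.
final class DaemonDemo {

    /** Creates (but does not start) a sleeping worker thread with the requested daemon status. */
    static Thread newWorker(boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: just let the thread end
            }
        }, "worker");
        t.setDaemon(daemon);   // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        Thread background = newWorker(true);
        background.start();
        System.out.println("main is daemon: " + Thread.currentThread().isDaemon());
        System.out.println("worker is daemon: " + background.isDaemon());
        // main() returns here. The worker is still sleeping, but because it is a
        // daemon thread it does not keep the virtual machine instance alive.
    }
}
```

Changing newWorker(true) to newWorker(false) in main() would keep the virtual machine instance running until the worker's sleep ends.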
In the Echo application above, the main() method doesn't start any other threads. After it prints out the command line arguments, main() returns. This terminates the application's only non-daemon thread, which causes the virtual machine instance to exit.
The Architecture of the Java Virtual Machine
In the Java virtual machine specification, the behavior of a virtual machine instance is described in terms of subsystems, memory areas, data types, and instructions. These components describe an abstract inner architecture for the abstract Java virtual machine. The purpose of these components is not so much to dictate an inner architecture for implementations. It is more to provide a way to strictly define the external behavior of implementations. The specification defines the required behavior of any Java virtual machine implementation in terms of these abstract components and their interactions.
Figure 5-1 shows a block diagram of the Java virtual machine that includes the major subsystems and memory areas described in the specification. As mentioned in previous chapters, each Java virtual machine has a class loader subsystem: a mechanism for loading types (classes and interfaces) given fully qualified names. Each Java virtual machine also has an execution engine: a mechanism responsible for executing the instructions contained in the methods of loaded classes.


Figure 5-1. The internal architecture of the Java virtual machine.
When a Java virtual machine runs a program, it needs memory to store many things, including bytecodes and other information it extracts from loaded class files, objects the program instantiates, parameters to methods, return values, local variables, and intermediate results of computations. The Java virtual machine organizes the memory it needs to execute a program into several runtime data areas.
Although the same runtime data areas exist in some form in every Java virtual machine implementation, their specification is quite abstract. Many decisions about the structural details of the runtime data areas are left to the designers of individual implementations.
Different implementations of the virtual machine can have very different memory constraints. Some implementations may have a lot of memory in which to work, others may have very little. Some implementations may be able to take advantage of virtual memory, others may not. The abstract nature of the specification of the runtime data areas helps make it easier to implement the Java virtual machine on a wide variety of computers and devices.
Some runtime data areas are shared among all of an application's threads and others are unique to individual threads. Each instance of the Java virtual machine has one method area and one heap. These areas are shared by all threads running inside the virtual machine. When the virtual machine loads a class file, it parses information about a type from the binary data contained in the class file. It places this type information into the method area. As the program runs, the virtual machine places all objects the program instantiates onto the heap. See Figure 5-2 for a graphical depiction of these memory areas.


Figure 5-2. Runtime data areas shared among all threads.
As each new thread comes into existence, it gets its own pc register (program counter) and Java stack. If the thread is executing a Java method (not a native method), the value of the pc register indicates the next instruction to execute. A thread's Java stack stores the state of Java (not native) method invocations for the thread. The state of a Java method invocation includes its local variables, the parameters with which it was invoked, its return value (if any), and intermediate calculations. The state of native method invocations is stored in an implementation-dependent way in native method stacks, as well as possibly in registers or other implementation-dependent memory areas.
The Java stack is composed of stack frames (or frames). A stack frame contains the state of one Java method invocation. When a thread invokes a method, the Java virtual machine pushes a new frame onto that thread's Java stack. When the method completes, the virtual machine pops and discards the frame for that method.
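The push-and-pop behavior can be glimpsed from Java code by asking a thread for its stack trace. This is only a sketch with hypothetical names (FrameDemo, depthNow, depthAfter). Note that the API documentation allows a virtual machine to omit frames from a stack trace in some circumstances, so the exact count is an observation about common implementations rather than a guarantee.

```java
// Hypothetical class, for illustration only.
final class FrameDemo {

    /** Number of frames the runtime reports for the current thread right now. */
    static int depthNow() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Makes the given number of extra nested calls, then measures the stack depth. */
    static int depthAfter(int calls) {
        if (calls == 0) {
            return depthNow();
        }
        return depthAfter(calls - 1);   // each invocation pushes one more frame
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int nested = depthAfter(5);
        // The five extra invocations show up as five extra frames while they are
        // active, and are gone again once they have returned.
        System.out.println(nested - base);
        System.out.println(depthAfter(0) == base);
    }
}
```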
The Java virtual machine has no registers to hold intermediate data values. The instruction set uses the Java stack for storage of intermediate data values. This approach was taken by Java's designers to keep the Java virtual machine's instruction set compact and to facilitate implementation on architectures with few or irregular general purpose registers. In addition, the stack-based architecture of the Java virtual machine's instruction set facilitates the code optimization work done by just-in-time and dynamic compilers that operate at run-time in some virtual machine implementations.
See Figure 5-3 for a graphical depiction of the memory areas the Java virtual machine creates for each thread. These areas are private to the owning thread. No thread can access the pc register or Java stack of another thread.


Figure 5-3. Runtime data areas exclusive to each thread.
Figure 5-3 shows a snapshot of a virtual machine instance in which three threads are executing. At the instant of the snapshot, threads one and two are executing Java methods. Thread three is executing a native method.
In Figure 5-3, as in all graphical depictions of the Java stack in this book, the stacks are shown growing downwards. The "top" of each stack is shown at the bottom of the figure. Stack frames for currently executing methods are shown in a lighter shade. For threads that are currently executing a Java method, the pc register indicates the next instruction to execute. In Figure 5-3, such pc registers (the ones for threads one and two) are shown in a lighter shade. Because thread three is currently executing a native method, the contents of its pc register--the one shown in dark gray--is undefined.
Data Types
The Java virtual machine computes by performing operations on certain types of data. Both the data types and operations are strictly defined by the Java virtual machine specification. The data types can be divided into a set of primitive types and a reference type. Variables of the primitive types hold primitive values, and variables of the reference type hold reference values. Reference values refer to objects, but are not objects themselves. Primitive values, by contrast, do not refer to anything. They are the actual data themselves. You can see a graphical depiction of the Java virtual machine's families of data types in Figure 5-4.


Figure 5-4. Data types of the Java virtual machine.
All the primitive types of the Java programming language are primitive types of the Java virtual machine. Although boolean qualifies as a primitive type of the Java virtual machine, the instruction set has very limited support for it. When a compiler translates Java source code into bytecodes, it uses ints or bytes to represent booleans. In the Java virtual machine, false is represented by integer zero and true by any non-zero integer. Operations involving boolean values use ints. Arrays of boolean are accessed as arrays of byte, though they may be represented on the heap as arrays of byte or as bit fields.
The primitive types of the Java programming language other than boolean form the numeric types of the Java virtual machine. The numeric types are divided between the integral types: byte, short, int, long, and char, and the floating-point types: float and double. As with the Java programming language, the primitive types of the Java virtual machine have the same range everywhere. A long in the Java virtual machine always acts like a 64-bit signed two's complement number, independent of the underlying host platform.
The Java virtual machine works with one other primitive type that is unavailable to the Java programmer: the returnAddress type. This primitive type is used to implement finally clauses of Java programs. The use of the returnAddress type is described in detail in Chapter 18, "Finally Clauses."
The reference type of the Java virtual machine is cleverly named reference. Values of type reference come in three flavors: the class type, the interface type, and the array type. All three types have values that are references to dynamically created objects. The class type's values are references to class instances. The array type's values are references to arrays, which are full-fledged objects in the Java virtual machine. The interface type's values are references to class instances that implement an interface. One other reference value is the null value, which indicates the reference variable doesn't refer to any object.
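The difference between holding a value and holding a reference is visible at the language level. The sketch below uses hypothetical names and is not from the book. It copies an int and an array reference and then changes the copy.

```java
// Hypothetical class, for illustration only.
final class ValueKindsDemo {

    /** Copies a primitive value, changes the copy, and returns the original. */
    static int primitiveAfterCopy() {
        int a = 1;
        int b = a;      // b receives its own copy of the value
        b = 2;          // changing b leaves a untouched
        return a;
    }

    /** Copies a reference, writes through the copy, and reads through the original. */
    static int arrayAfterAlias() {
        int[] x = {1};
        int[] y = x;    // y refers to the same array object as x
        y[0] = 2;       // the single shared object is modified
        return x[0];
    }

    public static void main(String[] args) {
        System.out.println(primitiveAfterCopy() + " " + arrayAfterAlias()); // prints "1 2"
        int[] none = null;                 // a reference that refers to no object
        System.out.println(none == null);  // prints "true"
    }
}
```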
The Java virtual machine specification defines the range of values for each of the data types, but does not define their sizes. The number of bits used to store each data type value is a decision of the designers of individual implementations. The ranges of the Java virtual machine's data types are shown in Table 5-1. More information on the floating point ranges is given in Chapter 14, "Floating Point Arithmetic."
Type            Range
byte            8-bit signed two's complement integer (-2^7 to 2^7 - 1, inclusive)
short           16-bit signed two's complement integer (-2^15 to 2^15 - 1, inclusive)
int             32-bit signed two's complement integer (-2^31 to 2^31 - 1, inclusive)
long            64-bit signed two's complement integer (-2^63 to 2^63 - 1, inclusive)
char            16-bit unsigned Unicode character (0 to 2^16 - 1, inclusive)
float           32-bit IEEE 754 single-precision float
double          64-bit IEEE 754 double-precision float
returnAddress   address of an opcode within the same method
reference       reference to an object on the heap, or null
Table 5-1. Ranges of the Java virtual machine's data types
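The integral ranges in the table can be checked against the constants in the Java API. The sketch below uses hypothetical names (RangeDemo, isSignedRange). It uses BigInteger so that 2^63 can be represented without overflow.

```java
import java.math.BigInteger;

// Hypothetical class, for illustration only.
final class RangeDemo {

    /** True if [min, max] is exactly -2^(bits-1) .. 2^(bits-1) - 1. */
    static boolean isSignedRange(int bits, long min, long max) {
        BigInteger half = BigInteger.ONE.shiftLeft(bits - 1);   // 2^(bits-1)
        return half.negate().equals(BigInteger.valueOf(min))
            && half.subtract(BigInteger.ONE).equals(BigInteger.valueOf(max));
    }

    public static void main(String[] args) {
        System.out.println(isSignedRange(8, Byte.MIN_VALUE, Byte.MAX_VALUE));        // prints "true"
        System.out.println(isSignedRange(16, Short.MIN_VALUE, Short.MAX_VALUE));     // prints "true"
        System.out.println(isSignedRange(32, Integer.MIN_VALUE, Integer.MAX_VALUE)); // prints "true"
        System.out.println(isSignedRange(64, Long.MIN_VALUE, Long.MAX_VALUE));       // prints "true"
        // char is the one unsigned integral type: 0 to 2^16 - 1
        System.out.println((int) Character.MIN_VALUE);                               // prints "0"
        System.out.println((int) Character.MAX_VALUE == (1 << 16) - 1);              // prints "true"
    }
}
```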
Word Size
The basic unit of size for data values in the Java virtual machine is the word--a fixed size chosen by the designer of each Java virtual machine implementation. The word size must be large enough to hold a value of type byte, short, int, char, float, returnAddress, or reference. Two words must be large enough to hold a value of type long or double. An implementation designer must therefore choose a word size that is at least 32 bits, but otherwise can pick whatever word size will yield the most efficient implementation. The word size is often chosen to be the size of a native pointer on the host platform.
The specifications of many of the Java virtual machine's runtime data areas are based upon this abstract concept of a word. For example, two sections of a Java stack frame--the local variables and operand stack--are defined in terms of words. These areas can contain values of any of the virtual machine's data types. When placed into the local variables or operand stack, a value occupies either one or two words.
As they run, Java programs cannot determine the word size of their host virtual machine implementation. The word size does not affect the behavior of a program. It is only an internal attribute of a virtual machine implementation.
The Class Loader Subsystem
The part of a Java virtual machine implementation that takes care of finding and loading types is the class loader subsystem. Chapter 1, "Introduction to Java's Architecture," gives an overview of this subsystem. Chapter 3, "Security," shows how the subsystem fits into Java's security model. This chapter describes the class loader subsystem in more detail and shows how it relates to the other components of the virtual machine's internal architecture.
As mentioned in Chapter 1, the Java virtual machine contains two kinds of class loaders: a bootstrap class loader and user-defined class loaders. The bootstrap class loader is a part of the virtual machine implementation, and user-defined class loaders are part of the running Java application. Classes loaded by different class loaders are placed into separate name spaces inside the Java virtual machine.
The class loader subsystem involves many other parts of the Java virtual machine and several classes from the java.lang library. For example, user-defined class loaders are regular Java objects whose class descends from java.lang.ClassLoader. The methods of class ClassLoader allow Java applications to access the virtual machine's class loading machinery. Also, for every type a Java virtual machine loads, it creates an instance of class java.lang.Class to represent that type. Like all objects, user-defined class loaders and instances of class Class reside on the heap. Data for loaded types resides in the method area.
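Both the Class instance for a type and the loader that defined it are reachable from ordinary Java code. The sketch below uses hypothetical names. It assumes an implementation that reports the bootstrap class loader as null, as Sun's and Oracle's virtual machines do; the API permits this but does not require it.

```java
// Hypothetical class, for illustration only.
final class LoaderDemo {

    /** True if the type's defining loader is reported as null (the bootstrap loader). */
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        Class<?> c = "hello".getClass();                 // the Class instance for the object's type
        System.out.println(c.getName());                 // prints "java.lang.String"
        System.out.println(c == String.class);           // prints "true": one Class object per loaded type
        System.out.println(loadedByBootstrap(String.class));
        System.out.println(LoaderDemo.class.getClassLoader()); // the loader for application classes
    }
}
```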
Loading, Linking and Initialization
The class loader subsystem is responsible for more than just locating and importing the binary data for classes. It must also verify the correctness of imported classes, allocate and initialize memory for class variables, and assist in the resolution of symbolic references. These activities are performed in a strict order:
1. Loading: finding and importing the binary data for a type
2. Linking: performing verification, preparation, and (optionally) resolution
a. Verification: ensuring the correctness of the imported type
b. Preparation: allocating memory for class variables and initializing the memory to default values
c. Resolution: transforming symbolic references from the type into direct references.
3. Initialization: invoking Java code that initializes class variables to their proper starting values.
The details of these processes are given in Chapter 7, "The Lifetime of a Type."
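That loading and initialization are separate steps can be observed with Class.forName, which takes a flag saying whether to initialize the type. The sketch below uses hypothetical names (InitOrderDemo, Lazy, events). It records events in a list, and caches the result because a type is initialized only once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, for illustration only.
final class InitOrderDemo {

    private static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static int value;
        static {                        // runs during initialization, not during loading
            EVENTS.add("Lazy initialized");
            value = 5;
        }
    }

    /** Loads Lazy without initializing it, then triggers initialization; returns the event log. */
    static synchronized List<String> events() {
        if (EVENTS.isEmpty()) {
            try {
                // initialize = false: load the type but do not run its static initializer
                Class.forName(InitOrderDemo.class.getName() + "$Lazy", false,
                              InitOrderDemo.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            EVENTS.add("loaded");
            int v = Lazy.value;         // first use of a static field triggers initialization
            EVENTS.add("read value " + v);
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(events()); // prints "[loaded, Lazy initialized, read value 5]"
    }
}
```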
The Bootstrap Class Loader
Java virtual machine implementations must be able to recognize and load classes and interfaces stored in binary files that conform to the Java class file format. An implementation is free to recognize other binary forms besides class files, but it must recognize class files.
Every Java virtual machine implementation has a bootstrap class loader, which knows how to load trusted classes, including the classes of the Java API. The Java virtual machine specification doesn't define how the bootstrap loader should locate classes. That is another decision the specification leaves to implementation designers.
Given a fully qualified type name, the bootstrap class loader must in some way attempt to produce the data that defines the type. One common approach is demonstrated by the Java virtual machine implementation in Sun's 1.1 JDK on Windows 98. This implementation searches a user-defined directory path stored in an environment variable named CLASSPATH. The bootstrap loader looks in each directory, in the order the directories appear in the CLASSPATH, until it finds a file with the appropriate name: the type's simple name plus ".class". Unless the type is part of the unnamed package, the bootstrap loader expects the file to be in a subdirectory of one of the directories in the CLASSPATH. The path name of the subdirectory is built from the package name of the type. For example, if the bootstrap class loader is searching for class java.lang.Object, it will look for Object.class in the java\lang subdirectory of each CLASSPATH directory.
In 1.2, the bootstrap class loader of Sun's Java 2 SDK only looks in the directory in which the system classes (the class files of the Java API) were installed. The bootstrap class loader of the implementation of the Java virtual machine from Sun's Java 2 SDK does not look on the CLASSPATH. In Sun's Java 2 SDK virtual machine, searching the class path is the job of the system class loader, a user-defined class loader that is created automatically when the virtual machine starts up. More information on the class loading scheme of Sun's Java 2 SDK is given in Chapter 8, "The Linking Model."
User-Defined Class Loaders
Although user-defined class loaders themselves are part of the Java application, four of the methods in class ClassLoader are gateways into the Java virtual machine:
// Four of the methods declared in class java.lang.ClassLoader:
protected final Class defineClass(String name, byte data[],
int offset, int length);
protected final Class defineClass(String name, byte data[],
int offset, int length, ProtectionDomain protectionDomain);
protected final Class findSystemClass(String name);
protected final void resolveClass(Class c);
Any Java virtual machine implementation must take care to connect these methods of class ClassLoader to the internal class loader subsystem.
The two overloaded defineClass() methods accept a byte array, data[], as input. Starting at position offset in the array and continuing for length bytes, class ClassLoader expects binary data conforming to the Java class file format--binary data that represents a new type for the running application -- with the fully qualified name specified in name. The type is assigned to either a default protection domain, if the first version of defineClass() is used, or to the protection domain object referenced by the protectionDomain parameter. Every Java virtual machine implementation must make sure the defineClass() method of class ClassLoader can cause a new type to be imported into the method area.
The findSystemClass() method accepts a String representing a fully qualified name of a type. When a user-defined class loader invokes this method in version 1.0 and 1.1, it is requesting that the virtual machine attempt to load the named type via its bootstrap class loader. If the bootstrap class loader has already loaded or successfully loads the type, it returns a reference to the Class object representing the type. If it can't locate the binary data for the type, it throws ClassNotFoundException. In version 1.2, the findSystemClass() method attempts to load the requested type from the system class loader. Every Java virtual machine implementation must make sure the findSystemClass() method can invoke the bootstrap (if version 1.0 or 1.1) or system (if version 1.2 or later) class loader in this way.
The resolveClass() method accepts a reference to a Class instance. This method causes the type represented by the Class instance to be linked (if it hasn't already been linked). The defineClass() method, described previously, only takes care of loading. (See the previous section, "Loading, Linking, and Initialization" for definitions of these terms.) When defineClass() returns a Class instance, the binary file for the type has definitely been located and imported into the method area, but not necessarily linked and initialized. Java virtual machine implementations must make sure the resolveClass() method of class ClassLoader can cause the class loader subsystem to perform linking.
The details of how a Java virtual machine performs class loading, linking, and initialization, with user- defined class loaders is given in Chapter 8, "The Linking Model."
Name Spaces
As mentioned in Chapter 3, "Security," each class loader maintains its own name space populated by the types it has loaded. Because each class loader has its own name space, a single Java application can load multiple types with the same fully qualified name. A type's fully qualified name, therefore, is not always enough to uniquely identify it inside a Java virtual machine instance. If multiple types of that same name have been loaded into different name spaces, the identity of the class loader that loaded the type (the identity of the name space it is in) will also be needed to uniquely identify that type.
Name spaces arise inside a Java virtual machine instance as a result of the process of resolution. As part of the data for each loaded type, the Java virtual machine keeps track of the class loader that imported the type. When the virtual machine needs to resolve a symbolic reference from one class to another, it requests the referenced class from the same class loader that loaded the referencing class. This process is described in detail in Chapter 8, "The Linking Model."
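The effect of separate name spaces can be observed from ordinary Java code. The following is a minimal sketch, not a definitive implementation: the names NameSpaceDemo, Payload, and IsolatingLoader are hypothetical, and it assumes the class file for Payload can be read as a resource from the loader that loaded the demo. Each IsolatingLoader defines its own copy of Payload with defineClass(), so two types with the same fully qualified name coexist in one virtual machine.

```java
import java.io.IOException;
import java.io.InputStream;

class NameSpaceDemo {
    // Hypothetical type that will be loaded into more than one name space.
    static class Payload {}

    // A user-defined class loader whose parent is the bootstrap loader only,
    // so it has to define Payload itself instead of delegating.
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() {
            super(null);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            ClassLoader source = NameSpaceDemo.class.getClassLoader();
            try (InputStream in = source.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] data = in.readAllBytes(); // requires Java 9 or later
                return defineClass(name, data, 0, data.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Loads Payload through a fresh IsolatingLoader (a fresh name space).
    static Class<?> loadIsolated() {
        try {
            return new IsolatingLoader().loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> first = loadIsolated();
        Class<?> second = loadIsolated();
        // Same fully qualified name, but distinct types in distinct name spaces.
        System.out.println(first.getName().equals(second.getName()));
        System.out.println(first == second);
    }
}
```

The name alone does not identify the type here; the pair of name and defining class loader does.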
The Method Area
Inside a Java virtual machine instance, information about loaded types is stored in a logical area of memory called the method area. When the Java virtual machine loads a type, it uses a class loader to locate the appropriate class file. The class loader reads in the class file--a linear stream of binary data--and passes it to the virtual machine. The virtual machine extracts information about the type from the binary data and stores the information in the method area. Memory for class (static) variables declared in the class is also taken from the method area.
The manner in which a Java virtual machine implementation represents type information internally is a decision of the implementation designer. For example, multi-byte quantities in class files are stored in big-endian (most significant byte first) order. When the data is imported into the method area, however, a virtual machine can store the data in any manner. If an implementation sits on top of a little-endian processor, the designers may decide to store multi-byte values in the method area in little-endian order.
The virtual machine will search through and use the type information stored in the method area as it executes the application it is hosting. Designers must attempt to devise data structures that will facilitate speedy execution of the Java application, but must also think of compactness. If designing an implementation that will operate under low memory constraints, designers may decide to trade off some execution speed in favor of compactness. If designing an implementation that will run on a virtual memory system, on the other hand, designers may decide to store redundant information in the method area to facilitate execution speed. (If the underlying host doesn't offer virtual memory, but does offer a hard disk, designers could create their own virtual memory system as part of their implementation.) Designers can choose whatever data structures and organization they feel optimize their implementation's performance, in the context of its requirements.
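To make the byte-order point concrete, here is a small sketch (the class name EndianDemo is hypothetical). It reads the four-byte magic number that begins every class file, 0xCAFEBABE, once in the big-endian order used by the class file format and once in little-endian order, to show how the same bytes yield a different value when the order is misread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // The first four bytes of every class file: the magic number 0xCAFEBABE.
    static final byte[] MAGIC = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};

    // Interprets the first four bytes as an int in the given byte order.
    static int read(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(read(MAGIC, ByteOrder.BIG_ENDIAN)));
        System.out.println(Integer.toHexString(read(MAGIC, ByteOrder.LITTLE_ENDIAN)));
    }
}
```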
All threads share the same method area, so access to the method area's data structures must be designed to be thread-safe. If two threads are attempting to find a class named Lava, for example, and Lava has not yet been loaded, only one thread should be allowed to load it while the other one waits.
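The "one thread loads, the other waits" rule can be modeled in plain Java. This is only a sketch of the idea, not how any particular virtual machine organizes its method area: TypeRegistry and its members are hypothetical names, and the "type data" is just a string. ConcurrentHashMap.computeIfAbsent() runs the loading function at most once per key, so a second thread asking for the same name does not load it again.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class TypeRegistry {
    private final ConcurrentHashMap<String, String> loaded = new ConcurrentHashMap<>();
    final AtomicInteger loadCount = new AtomicInteger();

    // Returns the data for the named type, loading it if no thread has done so yet.
    String findOrLoad(String name) {
        return loaded.computeIfAbsent(name, n -> {
            loadCount.incrementAndGet(); // stands in for reading and parsing a class file
            return "type data for " + n;
        });
    }

    // Lets two threads race to find the same type and reports how many loads happened.
    static int loadsAfterRace(String name) {
        TypeRegistry registry = new TypeRegistry();
        Thread first = new Thread(() -> registry.findOrLoad(name));
        Thread second = new Thread(() -> registry.findOrLoad(name));
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.loadCount.get();
    }

    public static void main(String[] args) {
        System.out.println(loadsAfterRace("Lava"));
    }
}
```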
The size of the method area need not be fixed. As the Java application runs, the virtual machine can expand and contract the method area to fit the application's needs. Also, the memory of the method area need not be contiguous. It could be allocated on a heap--even on the virtual machine's own heap. Implementations may allow users or programmers to specify an initial size for the method area, as well as a maximum or minimum size.
The method area can also be garbage collected. Because Java programs can be dynamically extended via user-defined class loaders, classes can become "unreferenced" by the application. If a class becomes unreferenced, a Java virtual machine can unload the class (garbage collect it) to keep the memory occupied by the method area at a minimum. The unloading of classes--including the conditions under which a class can become "unreferenced"--is described in Chapter 7, "The Lifetime of a Type."
Type Information
For each type it loads, a Java virtual machine must store the following kinds of information in the method area:
• The fully qualified name of the type
• The fully qualified name of the type's direct superclass (unless the type is an interface or class java.lang.Object, neither of which has a superclass)
• Whether the type is a class or an interface
• The type's modifiers (some subset of public, abstract, final)
• An ordered list of the fully qualified names of any direct superinterfaces
Inside the Java class file and Java virtual machine, type names are always stored as fully qualified names. In Java source code, a fully qualified name is the name of a type's package, plus a dot, plus the type's simple name. For example, the fully qualified name of class Object in package java.lang is java.lang.Object. In class files, the dots are replaced by slashes, as in java/lang/Object. In the method area, fully qualified names can be represented in whatever form and data structures a designer chooses.
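As a quick illustration of the two spellings, the sketch below (TypeNameDemo and internalForm() are hypothetical names) takes the dotted name that Class.getName() reports and converts it to the slash-separated form used in class files.

```java
class TypeNameDemo {
    // Converts a dotted fully qualified name to the slash-separated class file form.
    static String internalForm(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(Object.class.getName());
        System.out.println(internalForm(Object.class));
    }
}
```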
In addition to the basic type information listed previously, the virtual machine must also store for each loaded type:
• The constant pool for the type
• Field information
• Method information
• All class (static) variables declared in the type, except constants
• A reference to class ClassLoader
• A reference to class Class
This data is described in the following sections.
The Constant Pool
For each type it loads, a Java virtual machine must store a constant pool. A constant pool is an ordered set of constants used by the type, including literals (string, integer, and floating point constants) and symbolic references to types, fields, and methods. Entries in the constant pool are referenced by index, much like the elements of an array. Because it holds symbolic references to all types, fields, and methods used by a type, the constant pool plays a central role in the dynamic linking of Java programs. The constant pool is described in more detail later in this chapter and in Chapter 6, "The Java Class File."
Field Information
For each field declared in the type, the following information must be stored in the method area. In addition to the information for each field, the order in which the fields are declared by the class or interface must also be recorded. Here's the list for fields:
• The field's name
• The field's type
• The field's modifiers (some subset of public, private, protected, static, final, volatile, transient)
Method Information
For each method declared in the type, the following information must be stored in the method area. As with fields, the order in which the methods are declared by the class or interface must be recorded as well as the data. Here's the list:
• The method's name
• The method's return type (or void)
• The number and types (in order) of the method's parameters
• The method's modifiers (some subset of public, private, protected, static, final, synchronized, native, abstract)
In addition to the items listed previously, the following information must also be stored with each method that is not abstract or native:
• The method's bytecodes
• The sizes of the operand stack and local variables sections of the method's stack frame (these are described in a later section of this chapter)
• An exception table (this is described in Chapter 17, "Exceptions")
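Much of the field and method information listed above is visible to a running program through the java.lang.reflect API. The following sketch uses a hypothetical class Sample (along with the hypothetical helpers describeField() and describeMethod()) to print the name, type, and modifiers recorded for a field and a method. Note that reflection does not promise to return members in declaration order, so the sketch looks members up by name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.stream.Collectors;

class MemberInfoDemo {
    // Hypothetical type whose member information is inspected below.
    static class Sample {
        private static final int LIMIT = 10;
        protected volatile long counter;

        public synchronized int add(int a, int b) {
            return a + b;
        }
    }

    // Modifiers, type, and name of a field declared in Sample.
    static String describeField(String name) {
        try {
            Field f = Sample.class.getDeclaredField(name);
            return Modifier.toString(f.getModifiers()) + " "
                    + f.getType().getName() + " " + f.getName();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Modifiers, return type, name, and parameter types of a method declared in Sample.
    static String describeMethod(String name, Class<?>... parameterTypes) {
        try {
            Method m = Sample.class.getDeclaredMethod(name, parameterTypes);
            String params = Arrays.stream(m.getParameterTypes())
                    .map(Class::getName)
                    .collect(Collectors.joining(", "));
            return Modifier.toString(m.getModifiers()) + " "
                    + m.getReturnType().getName() + " " + m.getName() + "(" + params + ")";
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeField("LIMIT"));
        System.out.println(describeField("counter"));
        System.out.println(describeMethod("add", int.class, int.class));
    }
}
```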
Class Variables
Class variables are shared among all instances of a class and can be accessed even in the absence of any instance. These variables are associated with the class--not with instances of the class--so they are logically part of the class data in the method area. Before a Java virtual machine uses a class, it must allocate memory from the method area for each non-final class variable declared in the class.
Constants (class variables declared final) are not treated in the same way as non-final class variables. Every type that uses a final class variable gets a copy of the constant value in its own constant pool. As part of the constant pool, final class variables are stored in the method area--just like non-final class variables. But whereas non-final class variables are stored as part of the data for the type that declares them, final class variables are stored as part of the data for any type that uses them. This special treatment of constants is explained in more detail in Chapter 6, "The Java Class File."
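One observable consequence of this treatment can be sketched as follows, assuming the constant is a compile-time constant (a final class variable initialized with a constant expression). Because the using class carries its own copy of the value, reading the constant does not need to touch the declaring class at all, so the declaring class is not initialized; reading a non-final class variable does initialize it. The names ConstantInliningDemo, Holder, MAX, and counter are hypothetical.

```java
class ConstantInliningDemo {
    // Set to true by Holder's static initializer when Holder is initialized.
    static boolean holderInitialized = false;

    static class Holder {
        static final int MAX = 42; // compile-time constant, copied into users
        static int counter = 7;    // ordinary class variable, stored with Holder

        static {
            holderInitialized = true;
        }
    }

    // Reads the constant; the value comes from this class's own copy.
    static int readConstant() {
        return Holder.MAX;
    }

    // Reads the non-final class variable; this requires Holder to be initialized.
    static int readVariable() {
        return Holder.counter;
    }

    public static void main(String[] args) {
        System.out.println(readConstant() + " " + holderInitialized);
        System.out.println(readVariable() + " " + holderInitialized);
    }
}
```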
A Reference to Class ClassLoader
For each type it loads, a Java virtual machine must keep track of whether or not the type was loaded via the bootstrap class loader or a user-defined class loader. For those types loaded via a user-defined class loader, the virtual machine must store a reference to the user-defined class loader that loaded the type. This information is stored as part of the type's data in the method area.
The virtual machine uses this information during dynamic linking. When one type refers to another type, the virtual machine requests the referenced type from the same class loader that loaded the referencing type. This process of dynamic linking is also central to the way the virtual machine forms separate name spaces. To be able to properly perform dynamic linking and maintain multiple name spaces, the virtual machine needs to know what class loader loaded each type in its method area. The details of dynamic linking and name spaces are given in Chapter 8, "The Linking Model."
A Reference to Class Class
An instance of class java.lang.Class is created by the Java virtual machine for every type it loads. The virtual machine must in some way associate a reference to the Class instance for a type with the type's data in the method area.
Your Java programs can obtain and use references to Class objects. One static method in class Class allows you to get a reference to the Class instance for any loaded class:
// A method declared in class java.lang.Class:
public static Class forName(String className);
If you invoke forName("java.lang.Object"), for example, you will get a reference to the Class object that represents java.lang.Object. If you invoke forName("java.util.Enumeration"), you will get a reference to the Class object that represents the Enumeration interface from the java.util package. You can use forName() to get a Class reference for any loaded type from any package, so long as the type can be (or already has been) loaded into the current name space. If the virtual machine is unable to load the requested type into the current name space, forName() will throw ClassNotFoundException.
An alternative way to get a Class reference is to invoke getClass() on any object reference. This method is inherited by every object from class Object itself:
// A method declared in class java.lang.Object:
public final Class getClass();
If you have a reference to an object of class java.lang.Integer, for example, you could get the Class object for java.lang.Integer simply by invoking getClass() on your reference to the Integer object.
Given a reference to a Class object, you can find out information about the type by invoking methods declared in class Class. If you look at these methods, you will quickly realize that class Class gives the running application access to the information stored in the method area. Here are some of the methods declared in class Class:
// Some of the methods declared in class java.lang.Class:
public String getName();
public Class getSuperclass();
public boolean isInterface();
public Class[] getInterfaces();
public ClassLoader getClassLoader();
These methods just return information about a loaded type. getName() returns the fully qualified name of the type. getSuperclass() returns the Class instance for the type's direct superclass. If the type is class java.lang.Object or an interface, neither of which has a superclass, getSuperclass() returns null. isInterface() returns true if the Class object describes an interface, false if it describes a class. getInterfaces() returns an array of Class objects, one for each direct superinterface. The superinterfaces appear in the array in the order they are declared as superinterfaces by the type. If the type has no direct superinterfaces, getInterfaces() returns an array of length zero. getClassLoader() returns a reference to the ClassLoader object that loaded this type, or null if the type was loaded by the bootstrap class loader. All this information comes straight out of the method area.
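The methods described above can be tried out directly. In this short sketch, ClassInfoDemo and lookUp() are hypothetical names; lookUp() only wraps Class.forName() so that the checked ClassNotFoundException does not clutter the calling code.

```java
class ClassInfoDemo {
    // Wraps Class.forName(), converting the checked exception to an unchecked one.
    static Class<?> lookUp(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> object = lookUp("java.lang.Object");
        Class<?> enumeration = lookUp("java.util.Enumeration");
        Class<?> integer = Integer.valueOf(5).getClass();

        System.out.println(object.getSuperclass());          // Object has no superclass
        System.out.println(enumeration.isInterface());       // Enumeration is an interface
        System.out.println(integer.getName());               // fully qualified name
        System.out.println(integer.getSuperclass().getName());
        System.out.println(integer.getClassLoader());        // bootstrap-loaded types report null
    }
}
```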
Method Tables
The type information stored in the method area must be organized to be quickly accessible. In addition to the raw type information listed previously, implementations may include other data structures that speed up access to the raw data. One example of such a data structure is a method table. For each non-abstract class a Java virtual machine loads, it could generate a method table and include it as part of the class information it stores in the method area. A method table is an array of direct references to all the instance methods that may be invoked on a class instance, including instance methods inherited from superclasses. (A method table isn't helpful in the case of abstract classes or interfaces, because the program will never instantiate these.) A method table allows a virtual machine to quickly locate an instance method invoked on an object. Method tables are described in detail in Chapter 8, "The Linking Model."
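A method table can be pictured with a toy model. This is only a sketch of the idea under simplifying assumptions: the "direct references" are Java lambdas rather than pointers to method data, and the classes Lava and HotLava with methods flow() and cool() are hypothetical. A subclass starts with a copy of its superclass's table and replaces only the slots for methods it overrides, so invoking a method is a single array lookup.

```java
class MethodTableSketch {
    // Stands in for a direct reference to a method's code.
    interface Code {
        String run();
    }

    // Each instance method gets a fixed slot index in the table.
    static final int FLOW = 0;
    static final int COOL = 1;

    // Table for a hypothetical class Lava that declares flow() and cool().
    static final Code[] LAVA_TABLE = {
        () -> "Lava.flow",
        () -> "Lava.cool"
    };

    // Table for a hypothetical subclass HotLava that overrides flow() only:
    // it inherits every slot, then replaces the overridden one.
    static final Code[] HOT_LAVA_TABLE = LAVA_TABLE.clone();
    static {
        HOT_LAVA_TABLE[FLOW] = () -> "HotLava.flow";
    }

    // Invoking an instance method is one indexed lookup in the object's table.
    static String invoke(Code[] table, int slot) {
        return table[slot].run();
    }

    public static void main(String[] args) {
        System.out.println(invoke(HOT_LAVA_TABLE, FLOW));
        System.out.println(invoke(HOT_LAVA_TABLE, COOL));
    }
}
```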
An Example of Method Area Use
As an example of how the Java virtual machine uses the information it stores in the method area, consider these classes:
// On CD-ROM in file jvm/ex2/Lava.java
class Lava {

    private int speed = 5; // 5 kilometers per hour

    void flow() {
    }
}

// On CD-ROM in file jvm/ex2/Volcano.java
class Volcano {

    public static void main(String[] args) {
        Lava lava = new Lava();
        lava.flow();
    }
}
The following paragraphs describe how an implementation might execute the first instruction in the bytecodes for the main() method of the Volcano application. Different implementations of the Java virtual machine can operate in very different ways. The following description illustrates one way--but not the only way--a Java virtual machine could execute the first instruction of Volcano's main() method.
To run the Volcano application, you give the name "Volcano" to a Java virtual machine in an implementation-dependent manner. Given the name Volcano, the virtual machine finds and reads in file Volcano.class. It extracts the definition of class Volcano from the binary data in the imported class file and places the information into the method area. The virtual machine then invokes the main() method, by interpreting the bytecodes stored in the method area. As the virtual machine executes main(), it maintains a pointer to the constant pool (a data structure in the method area) for the current class (class Volcano).
Note that this Java virtual machine has already begun to execute the bytecodes for main() in class Volcano even though it hasn't yet loaded class Lava. Like many (probably most) implementations of the Java virtual machine, this implementation doesn't wait until all classes used by the application are loaded before it begins executing main(). It loads classes only as it needs them.
main()'s first instruction tells the Java virtual machine to allocate enough memory for the class listed in constant pool entry one. The virtual machine uses its pointer into Volcano's constant pool to look up entry one and finds a symbolic reference to class Lava. It checks the method area to see if Lava has already been loaded.
The symbolic reference is just a string giving the class's fully qualified name: "Lava". Here you can see that the method area must be organized so a class can be located--as quickly as possible--given only the class's fully qualified name. Implementation designers can choose whatever algorithm and data structures best fit their needs--a hash table, a search tree, anything. This same mechanism can be used by the static forName() method of class Class, which returns a Class reference given a fully qualified name.
When the virtual machine discovers that it hasn't yet loaded a class named "Lava," it proceeds to find and read in file Lava.class. It extracts the definition of class Lava from the imported binary data and places the information into the method area.
The Java virtual machine then replaces the symbolic reference in Volcano's constant pool entry one, which is just the string "Lava", with a pointer to the class data for Lava. If the virtual machine ever has to use Volcano's constant pool entry one again, it won't have to go through the relatively slow process of searching through the method area for class Lava given only a symbolic reference, the string "Lava". It can just use the pointer to more quickly access the class data for Lava. This process of replacing symbolic references with direct references (in this case, a native pointer) is called constant pool resolution. The symbolic reference is resolved into a direct reference by searching through the method area until the referenced entity is found, loading new classes if necessary.
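Constant pool resolution as described here can be modeled in a few lines. The sketch below is a toy model, not a real virtual machine's data structure: ResolutionSketch, TypeData, and the lookups counter are hypothetical names, the method area is a plain map, and "loading" a class just creates a TypeData record. An entry starts out as a string (the symbolic reference) and is overwritten with a direct reference the first time it is resolved.

```java
import java.util.HashMap;
import java.util.Map;

class ResolutionSketch {
    // Stands in for the class data kept in the method area.
    static final class TypeData {
        final String name;

        TypeData(String name) {
            this.name = name;
        }
    }

    private final Map<String, TypeData> methodArea = new HashMap<>();
    private final Object[] constantPool;
    int lookups = 0; // how many times the method area had to be searched

    ResolutionSketch(Object... entries) {
        constantPool = entries.clone();
    }

    // Returns the class data for the entry, resolving the symbolic reference on first use.
    TypeData resolve(int index) {
        Object entry = constantPool[index];
        if (entry instanceof TypeData) {
            return (TypeData) entry; // already a direct reference
        }
        lookups++;
        String symbolicName = (String) entry;
        // Search the method area, "loading" the class if it is not there yet.
        TypeData data = methodArea.computeIfAbsent(symbolicName, TypeData::new);
        constantPool[index] = data; // replace the symbolic reference
        return data;
    }

    public static void main(String[] args) {
        // Entry zero is unused; entry one holds the symbolic reference "Lava".
        ResolutionSketch volcanoPool = new ResolutionSketch(null, "Lava");
        System.out.println(volcanoPool.resolve(1).name);
        System.out.println(volcanoPool.resolve(1).name);
        System.out.println(volcanoPool.lookups);
    }
}
```

The second call to resolve(1) never searches the method area, which is the saving that resolution is meant to provide.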
Finally, the virtual machine is ready to actually allocate memory for a new Lava object. Once again, the virtual machine consults the information stored in the method area. It uses the pointer (which was just put into Volcano's constant pool entry one) to the Lava data (which was just imported into the method area) to find out how much heap space is required by a Lava object.
A Java virtual machine can always determine the amount of memory required to represent an object by looking into the class data stored in the method area. The actual amount of heap space required by a particular object, however, is implementation-dependent. The internal representation of objects inside a Java virtual machine is another decision of implementation designers. Object representation is discussed in more detail later in this chapter.
Once the Java virtual machine has determined the amount of heap space required by a Lava object, it allocates that space on the heap and initializes the instance variable speed to zero, its default initial value. If class Lava's superclass, Object, has any instance variables, those are also initialized to default initial values. (The details of initialization of both classes and objects are given in Chapter 7, "The Lifetime of a Type.")
The first instruction of main() completes by pushing a reference to the new Lava object onto the stack. A later instruction will use the reference to invoke Java code that initializes the speed variable to its proper initial value, five. Another instruction will use the reference to invoke the flow() method on the referenced Lava object.
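The two-step initialization of speed (first the default value zero, later the proper initial value five) can be made visible from Java code. The sketch below is one way to observe it, using a hypothetical superclass Base whose constructor calls a method that the Lava class overrides; the superclass constructor runs before Lava's own initializers, so it sees the default value.

```java
class InitOrderDemo {
    // Hypothetical superclass that peeks at the object while it is still being built.
    static class Base {
        final int seenDuringConstruction;

        Base() {
            seenDuringConstruction = peek();
        }

        int peek() {
            return -1;
        }
    }

    static class Lava extends Base {
        private int speed = 5; // assigned only after Base's constructor returns

        @Override
        int peek() {
            return speed;
        }

        int speed() {
            return speed;
        }
    }

    public static void main(String[] args) {
        Lava lava = new Lava();
        System.out.println(lava.seenDuringConstruction); // default initial value
        System.out.println(lava.speed());                // proper initial value
    }
}
```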
The Heap
Whenever a class instance or array is created in a running Java application, the memory for the new object is allocated from a single heap. As there is only one heap inside a Java virtual machine instance, all threads share it. Because a Java application runs inside its "own" exclusive Java virtual machine instance, there is a separate heap for every individual running application. There is no way two different Java applications could trample on each other's heap data. Two different threads of the same application, however, could trample on each other's heap data. This is why you must be concerned about proper synchronization of multi-threaded access to objects (heap data) in your Java programs.
The Java virtual machine has an instruction that allocates memory on the heap for a new object, but has no instruction for freeing that memory. Just as you can't explicitly free an object in Java source code, you can't explicitly free an object in Java bytecodes. The virtual machine itself is responsible for deciding whether and when to free memory occupied by objects that are no longer referenced by the running application. Usually, a Java virtual machine implementation uses a garbage collector to manage the heap.
Garbage Collection
A garbage collector's primary function is to automatically reclaim the memory used by objects that are no longer referenced by the running application. It may also move objects as the application runs to reduce heap fragmentation.
A garbage collector is not strictly required by the Java virtual machine specification. The specification only requires that an implementation manage its own heap in some manner. For example, an implementation could simply have a fixed amount of heap space available and throw an OutOfMemoryError when that space fills up. While this implementation may not win many prizes, it does qualify as a Java virtual machine. The Java virtual machine specification does not say how much memory an implementation must make available to running programs. It does not say how an implementation must manage its heap. It says to implementation designers only that the program will be allocating memory from the heap, but not freeing it. It is up to designers to figure out how they want to deal with that fact.
No garbage collection technique is dictated by the Java virtual machine specification. Designers can use whatever techniques seem most appropriate given their goals, constraints, and talents. Because references to objects can exist in many places--Java stacks, the heap, the method area, native method stacks--the choice of garbage collection technique heavily influences the design of an implementation's runtime data areas. Various garbage collection techniques are described in Chapter 9, "Garbage Collection."
As with the method area, the memory that makes up the heap need not be contiguous, and may be expanded and contracted as the running program progresses. An implementation's method area could, in fact, be implemented on top of its heap. In other words, when a virtual machine needs memory for a freshly loaded class, it could take that memory from the same heap on which objects reside. The same garbage collector that frees memory occupied by unreferenced objects could take care of finding and freeing (unloading) unreferenced classes. Implementations may allow users or programmers to specify an initial size for the heap, as well as a maximum and minimum size.
Object Representation
The Java virtual machine specification is silent on how objects should be represented on the heap. Object representation--an integral aspect of the overall design of the heap and garbage collector--is a decision of implementation designers.
The primary data that must in some way be represented for each object is the instance variables declared in the object's class and all its superclasses. Given an object reference, the virtual machine must be able to quickly locate the instance data for the object. In addition, there must be some way to access an object's class data (stored in the method area) given a reference to the object. For this reason, the memory allocated for an object usually includes some kind of pointer into the method area.
One possible heap design divides the heap into two parts: a handle pool and an object pool. An object reference is a native pointer to a handle pool entry. A handle pool entry has two components: a pointer to instance data in the object pool and a pointer to class data in the method area. The advantage of this scheme is that it makes it easy for the virtual machine to combat heap fragmentation. When the virtual machine moves an object in the object pool, it need only update one pointer with the object's new address: the relevant pointer in the handle pool. The disadvantage of this approach is that every access to an object's instance data requires dereferencing two pointers. This approach to object representation is shown graphically in Figure 5-5. This kind of heap is demonstrated interactively by the HeapOfFish applet, described in Chapter 9, "Garbage Collection."


Figure 5-5. Splitting an object across a handle pool and object pool.
Another design makes an object reference a native pointer to a bundle of data that contains the object's instance data and a pointer to the object's class data. This approach requires dereferencing only one pointer to access an object's instance data, but makes moving objects more complicated. When the virtual machine moves an object to combat fragmentation of this kind of heap, it must update every reference to that object anywhere in the runtime data areas. This approach to object representation is shown graphically in Figure 5-6.


Figure 5-6. Keeping object data all in one place.
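The trade-off of the handle pool design can be sketched with a toy heap. Everything here is a simplification for illustration: HandleHeapSketch, Handle, and the method names are hypothetical, instance data is an int array, "addresses" are array indices, and the pointer to class data is reduced to a class name. Moving an object updates a single handle, and every reference that goes through that handle still finds the object, at the cost of two lookups per field access.

```java
class HandleHeapSketch {
    // A handle pool entry: where the instance data lives, plus a link to class data.
    static final class Handle {
        int address;            // index into the object pool
        final String className; // stands in for a pointer into the method area

        Handle(int address, String className) {
            this.address = address;
            this.className = className;
        }
    }

    private final int[][] objectPool = new int[8][];

    // Places instance data at the given address and returns a handle to it.
    Handle allocate(String className, int address, int... fields) {
        objectPool[address] = fields.clone();
        return new Handle(address, className);
    }

    // Two dereferences: handle to address, then address to instance data.
    int readField(Handle handle, int index) {
        return objectPool[handle.address][index];
    }

    // Moves the object; only the handle needs to learn the new address.
    void move(Handle handle, int newAddress) {
        if (newAddress == handle.address) {
            return;
        }
        objectPool[newAddress] = objectPool[handle.address];
        objectPool[handle.address] = null;
        handle.address = newAddress;
    }

    public static void main(String[] args) {
        HandleHeapSketch heap = new HandleHeapSketch();
        Handle lava = heap.allocate("Lava", 0, 5);
        Handle alias = lava; // a second reference to the same object
        heap.move(lava, 7);
        System.out.println(heap.readField(alias, 0));
    }
}
```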
The virtual machine needs to get from an object reference to that object's class data for several reasons. When a running program attempts to cast an object reference to another type, the virtual machine must check to see if the type being cast to is the actual class of the referenced object or one of its supertypes. It must perform the same kind of check when a program performs an instanceof operation. In either case, the virtual machine must look into the class data of the referenced object. When a program invokes an instance method, the virtual machine must perform dynamic binding: it must choose the method to invoke based not on the type of the reference but on the class of the object. To do this, it must once again have access to the class data given only a reference to the object.
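All three situations can be seen in a few lines of Java. In this sketch the class and method names (CastCheckDemo, canCast(), castFails(), describe()) are hypothetical; each method receives only a reference of type Object, so whatever it learns about the object has to come from the object's own class data.

```java
class CastCheckDemo {
    // The same check instanceof performs: is the object's class the target or a subtype of it?
    static boolean canCast(Object ref, Class<?> target) {
        return target.isInstance(ref);
    }

    // Attempts a cast to String and reports whether the virtual machine rejected it.
    static boolean castFails(Object ref) {
        try {
            String s = (String) ref;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Dynamic binding: the toString() that runs is chosen by the object's class,
    // not by the declared type of the reference (Object).
    static String describe(Object ref) {
        return ref.toString();
    }

    public static void main(String[] args) {
        Object number = Integer.valueOf(5);
        System.out.println(canCast(number, Number.class));
        System.out.println(castFails(number));
        System.out.println(describe(number));
    }
}
```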
No matter what object representation an implementation uses, it is likely that a method table is close at hand for each object. Method tables, because they speed up the invocation of instance methods, can play an important role in achieving good overall performance for a virtual machine implementation. Method tables are not required by the Java virtual machine specification and may not exist in all implementations. Implementations that have extremely low memory requirements, for instance, may not be able to afford the extra memory space method tables occupy. If an implementation does use method tables, however, an object's method table will likely be quickly accessible given just a reference to the object.
One way an implementation could connect a method table to an object reference is shown graphically in Figure 5-7. This figure shows that the pointer kept with the instance data for each object points to a special structure. The special structure has two components:
• A pointer to the full class data for the object
• The method table for the object
The method table is an array of pointers to the data for each instance method that can be invoked on objects of that class. The method data pointed to by the method table includes:
• The sizes of the operand stack and local variables sections of the method's stack frame
• The method's bytecodes
• An exception table
This gives the virtual machine enough information to invoke the method. The method table includes pointers to data for methods declared explicitly in the object's class or inherited from superclasses. In other words, the pointers in the method table may point to methods defined in the object's class or any of its superclasses. More information on method tables is given in Chapter 8, "The Linking Model."


Figure 5-7. Keeping the method table close at hand.
If you are familiar with the inner workings of C++, you may recognize the method table as similar to the VTBL or virtual table of C++ objects. In C++, objects are represented by their instance data plus an array of pointers to any virtual functions that can be invoked on the object. This approach could also be taken by a Java virtual machine implementation. An implementation could include a copy of the method table for a class as part of the heap image for every instance of that class. This approach would consume more heap space than the approach shown in Figure 5-7, but might yield slightly better performance on systems that enjoy large quantities of available memory.
One other kind of data that is not shown in Figures 5-5 and 5-6, but which is logically part of an object's data on the heap, is the object's lock. Each object in a Java virtual machine is associated with a lock (or mutex) that a program can use to coordinate multi-threaded access to the object. Only one thread at a time can "own" an object's lock. While a particular thread owns a particular object's lock, only that thread can access that object's instance variables. All other threads that attempt to access the object's variables have to wait until the owning thread releases the object's lock. If a thread requests a lock that is already owned by another thread, the requesting thread has to wait until the owning thread releases the lock. Once a thread owns a lock, it can request the same lock again multiple times, but then has to release the lock the same number of times before it is made available to other threads. If a thread requests a lock three times, for example, that thread will continue to own the lock until it has released it three times.
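The "request three times, release three times" behavior can be checked with nested synchronized blocks. In this sketch, MonitorReentryDemo and reacquire() are hypothetical names; Thread.holdsLock() is the standard method that reports whether the current thread owns an object's lock.

```java
class MonitorReentryDemo {
    // Acquires the lock on the given object the requested number of times (by nesting
    // synchronized blocks), then reports whether the current thread owns the lock.
    static boolean reacquire(Object lock, int times) {
        if (times == 0) {
            return Thread.holdsLock(lock);
        }
        synchronized (lock) {
            return reacquire(lock, times - 1);
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        System.out.println(reacquire(lock, 3));      // owned while nested three deep
        System.out.println(Thread.holdsLock(lock));  // released the same number of times
    }
}
```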
Many objects will go through their entire lifetimes without ever being locked by a thread. The data required to implement an object's lock is not needed unless the lock is actually requested by a thread. As a result, many implementations, such as the ones shown in Figures 5-5 and 5-6, may not include a pointer to "lock data" within the object itself. Such implementations must create the necessary data to represent a lock when the lock is requested for the first time. In this scheme, the virtual machine must associate the lock with the object in some indirect way, such as by placing the lock data into a search tree based on the object's address.
Along with data that implements a lock, every Java object is logically associated with data that implements a wait set. Whereas locks help threads to work independently on shared data without interfering with one another, wait sets help threads to cooperate with one another--to work together towards a common goal.
Wait sets are used in conjunction with wait and notify methods. Every class inherits from Object three "wait methods" (overloaded forms of a method named wait()) and two "notify methods" (notify() and notifyAll()). When a thread invokes a wait method on an object, the Java virtual machine suspends that thread and adds it to that object's wait set. When a thread invokes a notify method on an object, the virtual machine will at some future time wake up one or more threads from that object's wait set. As with the data that implements an object's lock, the data that implements an object's wait set is not needed unless a wait or notify method is actually invoked on the object. As a result, many implementations of the Java virtual machine may keep the wait set data separate from the actual object data. Such implementations could allocate the data needed to represent an object's wait set when a wait or notify method is first invoked on that object by the running application. For more information about locks and wait sets, see Chapter 20, "Thread Synchronization."
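A minimal sketch of threads cooperating through a wait set follows. The class WaitNotifyDemo and its methods are hypothetical: one thread hands a message to another through a single-slot box. The taking thread waits in the box's wait set until a message is present, and the putting thread wakes it with notifyAll().

```java
class WaitNotifyDemo {
    private String message; // guarded by this object's lock

    // Stores a message and wakes any threads in this object's wait set.
    synchronized void put(String text) {
        message = text;
        notifyAll();
    }

    // Waits until a message is available, then removes and returns it.
    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait(); // releases the lock and joins this object's wait set
        }
        String result = message;
        message = null;
        return result;
    }

    // Passes the text from a producer thread to the calling thread.
    static String handOff(String text) {
        WaitNotifyDemo box = new WaitNotifyDemo();
        Thread producer = new Thread(() -> box.put(text));
        producer.start();
        try {
            return box.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handOff("flow"));
    }
}
```

The loop around wait() matters: a thread should re-check its condition after waking rather than assume the condition holds.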
One last example of a type of data that may be included as part of the image of an object on the heap is any data needed by the garbage collector. The garbage collector must in some way keep track of which objects are referenced by the program. This task invariably requires data to be kept for each object on the heap. The kind of data required depends upon the garbage collection technique being used. For example, if an implementation uses a mark and sweep algorithm, it must be able to mark an object as referenced or unreferenced. For each unreferenced object, it may also need to indicate whether or not the object's finalizer has been run. As with thread locks, this data may be kept separate from the object image. Some garbage collection techniques only require this extra data while the garbage collector is actually running. A mark and sweep algorithm, for instance, could potentially use a separate bitmap for marking referenced and unreferenced objects. More detail on various garbage collection techniques, and the data that is required by each of them, is given in Chapter 9, "Garbage Collection."
In addition to data that a garbage collector uses to distinguish between referenced and unreferenced objects, a garbage collector needs data to keep track of the objects on which it has already executed a finalizer. A garbage collector must run the finalizer of any object whose class declares one before it reclaims the memory occupied by that object. The Java language specification states that a garbage collector will only execute an object's finalizer once, but allows that finalizer to "resurrect" the object: to make the object referenced again. When the object becomes unreferenced for a second time, the garbage collector must not finalize it again. Because most objects will likely not have a finalizer, and very few of those will resurrect their objects, this scenario of garbage collecting the same object twice will probably be extremely rare. As a result, the data used to keep track of objects that have already been finalized, though logically part of the data associated with an object, will likely not be part of the object representation on the heap. In most cases, garbage collectors will keep this information in a separate place. Chapter 9, "Garbage Collection," gives more information about finalization.
Array Representation
In Java, arrays are full-fledged objects. Like objects, arrays are always stored on the heap. Also like objects, implementation designers can decide how they want to represent arrays on the heap.
Arrays have a Class instance associated with their class, just like any other object. All arrays of the same dimension and type have the same class. The length of an array (or the lengths of each dimension of a multidimensional array) does not play any role in establishing the array's class. For example, an array of three ints has the same class as an array of three hundred ints. The length of an array is considered part of its instance data.
The name of an array's class has one open square bracket for each dimension plus a letter or string representing the array's type. For example, the class name for an array of ints is "[I". The class name for a three-dimensional array of bytes is "[[[B". The class name for a two-dimensional array of Objects is "[[Ljava.lang.Object;". The full details of this naming convention for array classes are given in Chapter 6, "The Java Class File."
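These names can be observed from a running program through Class.getName(). The short sketch below uses a hypothetical helper class, ArrayClassNames, to print them and to confirm that length plays no part in an array's class.

```java
final class ArrayClassNames {
    // Returns the runtime class name of any array (or other object).
    static String nameOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new int[3]));         // [I
        System.out.println(nameOf(new byte[1][1][1]));  // [[[B
        System.out.println(nameOf(new Object[2][2]));   // [[Ljava.lang.Object;
        // Arrays of the same type and dimension share one class, whatever their length.
        System.out.println(new int[3].getClass() == new int[300].getClass()); // true
    }
}
```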
Multi-dimensional arrays are represented as arrays of arrays. A two dimensional array of ints, for example, would be represented by a one dimensional array of references to several one dimensional arrays of ints. This is shown graphically in Figure 5-8.


Figure 5-8. One possible heap representation for arrays.
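Because a two-dimensional array is an array of references to row arrays, the rows can have different lengths and can even be shared. The following sketch, with a hypothetical class name, shows this at the source level.

```java
final class ArrayOfArrays {
    // Builds a two-dimensional int array whose rows have different lengths.
    static int[][] jagged() {
        int[][] grid = new int[3][];  // one array of three references, all null so far
        grid[0] = new int[1];
        grid[1] = new int[4];
        grid[2] = grid[1];            // two slots may refer to the same row array
        return grid;
    }
}
```

A write through grid[2] is then visible through grid[1], since both slots refer to the same one-dimensional array on the heap.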
The data that must be kept on the heap for each array is the array's length, the array data, and some kind of reference to the array's class data. Given a reference to an array, the virtual machine must be able to determine the array's length, to get and set its elements by index (checking to make sure the array bounds are not exceeded), and to invoke any methods declared by Object, the direct superclass of all arrays.
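The bounds check and the Object superclass are both visible from Java code. A small sketch, using a hypothetical ArrayBasics class:

```java
final class ArrayBasics {
    // Returns true if reading array[index] is rejected by the bounds check.
    static boolean outOfBounds(int[] array, int index) {
        try {
            int ignored = array[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // The direct superclass of every array class is java.lang.Object.
    static Class<?> superclassOf(Object array) {
        return array.getClass().getSuperclass();
    }
}
```

For an array of three ints, index 2 is accepted and index 3 is rejected, and superclassOf returns Object.class.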
The Program Counter
Each thread of a running program has its own pc register, or program counter, which is created when the thread is started. The pc register is one word in size, so it can hold both a native pointer and a returnAddress. As a thread executes a Java method, the pc register contains the address of the current instruction being executed by the thread. An "address" can be a native pointer or an offset from the beginning of a method's bytecodes. If a thread is executing a native method, the value of the pc register is undefined.
Chapter 5 of Inside the Java Virtual Machine
The Java Virtual Machine
by Bill Venners
The Java Stack
When a new thread is launched, the Java virtual machine creates a new Java stack for the thread. As mentioned earlier, a Java stack stores a thread's state in discrete frames. The Java virtual machine only performs two operations directly on Java Stacks: it pushes and pops frames.
The method that is currently being executed by a thread is the thread's current method. The stack frame for the current method is the current frame. The class in which the current method is defined is called the current class, and the current class's constant pool is the current constant pool. As it executes a method, the Java virtual machine keeps track of the current class and current constant pool. When the virtual machine encounters instructions that operate on data stored in the stack frame, it performs those operations on the current frame.
When a thread invokes a Java method, the virtual machine creates and pushes a new frame onto the thread's Java stack. This new frame then becomes the current frame. As the method executes, it uses the frame to store parameters, local variables, intermediate computations, and other data.
A method can complete in either of two ways. If a method completes by returning, it is said to have normal completion. If it completes by throwing an exception, it is said to have abrupt completion. When a method completes, whether normally or abruptly, the Java virtual machine pops and discards the method's stack frame. The frame for the previous method then becomes the current frame.
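One rough way to watch frames being pushed and popped is to count the entries in a thread's stack trace. The sketch below is illustrative only: FrameDepthDemo and its methods are hypothetical, and the Thread.getStackTrace() documentation allows a virtual machine to omit frames, so the exact counts are an assumption that holds on typical implementations rather than a guarantee.

```java
final class FrameDepthDemo {
    // Number of frames the virtual machine reports for the calling thread.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Completes normally: returns the depth observed inside its own frame.
    static int completesNormally() {
        return depth();
    }

    // Completes abruptly: always throws instead of returning.
    static int completesAbruptly() {
        throw new IllegalStateException("depth inside = " + depth());
    }

    // Returns {before, inside, afterNormal, afterAbrupt}.
    static int[] observe() {
        int before = depth();
        int inside = completesNormally();
        int afterNormal = depth();
        try {
            completesAbruptly();
        } catch (IllegalStateException expected) {
            // the callee's frame has already been popped and discarded
        }
        int afterAbrupt = depth();
        return new int[] {before, inside, afterNormal, afterAbrupt};
    }
}
```

The depth observed inside the callee should be one greater than in the caller, and it should return to the caller's value whether the callee returned or threw.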
All the data on a thread's Java stack is private to that thread. There is no way for a thread to access or alter the Java stack of another thread. Because of this, you need never worry about synchronizing multi-threaded access to local variables in your Java programs. When a thread invokes a method, the method's local variables are stored in a frame on the invoking thread's Java stack. Only one thread can ever access those local variables: the thread that invoked the method.
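A brief sketch of the consequence: two threads can run the same method at the same time without any synchronization on its local variables, because each invocation gets its own frame. The class and method names here are hypothetical.

```java
final class LocalsArePrivate {
    // "total" and "i" live in the frame of whichever thread calls this method.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Runs sumTo on two threads at once; each thread writes only its own result slot.
    static long[] runOnTwoThreads(int n) {
        long[] results = new long[2];
        Thread a = new Thread(() -> results[0] = sumTo(n));
        Thread b = new Thread(() -> results[1] = sumTo(n));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Only the shared results array, which lives on the heap, needs any coordination; here the join() calls make both results visible to the caller.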
Like the method area and heap, the Java stack and stack frames need not be contiguous in memory. Frames could be allocated on a contiguous stack, or they could be allocated on a heap, or some combination of both. The actual data structures used to represent the Java stack and stack frames are a decision of implementation designers. Implementations may allow users or programmers to specify an initial size for Java stacks, as well as a maximum or minimum size.
The Stack Frame
The stack frame has three parts: local variables, operand stack, and frame data. The sizes of the local variables and operand stack, which are measured in words, depend upon the needs of each individual method. These sizes are determined at compile time and included in the class file data for each method. The size of the frame data is implementation dependent.
When the Java virtual machine invokes a Java method, it checks the class data to determine the number of words required by the method in the local variables and operand stack. It creates a stack frame of the proper size for the method and pushes it onto the Java stack.
Local Variables
The local variables section of the Java stack frame is organized as a zero-based array of words. Instructions that use a value from the local variables section provide an index into the zero-based array. Values of type int, float, reference, and returnAddress occupy one entry in the local variables array. Values of type byte, short, and char are converted to int before being stored into the local variables. Values of type long and double occupy two consecutive entries in the array.
To refer to a long or double in the local variables, instructions provide the index of the first of the two consecutive entries occupied by the value. For example, if a long occupies array entries three and four, instructions would refer to that long by index three. All values in the local variables are word-aligned. Dual-entry longs and doubles can start at any index.
The local variables section contains a method's parameters and local variables. Compilers place the parameters into the local variable array first, in the order in which they are declared. Figure 5-9 shows the local variables section for the following two methods:
// On CD-ROM in file jvm/ex3/Example3a.java
class Example3a {

    public static int runClassMethod(int i, long l, float f,
        double d, Object o, byte b) {

        return 0;
    }

    public int runInstanceMethod(char c, double d, short s,
        boolean b) {

        return 0;
    }
}
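The indexes that the parameters of these two methods receive can be worked out from the rules above. The sketch below uses a hypothetical LocalSlots helper; it also assumes, following the Java virtual machine specification rather than the text above, that entry zero of an instance method holds the hidden this reference.

```java
import java.util.ArrayList;
import java.util.List;

final class LocalSlots {
    // Returns the index of the first local variable entry used by each parameter.
    // paramTypes holds Java type names; long and double take two entries, all others one.
    static List<Integer> slotsFor(boolean isStatic, String... paramTypes) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1; // entry 0 of an instance method holds "this"
        for (String type : paramTypes) {
            slots.add(next);
            next += (type.equals("long") || type.equals("double")) ? 2 : 1;
        }
        return slots;
    }
}
```

For runClassMethod() this yields indexes 0, 1, 3, 4, 6, and 7, and for runInstanceMethod() it yields 1, 2, 4, and 5.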
Figure 5-11 shows three snapshots of the Java stack for a thread that invokes the addAndPrint() method. In the implementation of the Java virtual machine represented in this figure, each frame is allocated separately from a heap. To invoke the addTwoTypes() method, the addAndPrint() method first pushes an int one and double 88.88 onto its operand stack. It then invokes the addTwoTypes() method.


Figure 5-11. Allocating frames from a heap.
The instruction to invoke addTwoTypes() refers to a constant pool entry. The Java virtual machine looks up the entry and resolves it if necessary.
Note that the addAndPrint() method uses the constant pool to identify the addTwoTypes() method, even though it is part of the same class. Like references to fields and methods of other classes, references to the fields and methods of the same class are initially symbolic and must be resolved before they are used.
The resolved constant pool entry points to information in the method area about the addTwoTypes() method. The virtual machine uses this information to determine the sizes required by addTwoTypes() for the local variables and operand stack. In the class file generated by Sun's javac compiler from the JDK 1.1, addTwoTypes() requires three words in the local variables and four words in the operand stack. (As mentioned earlier, the size of the frame data portion is implementation dependent.) The virtual machine allocates enough memory for the addTwoTypes() frame from a heap. It then pops the double and int parameters (88.88 and one) from addAndPrint()'s operand stack and places them into addTwoTypes()'s local variable slots one and zero.
When addTwoTypes() returns, it first pushes the double return value (in this case, 89.88) onto its operand stack. The virtual machine uses the information in the frame data to locate the stack frame of the invoking method, addAndPrint(). It pushes the double return value onto addAndPrint()'s operand stack and frees the memory occupied by addTwoTypes()'s frame. It makes addAndPrint()'s frame current and continues executing the addAndPrint() method at the first instruction past the addTwoTypes() method invocation.
Figure 5-12 shows snapshots of the Java stack of a different virtual machine implementation executing the same methods. Instead of allocating each frame separately from a heap, this implementation allocates frames from a contiguous stack. This approach allows the implementation to overlap the frames of adjacent methods. The portion of the invoking method's operand stack that contains the parameters to the invoked method becomes the base of the invoked method's local variables. In this example, addAndPrint()'s entire operand stack becomes addTwoTypes()'s entire local variables section.


Figure 5-12. Allocating frames from a contiguous stack.
This approach saves memory space because the same memory is used by the calling method to store the parameters as is used by the invoked method to access the parameters. It saves time because the Java virtual machine doesn't have to spend time copying the parameter values from one frame to another.
Note that the operand stack of the current frame is always at the "top" of the Java stack. Although this may be easier to visualize in the contiguous memory implementation of Figure 5-12, it is true no matter how the Java stack is implemented. (As mentioned earlier, in all the graphical images of the stack shown in this book, the stack grows downwards. The "top" of the stack is always shown at the bottom of the picture.) Instructions that push values onto (or pop values off of) the operand stack always operate on the current frame. Thus, pushing a value onto the operand stack can be seen as pushing a value onto the top of the entire Java stack. In the remainder of this book, "pushing a value onto the stack" refers to pushing a value onto the operand stack of the current frame.
One other possible approach to implementing the Java stack is a hybrid of the two approaches shown in Figure 5-11 and Figure 5-12. A Java virtual machine implementation can allocate a chunk of contiguous memory from a heap when a thread starts. In this memory, the virtual machine can use the overlapping frames approach shown in Figure 5-12. If the stack outgrows the contiguous memory, the virtual machine can allocate another chunk of contiguous memory from the heap. It can use the separate frames approach shown in Figure 5-11 to connect the invoking method's frame sitting in the old chunk with the invoked method's frame sitting in the new chunk. Within the new chunk, it can once again use the contiguous memory approach.
Native Method Stacks
In addition to all the runtime data areas defined by the Java virtual machine specification and described previously, a running Java application may use other data areas created by or for native methods. When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the Java virtual machine no longer hamper its freedom. A native method can likely access the runtime data areas of the virtual machine (it depends upon the native method interface), but can also do anything else it wants. It may use registers inside the native processor, allocate memory on any number of native heaps, or use any kind of stack.
Native methods are inherently implementation dependent. Implementation designers are free to decide what mechanisms they will use to enable a Java application running on their implementation to invoke native methods.
Any native method interface will use some kind of native method stack. When a thread invokes a Java method, the virtual machine creates a new frame and pushes it onto the Java stack. When a thread invokes a native method, however, that thread leaves the Java stack behind. Instead of pushing a new frame onto the thread's Java stack, the Java virtual machine will simply dynamically link to and directly invoke the native method. One way to think of it is that the Java virtual machine is dynamically extending itself with native code. It is as if the Java virtual machine implementation is just calling another (dynamically linked) method within itself, at the behest of the running Java program.
If an implementation's native method interface uses a C-linkage model, then the native method stacks are C stacks. When a C program invokes a C function, the stack operates in a certain way. The arguments to the function are pushed onto the stack in a certain order. The return value is passed back to the invoking function in a certain way. This would be the behavior of the native method stacks in that implementation.
A native method interface will likely (once again, it is up to the designers to decide) be able to call back into the Java virtual machine and invoke a Java method. In this case, the thread leaves the native method stack and enters another Java stack.
Figure 5-13 shows a graphical depiction of a thread that invokes a native method that calls back into the virtual machine to invoke another Java method. This figure shows the full picture of what a thread can expect inside the Java virtual machine. A thread may spend its entire lifetime executing Java methods, working with frames on its Java stack. Or, it may jump back and forth between the Java stack and native method stacks.


Figure 5-13. The stack for a thread that invokes Java and native methods.
As depicted in Figure 5-13, a thread first invoked two Java methods, the second of which invoked a native method. This act caused the virtual machine to use a native method stack. In this figure, the native method stack is shown as a finite amount of contiguous memory space. Assume it is a C stack. The stack area used by each C-linkage function is shown in gray and bounded by a dashed line. The first C-linkage function, which was invoked as a native method, invoked another C-linkage function. The second C-linkage function invoked a Java method through the native method interface. This Java method invoked another Java method, which is the current method shown in the figure.
As with the other runtime memory areas, the memory occupied by native method stacks need not be of a fixed size. It can expand and contract as needed by the running application. Implementations may allow users or programmers to specify an initial size for the native method stacks, as well as a maximum or minimum size.
Execution Engine
At the core of any Java virtual machine implementation is its execution engine. In the Java virtual machine specification, the behavior of the execution engine is defined in terms of an instruction set. For each instruction, the specification describes in detail what an implementation should do when it encounters the instruction as it executes bytecodes, but says very little about how. As mentioned in previous chapters, implementation designers are free to decide how their implementations will execute bytecodes. Their implementations can interpret, just-in-time compile, execute natively in silicon, use a combination of these, or dream up some brand new technique.
Similar to the three senses of the term "Java virtual machine" described at the beginning of this chapter, the term "execution engine" can also be used in any of three senses: an abstract specification, a concrete implementation, or a runtime instance. The abstract specification defines the behavior of an execution engine in terms of the instruction set. Concrete implementations, which may use a variety of techniques, are either software, hardware, or a combination of both. A runtime instance of an execution engine is a thread.
Each thread of a running Java application is a distinct instance of the virtual machine's execution engine. From the beginning of its lifetime to the end, a thread is either executing bytecodes or native methods. A thread may execute bytecodes directly, by interpreting or executing natively in silicon, or indirectly, by just-in-time compiling and executing the resulting native code. A Java virtual machine implementation may use other threads invisible to the running application, such as a thread that performs garbage collection. Such threads need not be "instances" of the implementation's execution engine. All threads that belong to the running application, however, are execution engines in action.
The Instruction Set
A method's bytecode stream is a sequence of instructions for the Java virtual machine. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the operation to be performed. Operands supply extra information needed by the Java virtual machine to perform the operation specified by the opcode. The opcode itself indicates whether or not it is followed by operands, and the form the operands (if any) take. Many Java virtual machine instructions take no operands, and therefore consist only of an opcode. Depending upon the opcode, the virtual machine may refer to data stored in other areas in addition to (or instead of) operands that trail the opcode. When it executes an instruction, the virtual machine may use entries in the current constant pool, entries in the current frame's local variables, or values sitting on the top of the current frame's operand stack.
The abstract execution engine runs by executing bytecodes one instruction at a time. This process takes place for each thread (execution engine instance) of the application running in the Java virtual machine. An execution engine fetches an opcode and, if that opcode has operands, fetches the operands. It executes the action requested by the opcode and its operands, then fetches another opcode. Execution of bytecodes continues until a thread completes either by returning from its starting method or by not catching a thrown exception.
From time to time, the execution engine may encounter an instruction that requests a native method invocation. On such occasions, the execution engine will dutifully attempt to invoke that native method. When the native method returns (if it completes normally, not by throwing an exception), the execution engine will continue executing the next instruction in the bytecode stream.
One way to think of native methods, therefore, is as programmer-customized extensions to the Java virtual machine's instruction set. If an instruction requests an invocation of a native method, the execution engine invokes the native method. Running the native method is how the Java virtual machine executes the instruction. When the native method returns, the virtual machine moves on to the next instruction. If the native method completes abruptly (by throwing an exception), the virtual machine follows the same steps to handle the exception as it does when any instruction throws an exception.
Part of the job of executing an instruction is determining the next instruction to execute. An execution engine determines the next opcode to fetch in one of three ways. For many instructions, the next opcode to execute directly follows the current opcode and its operands, if any, in the bytecode stream. For some instructions, such as goto or return, the execution engine determines the next opcode as part of its execution of the current instruction. If an instruction throws an exception, the execution engine determines the next opcode to fetch by searching for an appropriate catch clause.
Several instructions can throw exceptions. The athrow instruction, for example, throws an exception explicitly. This instruction is the compiled form of the throw statement in Java source code. Every time the athrow instruction is executed, it will throw an exception. Other instructions throw exceptions only when certain conditions are encountered. For example, if the Java virtual machine discovers, to its chagrin, that the program is attempting to perform an integer divide by zero, it will throw an ArithmeticException. This can occur while executing any of four instructions--idiv, ldiv, irem, and lrem--which perform divisions or calculate remainders on ints or longs.
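The conditional case is easy to reproduce from source code. In the sketch below, with hypothetical class and method names, the integer division throws, while the floating-point division, which is not among the four instructions listed, does not.

```java
final class DivideByZero {
    // Integer division by zero makes the virtual machine throw ArithmeticException.
    static boolean intDivisionThrows(int dividend) {
        int zero = 0;
        try {
            int ignored = dividend / zero;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Floating-point division by zero does not throw; it yields an infinity or NaN.
    static double doubleDivision(double dividend) {
        return dividend / 0.0;
    }
}
```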
Each type of opcode in the Java virtual machine's instruction set has a mnemonic. In the typical assembly language style, streams of Java bytecodes can be represented by their mnemonics followed by (optional) operand values.
For an example of a method's bytecode stream and mnemonics, consider the doMathForever() method of this class:
// On CD-ROM in file jvm/ex4/Act.java
class Act {

    public static void doMathForever() {
        int i = 0;
        for (;;) {
            i += 1;
            i *= 2;
        }
    }
}
The stream of bytecodes for doMathForever() can be disassembled into mnemonics as shown next. The Java virtual machine specification does not define any official syntax for representing the mnemonics of a method's bytecodes. The code shown next illustrates the manner in which streams of bytecode mnemonics will be represented in this book. The left hand column shows the offset in bytes from the beginning of the method's bytecodes to the start of each instruction. The center column shows the instruction and any operands. The right hand column contains comments, which are preceded with a double slash, just as in Java source code.
// Bytecode stream: 03 3b 84 00 01 1a 05 68 3b a7 ff f9
// Disassembly:
// Method void doMathForever()
// Left column: offset of instruction from beginning of method
// |   Center column: instruction mnemonic and any operands
// |   |                 Right column: comment
   0   iconst_0          // 03
   1   istore_0          // 3b
   2   iinc 0, 1         // 84 00 01
   5   iload_0           // 1a
   6   iconst_2          // 05
   7   imul              // 68
   8   istore_0          // 3b
   9   goto 2            // a7 ff f9
This way of representing mnemonics is very similar to the output of the javap program of Sun's Java 2 SDK. javap allows you to look at the bytecode mnemonics of the methods of any class file. Note that jump addresses are given as offsets from the beginning of the method. The goto instruction causes the virtual machine to jump to the instruction at offset two (an iinc). The actual operand in the stream is minus seven. To execute this instruction, the virtual machine adds the operand to the current contents of the pc register. The result is the address of the iinc instruction at offset two. To make the mnemonics easier to read, the operands for jump instructions are shown as if the addition has already taken place. Instead of saying "goto -7," the mnemonics say, "goto 2."
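To make the fetch-and-execute cycle concrete, here is a deliberately tiny interpreter that understands only the seven opcodes used by doMathForever(). It is a sketch under stated assumptions, not a description of any real execution engine: the TinyInterpreter class is hypothetical, and the maxGotos cutoff is an addition so that the otherwise endless loop can be stopped and inspected.

```java
final class TinyInterpreter {
    // The bytecode stream of doMathForever(), copied from the listing above.
    static final byte[] DO_MATH_FOREVER = {
        0x03, 0x3b, (byte) 0x84, 0x00, 0x01, 0x1a, 0x05, 0x68, 0x3b,
        (byte) 0xa7, (byte) 0xff, (byte) 0xf9
    };

    // Executes code until the goto instruction has been reached maxGotos times,
    // then returns the value of local variable 0.
    static int run(byte[] code, int maxGotos) {
        int[] locals = new int[1];
        int[] stack = new int[2];
        int sp = 0;      // number of words currently on the operand stack
        int pc = 0;      // offset of the current instruction
        int gotos = 0;
        while (true) {
            int opcode = code[pc] & 0xff;                             // fetch the opcode
            switch (opcode) {
                case 0x03: stack[sp++] = 0; pc += 1; break;           // iconst_0
                case 0x05: stack[sp++] = 2; pc += 1; break;           // iconst_2
                case 0x1a: stack[sp++] = locals[0]; pc += 1; break;   // iload_0
                case 0x3b: locals[0] = stack[--sp]; pc += 1; break;   // istore_0
                case 0x68: {                                          // imul
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    pc += 1;
                    break;
                }
                case 0x84: {                                          // iinc index, const
                    int index = code[pc + 1] & 0xff;
                    locals[index] += code[pc + 2];                    // signed byte constant
                    pc += 3;
                    break;
                }
                case 0xa7: {                                          // goto branchoffset
                    if (++gotos >= maxGotos) {
                        return locals[0];
                    }
                    // The operand is a signed 16-bit offset added to the pc (here -7).
                    pc += (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
                    break;
                }
                default:
                    throw new IllegalStateException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }
}
```

Calling run(DO_MATH_FOREVER, 3) stops at the third goto, by which point the local variable has taken the values 2, 6, and 14 in turn.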
The central focus of the Java virtual machine's instruction set is the operand stack. Values are generally pushed onto the operand stack before they are used. Although the Java virtual machine has no registers for storing arbitrary values, each method has a set of local variables. The instruction set treats the local variables, in effect, as a set of registers that are referred to by indexes. Nevertheless, other than the iinc instruction, which increments a local variable directly, values stored in the local variables must be moved to the operand stack before being used.
For example, to divide one local variable by another, the virtual machine must push both onto the stack, perform the division, and then store the result back into the local variables. To move the value of an array element or object field into a local variable, the virtual machine must first push the value onto the stack, then store it into the local variable. To set an array element or object field to a value stored in a local variable, the virtual machine must follow the reverse procedure. First, it must push the value of the local variable onto the stack, then pop it off the stack and into the array element or object field on the heap.
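The division case can be seen in a small method. The bytecode in the comment is what a javac-style compiler would typically emit for this static method; the exact output depends on the compiler, so treat it as an assumption and check it with javap -c. The class name is hypothetical.

```java
final class StackCentered {
    // Typical compiled form (local variables 0 and 1 hold a and b, 2 holds c):
    //   iload_0     // push a onto the operand stack
    //   iload_1     // push b
    //   idiv        // pop both, push a / b
    //   istore_2    // pop the quotient into local variable 2
    //   iload_2     // push c again
    //   ireturn     // return the int on top of the operand stack
    static int divide(int a, int b) {
        int c = a / b;
        return c;
    }
}
```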
Several goals--some conflicting--guided the design of the Java virtual machine's instruction set. These goals are basically the same as those described in Part I of this book as the motivation behind Java's entire architecture: platform independence, network mobility, and security.
The platform independence goal was a major influence in the design of the instruction set. The instruction set's stack-centered approach, described previously, was chosen over a register-centered approach to facilitate efficient implementation on architectures with few or irregular registers, such as the Intel 80X86. This feature of the instruction set--the stack-centered design--makes it easier to implement the Java virtual machine on a wide variety of host architectures.
Another motivation for Java's stack-centered instruction set is that compilers usually use a stack-based architecture to pass an intermediate compiled form or the compiled program to a linker/optimizer. The Java class file, which is in many ways similar to the UNIX .o or Windows .obj file emitted by a C compiler, really represents an intermediate compiled form of a Java program. In the case of Java, the virtual machine serves as (dynamic) linker and may serve as optimizer. The stack-centered architecture of the Java virtual machine's instruction set facilitates the optimization that may be performed at run-time in conjunction with execution engines that perform just-in-time compiling or adaptive optimization.
As mentioned in Chapter 4, "Network Mobility," one major design consideration was class file compactness. Compactness is important because it facilitates speedy transmission of class files across networks. In the bytecodes stored in class files, all instructions--except two that deal with table jumping--are aligned on byte boundaries. The total number of opcodes is small enough so that opcodes occupy only one byte. This design strategy favors class file compactness possibly at the cost of some performance when the program runs. In some Java virtual machine implementations, especially those executing bytecodes in silicon, the single-byte opcode may preclude certain optimizations that could improve performance. Also, better performance may have been possible on some implementations if the bytecode streams were word-aligned instead of byte-aligned. (An implementation could always realign bytecode streams, or translate opcodes into a more efficient form as classes are loaded. Bytecodes are byte-aligned in the class file and in the specification of the abstract method area and execution engine. Concrete implementations can store the loaded bytecode streams any way they wish.)
Another goal that guided the design of the instruction set was the ability to do bytecode verification, especially all at once by a data flow analyzer. The verification capability is needed as part of Java's security framework. The ability to use a data flow analyzer on the bytecodes when they are loaded, rather than verifying each instruction as it is executed, facilitates execution speed. One way this design goal manifests itself in the instruction set is that most opcodes indicate the type they operate on.
For example, instead of simply having one instruction that pops a word from the operand stack and stores it in a local variable, the Java virtual machine's instruction set has two. One instruction, istore, pops and stores an int. The other instruction, fstore, pops and stores a float. Both of these instructions perform the exact same function when executed: they pop a word and store it. Distinguishing between popping and storing an int versus a float is important only to the verification process.
For many instructions, the virtual machine needs to know the types being operated on to know how to perform the operation. For example, the Java virtual machine supports two ways of adding two words together, yielding a one-word result. One addition treats the words as ints, the other as floats. The difference between these two instructions facilitates verification, but also tells the virtual machine whether it should perform integer or floating point arithmetic.
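The difference between the two additions can be imitated in Java source by treating the same pair of 32-bit words first as ints and then as floats. This is only an analogy for the iadd and fadd instructions, and the TypedAdd class is hypothetical.

```java
final class TypedAdd {
    // Treats both 32-bit words as ints, as the integer add instruction would.
    static int addAsInts(int wordA, int wordB) {
        return wordA + wordB;
    }

    // Treats the same two words as floats, as the float add instruction would,
    // and returns the result as a raw 32-bit word.
    static int addAsFloats(int wordA, int wordB) {
        float sum = Float.intBitsToFloat(wordA) + Float.intBitsToFloat(wordB);
        return Float.floatToRawIntBits(sum);
    }
}
```

Given the words for 1.5f and 2.25f, addAsFloats produces the word for 3.75f, while addAsInts produces a different bit pattern, which is why the virtual machine has to know which interpretation is intended.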
A few instructions operate on any type. The dup instruction, for example, duplicates the top word of a stack irrespective of its type. Some instructions, such as goto, don't operate on typed values. The majority of the instructions, however, operate on a specific type. The mnemonics for most of these "typed" instructions indicate their type by a single character prefix that starts their mnemonic. Table 5-2 shows the prefixes for the various types. A few instructions, such as arraylength or instanceof, don't include a prefix because their type is obvious. The arraylength opcode requires an array reference. The instanceof opcode requires an object reference.
Type        Code   Example   Description
byte        b      baload    load byte from array
short       s      saload    load short from array
int         i      iaload    load int from array
long        l      laload    load long from array
char        c      caload    load char from array
float       f      faload    load float from array
double      d      daload    load double from array
reference   a      aaload    load reference from array
Table 5-2. Type prefixes of bytecode mnemonics
Values on the operand stack must be used in a manner appropriate to their type. It is illegal, for example, to push four ints, then add them as if they were two longs. It is illegal to push a float value onto the operand stack from the local variables, then store it as an int in an array on the heap. It is illegal to push a double value from an object field on the heap, then store the topmost of its two words into the local variables as a value of type reference. The strict type rules that are enforced by Java compilers must also be enforced by Java virtual machine implementations.
Implementations must also observe rules when executing instructions that perform generic stack operations independent of type. As mentioned previously, the dup instruction pushes a copy of the top word of the stack, irrespective of type. This instruction can be used on any value that occupies one word: an int, float, reference, or returnAddress. It is illegal, however, to use dup when the top of the stack contains either a long or double, the data types that occupy two consecutive operand stack locations. A long or double sitting on the top of the operand stack can be duplicated in its entirety by the dup2 instruction, which pushes a copy of the top two words onto the operand stack. The generic instructions cannot be used to split up dual-word values.
To keep the instruction set small enough to enable each opcode to be represented by a single byte, not all operations are supported on all types. Most operations are not supported for types byte, short, and char. These types are converted to int when moved from the heap or method area to the stack frame. They are operated on as ints, then converted back to byte, short, or char before being stored back into the heap or method area.
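The effect of this widening is visible in Java source code. The following is a minimal sketch, not taken from the book: the class name NarrowTypeDemo and the method addBytes are illustrative. Adding two bytes yields an int, so the result needs an explicit cast before it can be stored as a byte again.

```java
// Illustrative sketch; NarrowTypeDemo and addBytes are hypothetical names.
class NarrowTypeDemo {
    // a + b is computed as an int because the instruction set has no byte add;
    // the cast narrows the int result back to 8 bits.
    static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        byte x = 100;
        byte y = 100;
        int wide = x + y;              // 200: the arithmetic is done on ints
        byte narrow = addBytes(x, y);  // -56: 200 truncated to a signed byte
        System.out.println(wide + " " + narrow);
    }
}
```

Without the cast in addBytes, the compiler rejects the return statement, because the type of a + b is int rather than byte.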
Table 5-3 shows the computation types that correspond to each storage type in the Java virtual machine. As used here, a storage type is the manner in which values of the type are represented on the heap. The storage type corresponds to the type of the variable in Java source code. A computation type is the manner in which the type is represented on the Java stack frame.
Storage Type  Minimum Bits in Heap or Method Area  Computation Type  Words in the Java Stack Frame
byte          8                                    int               1
short         16                                   int               1
int           32                                   int               1
long          64                                   long              2
char          16                                   int               1
float         32                                   float             1
double        64                                   double            2
reference     32                                   reference         1
Table 5-3. Storage and computation types inside the Java virtual machine
Implementations of the Java virtual machine must in some way ensure that values are operated on by instructions appropriate to their type. They can verify bytecodes up front as part of the class verification process, on the fly as the program executes, or some combination of both. Bytecode verification is described in more detail in Chapter 7, "The Lifetime of a Type." The entire instruction set is covered in detail in Chapters 10 through 20.
Execution Techniques
Various execution techniques that may be used by an implementation--interpreting, just-in-time compiling, adaptive optimization, native execution in silicon--were described in Chapter 1, "Introduction to Java's Architecture." The main point to remember about execution techniques is that an implementation can use any technique to execute bytecodes so long as it adheres to the semantics of the Java virtual machine instruction set.
One of the most interesting -- and speedy -- execution techniques is adaptive optimization. The adaptive optimization technique, which is used by several existing Java virtual machine implementations, including Sun's Hotspot virtual machine, borrows from techniques used by earlier virtual machine implementations. The original JVMs interpreted bytecodes one at a time. Second-generation JVMs added a JIT compiler, which compiles each method to native code upon first execution, then executes the native code. Thereafter, whenever the method is called, the native code is executed. Adaptive optimizers, taking advantage of information available only at run-time, attempt to combine bytecode interpretation and compilation to native in the way that will yield optimum performance.
An adaptive optimizing virtual machine begins by interpreting all code, but it monitors the execution of that code. Most programs spend 80 to 90 percent of their time executing 10 to 20 percent of the code. By monitoring the program execution, the virtual machine can figure out which methods represent the program's "hot spot" -- the 10 to 20 percent of the code that is executed 80 to 90 percent of the time.
When the adaptive optimizing virtual machine decides that a particular method is in the hot spot, it fires off a background thread that compiles those bytecodes to native and heavily optimizes the native code. Meanwhile, the program can still execute that method by interpreting its bytecodes. Because the program isn't held up and because the virtual machine is only compiling and optimizing the "hot spot" (perhaps 10 to 20 percent of the code), the virtual machine has more time than a traditional JIT to perform optimizations.
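The monitoring step can be pictured as a per-method invocation counter. The sketch below is a toy model only, under the assumption that "hot" simply means "invoked at least some fixed number of times". The class HotSpotSketch, its methods, and the threshold are hypothetical, and real virtual machines use far more elaborate profiling.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model; HotSpotSketch, recordInvocation, and isHot are hypothetical names.
class HotSpotSketch {
    private final int threshold;  // assumed cutoff for calling a method "hot"
    private final Map<String, Integer> counts = new HashMap<>();

    HotSpotSketch(int threshold) {
        this.threshold = threshold;
    }

    // Counts one invocation. Returns true exactly once per method: on the
    // call that reaches the threshold, which is when a VM might hand the
    // method to a background compiler thread.
    boolean recordInvocation(String method) {
        int n = counts.merge(method, 1, Integer::sum);
        return n == threshold;
    }

    // True once the method has been invoked at least threshold times.
    boolean isHot(String method) {
        return counts.getOrDefault(method, 0) >= threshold;
    }
}
```

In this model the interpreter would keep running a method until its compiled form is ready, which matches the description above of the program not being held up.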
The adaptive optimization approach yields a program in which the code that is executed 80 to 90 percent of the time is native code as heavily optimized as statically compiled C++, with a memory footprint not much bigger than a fully interpreted Java program. In other words, fast. An adaptive optimizing virtual machine can keep the old bytecodes around in case a method moves out of the hot spot. (The hot spot may move somewhat as the program executes.) If a method moves out of the hot spot, the virtual machine can discard the compiled code and revert back to interpreting that method's bytecodes.
As you may have noticed, an adaptive optimizer's approach to making Java programs run fast is similar to the approach programmers should take to improve a program's performance. An adaptive optimizing virtual machine, unlike a regular JIT compiling virtual machine, doesn't do "premature optimization." The adaptive optimizing virtual machine begins by interpreting bytecodes. As the program runs, the virtual machine "profiles" the program to find the program's "hot spot," that 10 to 20 percent of the code that gets executed 80 to 90 percent of the time. And like a good programmer, the adaptive optimizing virtual machine just focuses its optimization efforts on that time-critical code.
But there is a bit more to the adaptive optimization story. Adaptive optimizers can be tuned for the run-time characteristics of Java programs -- in particular, of "well-designed" Java programs. According to David Griswold, Hotspot manager at JavaSoft, "Java is a lot more object-oriented than C++. You can measure that; you can look at the rates of method invocations, dynamic dispatches, and such things. And the rates [for Java] are much higher than they are in C++." Now this high rate of method invocations and dynamic dispatches is especially true in a well-designed Java program, because one aspect of a well-designed Java program is highly factored, fine-grained design -- in other words, lots of compact, cohesive methods and compact, cohesive objects.
This run-time characteristic of Java programs, the high frequency of method invocations and dynamic dispatches, affects performance in two ways. First, there is an overhead associated with each dynamic dispatch. Second, and more significantly, method invocations reduce the effectiveness of compiler optimization.
Method invocations reduce the effectiveness of optimizers because optimizers don't perform well across method invocation boundaries. As a result, optimizers end up focusing on the code between method invocations. And the greater the method invocation frequency, the less code the optimizer has to work with between method invocations, and the less effective the optimization becomes.
The standard solution to this problem is inlining -- the copying of an invoked method's body directly into the body of the invoking method. Inlining eliminates method calls and gives the optimizer more code to work with. It makes possible more effective optimization at the cost of increasing the run-time memory footprint of the program.
The trouble is that inlining is harder with object-oriented languages, such as Java and C++, than with non-object-oriented languages, such as C, because object-oriented languages use dynamic dispatching. And the problem is worse in Java than in C++, because Java has a greater call frequency and a greater percentage of dynamic dispatches than C++.
A regular optimizing static compiler for a C program can inline straightforwardly because there is one function implementation for each function call. The trouble with doing inlining with object-oriented languages is that dynamic method dispatch means there may be multiple function (or method) implementations for any given function call. In other words, the JVM may have many different implementations of a method to choose from at run time, based on the class of the object on which the method is being invoked.
One solution to the problem of inlining a dynamically dispatched method call is to just inline all of the method implementations that may get selected at run-time. The trouble with this solution is that in cases where there are a lot of method implementations, the size of the optimized code can grow very large.
One advantage adaptive optimization has over static compilation is that, because it is happening at runtime, it can use information not available to a static compiler. For example, even though there may be 30 possible implementations that may get called for a particular method invocation, at run-time perhaps only two of them are ever called. The adaptive optimization approach enables only those two to be inlined, thereby minimizing the size of the optimized code.
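To make the idea concrete, here is a hand-written sketch of what inlining only the observed implementations could look like if it were expressed in Java source. It is an illustration under assumptions, not the output of any real optimizer: Shape, Circle, Square, and InliningSketch are hypothetical names, and a real virtual machine performs this transformation on compiled code rather than on source.

```java
// Illustrative only; all type and method names here are hypothetical.
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class InliningSketch {
    // What the programmer writes: one dynamically dispatched call per element.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    // Roughly what an optimizer could do after observing that only Circle and
    // Square reach this call site: the two method bodies are copied in behind
    // type checks, and any other class falls back to the ordinary call.
    static double totalAreaInlined(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            if (s instanceof Circle) {
                Circle c = (Circle) s;
                sum += Math.PI * c.r * c.r;
            } else if (s instanceof Square) {
                Square q = (Square) s;
                sum += q.side * q.side;
            } else {
                sum += s.area();  // unobserved implementation: dynamic dispatch
            }
        }
        return sum;
    }
}
```

Both methods return the same total. The second version has only two inlined bodies no matter how many Shape implementations exist, which is the code-size saving described above.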
Threads
The Java virtual machine specification defines a threading model that aims to facilitate implementation on a wide variety of architectures. One goal of the Java threading model is to enable implementation designers, where possible and appropriate, to use native threads. Alternatively, designers can implement a thread mechanism as part of their virtual machine implementation. One advantage to using native threads on a multi-processor host is that different threads of a Java application could run simultaneously on different processors.
One tradeoff of Java's threading model is that the specification of priorities is lowest-common-denominator. A Java thread can run at any one of ten priorities. Priority one is the lowest, and priority ten is the highest. If designers use native threads, they can map the ten Java priorities onto the native priorities however seems most appropriate. The Java virtual machine specification defines the behavior of threads at different priorities only by saying that all threads at the highest priority will get some CPU time. Threads at lower priorities are guaranteed to get CPU time only when all higher priority threads are blocked. Lower priority threads may get some CPU time when higher priority threads aren't blocked, but there are no guarantees.
The specification doesn't assume time-slicing between threads of different priorities, because not all architectures time-slice. (As used here, time-slicing means that all threads at all priorities will be guaranteed some CPU time, even when no threads are blocked.) Even among those architectures that do time-slice, the algorithms used to allot time slots to threads at various priorities can differ greatly.
As mentioned in Chapter 2, "Platform Independence," you must not rely on time-slicing for program correctness. You should use thread priorities only to give the Java virtual machine hints at what it should spend more time on. To coordinate the activities of multiple threads, you should use synchronization.
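In Java source, a priority is set with Thread.setPriority(), using values between Thread.MIN_PRIORITY (1) and Thread.MAX_PRIORITY (10). The sketch below uses a hypothetical class named PriorityHintDemo and treats the priority strictly as a hint. It relies on join(), not on the priority, to know when the worker has finished.

```java
// Illustrative sketch; PriorityHintDemo and runAtPriority are hypothetical names.
class PriorityHintDemo {
    // Starts a worker at the requested priority and returns the priority
    // the worker observed for itself.
    static int runAtPriority(int priority) {
        final int[] seen = new int[1];
        Thread worker = new Thread(() -> seen[0] = Thread.currentThread().getPriority());
        worker.setPriority(priority);  // a scheduling hint, not a guarantee
        worker.start();
        try {
            worker.join();  // coordination comes from join(), not from priority
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runAtPriority(Thread.MAX_PRIORITY));
    }
}
```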
The thread implementation of any Java virtual machine must support two aspects of synchronization: object locking and thread wait and notify. Object locking helps keep threads from interfering with one another while working independently on shared data. Thread wait and notify helps threads to cooperate with one another while working together toward some common goal. Running applications access the Java virtual machine's locking capabilities via the instruction set, and its wait and notify capabilities via the wait(), notify(), and notifyAll() methods of class Object. For more details, see Chapter 20, "Thread Synchronization."
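Both aspects can be seen in a small sketch of a one-slot mailbox. The names Mailbox and MailboxDemo are hypothetical. The synchronized methods use object locking to protect the slot, while wait() and notifyAll() let a producer and a consumer cooperate.

```java
// Illustrative sketch; Mailbox and MailboxDemo are hypothetical names.
class Mailbox {
    private String message;  // guarded by this object's lock

    // Blocks while the slot is full, then stores the message.
    synchronized void put(String m) throws InterruptedException {
        while (message != null) {
            wait();  // releases the lock while waiting
        }
        message = m;
        notifyAll();  // wake any thread waiting in take()
    }

    // Blocks while the slot is empty, then removes and returns the message.
    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait();
        }
        String m = message;
        message = null;
        notifyAll();  // wake any thread waiting in put()
        return m;
    }
}

class MailboxDemo {
    // Sends text from a producer thread to the calling thread.
    static String roundTrip(String text) {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                box.put(text);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String got = box.take();
            producer.join();
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The wait() calls sit inside while loops so that a woken thread rechecks its condition before proceeding.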
In the Java virtual machine specification, the behavior of Java threads is defined in terms of variables, a main memory, and working memories. Each Java virtual machine instance has a main memory, which contains all the program's variables: instance variables of objects, components of arrays, and class variables. Each thread has a working memory, in which the thread stores "working copies" of variables it uses or assigns. Local variables and parameters, because they are private to individual threads, can be logically seen as part of either the working memory or main memory.
The Java virtual machine specification defines many rules that govern the low-level interactions of threads with main memory. For example, one rule states that all operations on primitive types, except in some cases longs and doubles, are atomic. If two threads compete to write two different values to an int variable, even in the absence of synchronization, the variable will end up with one value or the other. The variable will not contain a corrupted value. In other words, one thread will win the competition and write its value to the variable first. The losing thread need not sulk, however, because it will write its value to the variable second, overwriting the "winning" thread's value.
The exception to this rule is any long or double variable that is not declared volatile. Rather than being treated as a single atomic 64-bit value, such variables may be treated by some implementations as two atomic 32-bit values. Storing a non-volatile long to memory, for example, could involve two 32-bit write operations. This non-atomic treatment of longs and doubles means that two threads competing to write two different values to a long or double variable can legally yield a corrupted result.
Although implementation designers are not required to treat operations involving non-volatile longs and doubles atomically, the Java virtual machine specification encourages them to do so anyway. This non-atomic treatment of longs and doubles is an exception to the general rule that operations on primitive types are atomic. This exception is intended to facilitate efficient implementation of the threading model on processors that don't provide efficient ways to transfer 64-bit values to and from memory. In the future, this exception may be eliminated. For the time being, however, Java programmers must be sure to synchronize access to shared longs and doubles.
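In source code, the two usual ways to protect a shared long look like the sketch below. The class SharedLongs and its field names are hypothetical. Declaring the field volatile makes each individual read and write atomic, and routing every access through synchronized methods also protects compound updates such as +=.

```java
// Illustrative sketch; SharedLongs and its members are hypothetical names.
class SharedLongs {
    // Option 1: volatile makes each single read or write of the long atomic.
    private volatile long lastTimestamp;

    // Option 2: a plain long that is only ever touched while holding the lock.
    private long total;

    void setLastTimestamp(long t) {
        lastTimestamp = t;
    }

    long getLastTimestamp() {
        return lastTimestamp;
    }

    // A read-modify-write needs the lock; volatile alone would not make
    // the += atomic.
    synchronized void add(long delta) {
        total += delta;
    }

    synchronized long getTotal() {
        return total;
    }
}
```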
Fundamentally, the rules governing low-level thread behavior specify when a thread may and when it must:
1. copy values of variables from the main memory to its working memory, and
2. write values from its working memory back into the main memory.
For certain conditions, the rules specify a precise and predictable order of memory reads and writes. For other conditions, however, the rules do not specify any order. The rules are designed to enable Java programmers to build multi-threaded programs that exhibit predictable behavior, while giving implementation designers some flexibility. This flexibility enables designers of Java virtual machine implementations to take advantage of standard hardware and software techniques that can improve the performance of multi-threaded applications.
The fundamental high-level implication of all the low-level rules that govern the behavior of threads is this: If access to certain variables isn't synchronized, threads are allowed to update those variables in main memory in any order. Without synchronization, your multi-threaded applications may exhibit surprising behavior on some Java virtual machine implementations. With proper use of synchronization, however, you can create multi-threaded Java applications that behave in a predictable way on any implementation of the Java virtual machine.
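As a small illustration, consider two threads incrementing a shared counter. In the sketch below the class SyncCounter is a hypothetical name. Every access goes through a synchronized method, so the final count is the same on any conforming implementation. If the synchronized keyword were removed, increments could be lost and the result would vary from run to run.

```java
// Illustrative sketch; SyncCounter and countWithTwoThreads are hypothetical names.
class SyncCounter {
    private int count;  // guarded by this object's lock

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each increment the counter perThread times
    // and returns the final count.
    static int countWithTwoThreads(int perThread) {
        SyncCounter counter = new SyncCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```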
Native Method Interface
Java virtual machine implementations aren't required to support any particular native method interface. Some implementations may support no native method interfaces at all. Others may support several, each geared towards a different purpose.
Sun's Java Native Interface, or JNI, is geared towards portability. JNI is designed so it can be supported by any implementation of the Java virtual machine, no matter what garbage collection technique or object representation the implementation uses. This in turn enables developers to link the same (JNI compatible) native method binaries to any JNI-supporting virtual machine implementation on a particular host platform.
Implementation designers can choose to create proprietary native method interfaces in addition to, or instead of, JNI. To achieve its portability, the JNI uses a lot of indirection through pointers to pointers and pointers to functions. To obtain the ultimate in performance, designers of an implementation may decide to offer their own low-level native method interface that is tied closely to the structure of their particular implementation. Designers could also decide to offer a higher-level native method interface than JNI, such as one that brings Java objects into a component software model.
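On the Java side, whichever native method interface is used, a native method is declared with the native modifier and has no body. The sketch below only declares such a method and inspects it through reflection. The class NativeDeclSketch and the method nativeAdd are hypothetical, and because no native library is loaded here, actually calling nativeAdd would fail with an UnsatisfiedLinkError.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch; NativeDeclSketch and nativeAdd are hypothetical names.
class NativeDeclSketch {
    // Declaration only. The implementation would live in a native library
    // that this sketch does not provide, so the method is never called.
    static native int nativeAdd(int a, int b);

    // Reports whether the named method of this class carries the native modifier.
    static boolean isDeclaredNative(String name) {
        for (Method m : NativeDeclSketch.class.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return Modifier.isNative(m.getModifiers());
            }
        }
        return false;
    }
}
```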
To do useful work, a native method must be able to interact to some degree with the internal state of the Java virtual machine instance. For example, a native method interface may allow native methods to do some or all of the following:
• Pass and return data
• Access instance variables or invoke methods in objects on the garbage-collected heap
• Access class variables or invoke class methods
• Access arrays
• Lock an object on the heap for exclusive use by the current thread
• Create new objects on the garbage-collected heap
• Load new classes
• Throw new exceptions
• Catch exceptions thrown by Java methods that the native method invoked
• Catch asynchronous exceptions thrown by the virtual machine
• Indicate to the garbage collector that it no longer needs to use a particular object
Designing a native method interface that offers these services can be complicated. The design needs to ensure that the garbage collector doesn't free any objects that are being used by native methods. If an implementation's garbage collector moves objects to keep heap fragmentation at a minimum, the native method interface design must make sure that either:
1. an object can be moved after its reference has been passed to a native method, or
2. any objects whose references have been passed to a native method are pinned until the native method returns or otherwise indicates it is done with the objects.
As you can see, native method interfaces are very intertwined with the inner workings of a Java virtual machine.
Chapter 5 of Inside the Java Virtual Machine
The Java Virtual Machine
by Bill Venners
The Real Machine
As mentioned at the beginning of this chapter, all the subsystems, runtime data areas, and internal behaviors defined by the Java virtual machine specification are abstract. Designers aren't required to organize their implementations around "real" components that map closely to the abstract components of the specification. The abstract internal components and behaviors are merely a vocabulary with which the specification defines the required external behavior of any Java virtual machine implementation.
In other words, an implementation can be anything on the inside, so long as it behaves like a Java virtual machine on the outside. Implementations must be able to recognize Java class files and must adhere to the semantics of the Java code the class files contain. But otherwise, anything goes. How bytecodes are executed, how the runtime data areas are organized, how garbage collection is accomplished, how threads are implemented, how the bootstrap class loader finds classes, what native method interfaces are supported -- these are some of the many decisions left to implementation designers.
The flexibility of the specification gives designers the freedom to tailor their implementations to fit their circumstances. In some implementations, minimizing usage of resources may be critical. In other implementations, where resources are plentiful, maximizing performance may be the one and only goal.
By clearly marking the line between the external behavior and the internal implementation of a Java virtual machine, the specification preserves compatibility among all implementations while promoting innovation. Designers are encouraged to apply their talents and creativity towards building ever-better Java virtual machines.
Eternal Math: A Simulation
The CD-ROM contains several simulation applets that serve as interactive illustrations for the material presented in this book. The applet shown in Figure 5-14 simulates a Java virtual machine executing a few bytecodes. You can run this applet by loading applets/EternalMath.html from the CD-ROM into any Java enabled web browser or applet viewer that supports JDK 1.0.
The instructions in the simulation represent the body of the doMathForever() method of class Act, shown previously in the "Instruction Set" section of this chapter. This simulation shows the local variables and operand stack of the current frame, the pc register, and the bytecodes in the method area. It also shows an optop register, which you can think of as part of the frame data of this particular implementation of the Java virtual machine. The optop register always points to one word beyond the top of the operand stack.
The applet has four buttons: Step, Reset, Run, and Stop. Each time you press the Step button, the Java virtual machine simulator will execute the instruction pointed to by the pc register. Initially, the pc register points to an iconst_0 instruction. The first time you press the Step button, therefore, the virtual machine will execute iconst_0. It will push a zero onto the stack and set the pc register to point to the next instruction to execute. Subsequent presses of the Step button will execute subsequent instructions and the pc register will lead the way. If you press the Run button, the simulation will continue with no further coaxing on your part until you press the Stop button. To start the simulation over, press the Reset button.
The value of each register (pc and optop) is shown two ways. The contents of each register, an integer offset from the beginning of either the method's bytecodes or the operand stack, is shown in an edit box. Also, a small arrow (either "pc>" or "optop>") indicates the location contained in the register.
In the simulation the operand stack is shown growing down the panel (up in memory offsets) as words are pushed onto it. The top of the stack recedes back up the panel as words are popped from it.
The doMathForever() method has only one local variable, i, which sits at array position zero. The first two instructions, iconst_0 and istore_0, initialize the local variable to zero. The next instruction, iinc, increments i by one. This instruction implements the i += 1 statement from doMathForever(). The next instruction, iload_0, pushes the value of the local variable onto the operand stack. iconst_2 pushes an int 2 onto the operand stack. imul pops the top two ints from the operand stack, multiplies them, and pushes the result. The istore_0 instruction pops the result of the multiply and puts it into the local variable. The previous four instructions implement the i *= 2 statement from doMathForever(). The last instruction, goto, sends the program counter back to the iinc instruction. The goto implements the for (;;) loop of doMathForever().
With enough patience and clicks of the Step button (or a long enough run of the Run button), you can get an arithmetic overflow. When the Java virtual machine encounters such a condition, it just truncates, as is shown by this simulation. It does not throw any exceptions.
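The truncation can be reproduced without the applet. The sketch below repeats the two statements of the loop body described above (i += 1 followed by i *= 2) but bounds the loop so that it terminates. The class EternalMathSketch and the method doMath are hypothetical names, not the book's code.

```java
// Illustrative sketch; EternalMathSketch and doMath are hypothetical names.
class EternalMathSketch {
    // Runs the loop body of doMathForever() a fixed number of times
    // and returns the resulting value of i.
    static int doMath(int iterations) {
        int i = 0;
        for (int n = 0; n < iterations; n++) {
            i += 1;  // iinc
            i *= 2;  // iload_0, iconst_2, imul, istore_0
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(doMath(30));  // 2147483646, one below Integer.MAX_VALUE
        System.out.println(doMath(31));  // -2: the multiply overflowed and was truncated
    }
}
```

After n iterations without overflow, i equals 2^(n+1) - 2, so the 31st iteration is the first to exceed the int range. The result wraps to -2 with no exception, and every later iteration yields -2 again.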
For each step of the simulation, a panel at the bottom of the applet contains an explanation of what the next instruction will do. Happy clicking.


Figure 5-14. The Eternal Math applet.
On the CD-ROM
The CD-ROM contains the source code examples from this chapter in the jvm directory. The Eternal Math applet is contained in a web page on the CD-ROM in file applets/EternalMath.html. The source code for this applet is found alongside its class files, in the applets/JVMSimulators and applets/JVMSimulators/COM/artima/jvmsim directories.
The Resources Page
For links to more information about the Java virtual machine, visit the resources page: http://www.artima.com/insidejvm/resources/