Tuesday, April 15, 2008

Java is faster than C++


Here is a graph comparing Java and C++ performance on various algorithms. It shows that Java is faster in most cases.
Even though these benchmarks show that Java is faster, Java programs still appear to be slow.
This is mainly because large applications are deployed as jar files, and when the application starts, the Java classloader has to extract each class from the compressed jar file, verify it, and then load it. Once a class is loaded, it will actually be faster than C code.
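
To get a rough feel for the one-time cost of loading a class, here is a minimal sketch that times the first lookup of a class against a second lookup of the same class. The class name ClassLoadTiming and the helper timeLoad are made up for this example, and the class being loaded is an arbitrary pick from the standard library that is assumed not to be loaded yet. The numbers will vary from machine to machine, so treat it as an illustration and not a benchmark.

```java
public class ClassLoadTiming {

    // Hypothetical helper: returns the nanoseconds spent in one Class.forName lookup.
    static long timeLoad(String className) {
        long start = System.nanoTime();
        try {
            Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not found: " + className, e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Assumption: this class has not been loaded before main runs.
        String name = "java.util.concurrent.ConcurrentSkipListMap";

        long first = timeLoad(name);   // may include finding, verifying and loading the class
        long second = timeLoad(name);  // the class is already loaded at this point

        System.out.println("First lookup:  " + first + " ns");
        System.out.println("Second lookup: " + second + " ns");
    }
}
```

On most machines the first lookup should take noticeably longer than the second, which is the startup cost described above.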

Compile Java code directly to native code and run without a JRE

Yes, you heard it right. With a compiler called GCJ, you can compile your Java code directly to native code that runs directly on the underlying machine.
GCJ comes with native implementations of the standard Java 1.4 libraries and some of the Java 1.5 libraries.
GCJ is still in the development stage and will improve in the future.
You can check out GCJ at http://gcc.gnu.org/java/
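
If you want to try it, here is a small sketch of a program you could feed to GCJ. The class name HelloNative and the output file name hello are just picked for this example. The commands in the comment are the usual way of invoking gcj, assuming it is installed and on your path, so check the documentation for your version.

```java
// A tiny program to try native compilation with.
//
// Assumed commands (gcj must be installed):
//   gcj --main=HelloNative -o hello HelloNative.java
//   ./hello
//
// --main names the class whose main method becomes the program entry point,
// and -o names the native executable that gets produced.
public class HelloNative {

    // Kept in its own method so the message is easy to check.
    static String greeting() {
        return "Hello from native code";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

The same source file still compiles and runs with the normal javac and java commands, so you can compare the two.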

Personally, I don't know why the GCJ developers are wasting time on this, because studies show that Java code matches or exceeds native C code performance in many different tests.

Saturday, March 22, 2008

Data Mining and Machine Learning

Data mining is nothing but extracting useful and meaningful information from a huge collection of data.
Many organizations collect large volumes of data relating to their business. This data is not organized, but it holds a lot of information. It is usually collected from the day-to-day business activities of the organization.
A lot of meaningful information can be extracted from this data, such as the sales growth pattern, the demographic pattern, etc.
There are many techniques, manual and automatic, for extracting meaningful information from this data. These techniques are collectively called data mining. The manual technique involves an expert in the domain concerned and good data mining software that can be used to drill down into the data. Automated techniques use machine learning tools to automatically identify meaningful patterns in the data and then apply these patterns to predict future data movement.
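
As a very small illustration of the automated approach, here is a sketch that finds a sales growth pattern by fitting a straight line to past sales with least squares and then uses that line to predict a later period. This is only one simple technique picked for the example, not what any particular data mining tool does. The class name SalesTrend, its methods, and the sales figures are all made up.

```java
public class SalesTrend {

    // Fits y = slope * x + intercept by ordinary least squares,
    // where x is the period index 0, 1, 2, ... and y is the sales value.
    // Returns {slope, intercept}.
    static double[] fitLine(double[] sales) {
        int n = sales.length;
        if (n < 2) {
            throw new IllegalArgumentException("Need at least two periods of data");
        }
        double meanX = (n - 1) / 2.0;   // mean of 0..n-1
        double meanY = 0;
        for (double v : sales) {
            meanY += v;
        }
        meanY /= n;

        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (i - meanX) * (sales[i] - meanY);
            sxx += (i - meanX) * (i - meanX);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }

    // Applies the fitted pattern to a period index, which may lie in the future.
    static double predict(double[] sales, int period) {
        double[] line = fitLine(sales);
        return line[0] * period + line[1];
    }

    public static void main(String[] args) {
        // Made-up monthly sales figures, for illustration only.
        double[] monthlySales = {120, 135, 149, 160, 178, 190};
        double next = predict(monthlySales, monthlySales.length);
        System.out.println("Predicted sales for the next month: " + next);
    }
}
```

Real sales data is rarely this well behaved, which is why the tools used in practice rely on more elaborate models, but the idea is the same: learn a pattern from the past data and apply it to the future.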