C vs. Java in number crunching
Introduction
Java being the most popular programming language of application programming is usually thought to be a good (usually the best) choice for any task. Six or seven months ago one of my instructor Dr. Hoermann, who is a real expert in random number generation techniques, ask me to reimplement one of his earlier C projects in Java by using the merits of object oriented design. I won’t dwell into the details of the project but shortly it was an object oriented implementation of a set of universal random number generators.
Initially, I did not suspect about the performance of Java because I previously use some data mining tools, simulators implemented by Java especially by some universities. Such that Weka – one of the most popular Java data mining applications—implemented by New Zealand University . So I start to design a model. You know those things. Interfaces, classes, so and so for…
Finally, design and implementation were completed and it seems that our new generator was solely doing pretty well. I use the generators to estimate some stock options and to do some
Since the generator I implemented requires too much theory of statistics and probability theory, I won’t deal with it. But rather I will be implementing a pretty simple PI estimator using
Some rough idea on Monte Carlo
O! many a shaft at random sent
Finds mark at the archer little meant!
And many a word, at random spoken.
May soothe or wound a heart that’s broken!
Sir Walter Scott, the Lord of the Isles
A
Believe or not this will be in proportion with the square of your umbrellas’ radius. Are the umbrellas magic? What if some guy squirt water on you by using garden trunk on you while your umbrellas are in your hand? This time the ratio of droplets will not be in proportion with r square. Why? Umbrellas, water, kid, parent are all the same. So what is the problem? The key word in here is uniformity. Think some you will realize what the problem isJ
So this much mix-up in your mind about
Performance
If you look close enough, you will see that as we multiply the number of replications by 4, the CI shrinks only by half.
All those in mind, let me show you the performance of Java and C with respect to number of replications (rainy days):
From now on I will not talk any more but one thing. Notice that the response time graphics of Java and C are both linear. This is good that our algorithm is scaleable. But isn’t it nice that C’s response time slope is one eighth of the Java. J You may ask or better may search for the reasons of this result: Intermediate machine code, virtual machine, overhead of class hierarchy, garbage collector…
That’s all for today... See you in the future!
Source Codes
Java
import java.util.Random;
import java.math.*;
public class MonteCarlo {
static final int iter = 100000;
static final int max_rep = 5000;
/*I aware of the fact that such public variable declarations are bad for OO design.But for the sake of performance…*/
public static int in_region[];
public static Random r;
public static void EstimatePI(int rep) {
double x, y;
int i, j;
double mean = 0, var = 0, hl;
for (j = 0; j < rep; j++) {
in_region[j]=0;
for (i = 0; i < iter; i++) {
x = 2 * r.nextDouble() - 1;
y = 2 * r.nextDouble() - 1;
if (x * x + y * y <= 1) {
in_region[j]++;
}
}
mean += in_region[j] * 1.0 / rep;
}
if (rep > 1) {
for (i = 0; i < rep; i++) {
var += (in_region[i] - mean) * (in_region[i] - mean) / (rep - 1);
}
}
mean = 4 * mean / iter;
hl = 7.84 * Math.sqrt(var/rep) / iter;
System.out.println("PI estimation is " + mean + " with half-length " + hl);
}
public static void main(String[] args) {
MonteCarlo mc = new MonteCarlo();
mc.r=new Random();
mc.in_region = new int[mc.max_rep];
long start;
for (int r = 1; r <= mc.max_rep; r *= 4) {
start = System.currentTimeMillis();
MonteCarlo.EstimatePI(r);
System.out.println("Response time is " + (System.currentTimeMillis() - start)/1000.0 + " sec.");
System.out.println();
}
}
}
C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#define ITER 100000
#define MAX_REP 5000
int in_region[MAX_REP];
void EstimatePI(int rep){
int i,j;
double mean=0,var=0,hl,x,y;
srand(time(NULL));
for (j = 0; j < rep; j++) {
in_region[j]=0;
for (i = 0; i < ITER; i++) {
x = 2.0 * rand()/RAND_MAX - 1;
y = 2.0 * rand()/RAND_MAX - 1;
if (x * x + y * y <= 1) {
in_region[j]++;
}
}
mean += in_region[j] * 1.0 / rep;
}
if (rep > 1) {
for (i = 0; i < rep; i++) {
var += (in_region[i] - mean) * (in_region[i] - mean) / (rep - 1);
}
}
mean = 4 * mean / ITER;
hl = 7.84 * sqrt(var/rep) / ITER;
printf("PI estimation is %7.5lf with half-length %7.5lf\n",mean,hl);
}
int main(int argc, char* argv[]){
int r;
unsigned int start;
srand(time(NULL));
for (r = 1; r <= MAX_REP; r *= 4) {
start = time(NULL);
EstimatePI(r);
printf("Response time is %d sec.\n\n",time(NULL) - start);
}
return 0;
}
2 Comments:
akademik bir çalışma olmuş bu :) Ama C nin Java 'ya bu anlamda üstünlüğünü hemen herkes kabül ediyor fakat geliştirme kolaylığı, jvm ile ortam bağımzıszlığı ve nesneye yönelim özellikleri nedeniyle kebapçı geliştiriciler java yı tercih ediyorlar. öncelik performans olduğunda c ve c++ alternatifsiz, bu yüzden ki compute intensive işler söz konusun olduğunda plsql native compile ediyoruz ve tcell provisioning api lerini atos c++ ile yazıyor gibi.
yakışır bir başlangıç, devamını bekliyorum özellikle performans çalışmaların benim ilgimi çok çekiyor :)
Best answers to those has been given by Mr. Stroustrup:
Much of the relative simplicity of Java is - like for most new languages - partly an illusion and partly a function of its incompleteness. As time passes, Java will grow significantly in size and complexity. It will double or triple in size and grow implementation-dependent extensions or libraries. That is the way every commercially successful language has developed. Just look at any language you consider successful on a large scale. I know of no exceptions, and there are good reasons for this phenomenon. [I wrote this before 2000; now see a preview of Java 1.5.]
Java isn't platform independent; it is a platform. Like Windows, it is a proprietary commercial platform. That is, you can write programs for Windows/Intel or Java/JVM, and in each case you are writing code for a platform owned by a single corporation and tweaked for the commercial benefit of that corporation. It has been pointed out that you can write programs in any language for the JVM and associated operating systems facilities. However, the JVM, etc., are heavily biased in favor of Java. It is nowhere near being a general reasonably language-neutral VM/OS.
For more please refer to http://www.research.att.com/~bs/bs_faq.html
Post a Comment
<< Home