| simplas2002 (22) | |
|
Well, I tried Ruby (1.9.2), Vala, C, C++, Fortran 95 (gfortran and ifort), Java and C# (Mono and .NET) with DGEMM implementations (indices correctly established for each case) for matrices up to 2500x2500. C, C++ and Fortran came out **clearly** on top from around 1000x1000 up. Fortran is **still the king**, with runs around 10% faster. Vala also reaches the top after setting some specific flags. Java and C# are around 2-3x slower, and Ruby is orders of magnitude slower. This is also the conclusion of our friends at http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php For high-performance computing, compiled languages are obviously better. Ruby **is** convenient as long as there is no need to debug, the code is small and it doesn't perform any relevant calculations. Basically, C++ is the universal language. | |
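The code posted in this thread did not survive, so as a point of reference, here is a minimal sketch of the kind of plain O(n^3) kernel such a comparison usually times. It is an assumption about the general shape, not the poster's implementation: the function name `multiply_naive`, the flat row-major storage and the i-k-j loop order are all illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive O(n^3) product C = A * B for square n x n matrices stored row-major
// in flat vectors. No operations are skipped; the i-k-j loop order only
// makes the inner loop walk contiguous memory in both B and C.
std::vector<double> multiply_naive(const std::vector<double>& a,
                                   const std::vector<double>& b,
                                   std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<double> c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

Matching the loop order to the storage layout is what "indices correctly established for each case" most plausibly refers to: a column-major language such as Fortran would want the loops arranged the other way round.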
|
|
|
| rapidcoder (681) | |
|
There are three kinds of lies: a lie, a big lie, and a computer language shootout benchmark. Seriously, a benchmark with no context, no source code and no information on how it was run and measured is just a piece of rubbish. I can tell you, I saw C code that, after being translated to Ruby, ran faster than the original version. Without my quoting the source code, would you believe me? BTW: The shootout benchmarks are of very poor quality, and they benchmark only a single JVM, a single .NET implementation and a single C++ compiler, none of which is known as "the fastest you can get". E.g. here someone reran those benchmarks on Excelsior JET: http://www.stefankrause.net/wp/?p=9. Java and GCC were equal there within 5% in most benchmarks, and in one of them JET 6.4 significantly outperformed GCC. | |
|
|
|
| simplas2002 (22) | |||||||||||
|
Let me be direct: unless you can show some "wonder implementation", up to this point in time Java doesn't stand a chance. With the matrix multiplication algorithm without "tricks" like skipping operations, you cannot make a Java implementation faster than C++ or Fortran for matrices above 1000x1000. Java appears to be close to C and C++, but only for very small matrices. However, IFF you are actually calling BLAS or a similar library behind the scenes, then you can make those claims. Python fans say the same, with a LAPACK/BLAS library (ATLAS, GotoBLAS or MKL) doing the actual work. It is often very clear.
Commands and options:
gfortran (gcc 4.5): -O3
ifort (ifort 10.1): -O3
mcs (mono 2.8): -optimize+
.net (VS 2010 Express): optimize flag
ruby (1.9.2): (no options)
java: did it in eclipse months ago and it doesn't stand a chance.
Fortran95:
Csharp:
C++
Vala:
Ruby (forgot where I put my last code... so it is actually ripped-off and I suppose it is not really optimized):
Java: please insert yours. | |||||||||||
|
|
|||||||||||
| rapidcoder (681) | |
|
So you are benchmarking multiplication of zero-filled matrices and can't give exact compiler options. The -O3 option for GCC is also not the best you can set for this kind of benchmark. Holy cow! And you think we should trust YOU and not, e.g., scientists from IBM Research who benchmarked BLAS multiplication in Java and found it reaches 90% of the performance of optimised Fortran[1]? You seem to be a troll, sir. :D [1] http://www.research.ibm.com/ninja/ | |
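One reason the zero-filled input matters: with all-zero operands the product is all zeros whether or not the loops are correct, so a wrong or optimised-away computation goes unnoticed. A minimal sketch of the usual remedy follows, assuming reproducible non-zero inputs plus a checksum that is printed or compared after the timed run. The helper names `make_test_matrix` and `checksum`, and the value range, are hypothetical and not taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: builds an n x n row-major matrix of reproducible
// pseudo-random values drawn from [0.5, 1.5), so no element is zero.
std::vector<double> make_test_matrix(std::size_t n, unsigned seed) {
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> dist(0.5, 1.5);
    std::vector<double> m(n * n);
    for (double& x : m) {
        x = dist(rng);
    }
    return m;
}

// Hypothetical helper: sums all elements. Using the value after the timed
// region (printing it, comparing it across languages) keeps the result
// observable and gives a cheap cross-check between implementations.
double checksum(const std::vector<double>& m) {
    return std::accumulate(m.begin(), m.end(), 0.0);
}
```

Checksums from different languages will only agree up to floating-point rounding, and only if every implementation is fed the same input values, so a comparison would need a tolerance and a shared data file or generator.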
|
|
|
| chrisname (5896) | |
| I was just thinking, it doesn't really seem to be a fair test. And even if it was, it doesn't prove one is faster than the other, just that one is faster than the other in some specific case. | |
|
|
|
| rapidcoder (681) | |||||||
|
@chrisname: exactly! Anyway, just out of curiosity. Here it is:
Hardware: Intel Core 2 Duo 2.2 GHz T6670, DDR2 @ 800 MHz.
GCC: gcc.exe (TDM-2 mingw32) 4.4.1 Copyright (C) 2009 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Java: java version "1.6.0_22" Java(TM) SE Runtime Environment (build 1.6.0_22-b04) Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode, sharing)
GCC options: -O3 (left as in original benchmark, although I think it can be improved)
Java options: -server -XX:+AggressiveOpts
Results:
Java code output:
C++ code output (manually executed five times under code::blocks):
Don't tell simplas2002 where the "hack" is, if you see it :D If I use the same optimisation in the C++ code, then both versions run at the same speed (within ~3%). So no, simplas2002, you have to think harder to find a case where C++ outperforms Java by a factor of 2-3. Matrix multiplication is definitely not that case. | |||||||
|
|
|||||||
| simplas2002 (22) | |||||
|
Dear RapidCoder:
1) Concerning BLAS, as I mentioned before, it is different: if you are using GotoBLAS, for example, you can obtain extraordinary speed in any language. And you can wrap it for Ruby.
2) Test of Java and C++ (gcc 4.6.0): C++ is 42% faster (using Java as the denominator, see below).
3) Have you read the report from IBM? It has (Fig. 11) plain Java at 21.4 MFLOPS and Fortran at 119.6 MFLOPS.
How do you know who I am?
Your Java code:
The previous C++ code:
Machine: MacBook Pro 2.66. I can send you a screenshot if you want. | |||||
|
|
|||||
| rapidcoder (681) | |
|
Still flawed:
1. An experimental, alpha-quality version of GCC compared to a stable version of Java. Compare to Java 7 or some beta of Excelsior JET. How can you assume that GCC even generated correct code? You don't check it. Knowing the history of GCC, a miscompilation is quite probable.
2. Java for Mac is not officially supported.
3. Still not even a 2x difference (= negligible in most applications).
4. Such a microbenchmark shows only that, for that particular pointless task, the optimisations in this particular Java HotSpot VM for that particular architecture are not up to the optimisations in that particular experimental GCC[1]. So its practical value is near 0. Just like most shootout benchmarks.
On a different architecture (Intel Core 2 Duo), for which Java is officially supported and optimised, Java wins your benchmark by more than 25%. Which means that Java-the-language is not the limiting factor. The limiting factor is the compiler and its optimisation set. There is no compiler optimisation done in C++ that could not be done in a Java compiler. On the other hand, there are lots of optimisations that cannot be done in a C++ compiler but can be (and some are) employed in Java.
The final conclusion is: Java, C#, C and C++ are in the same league of performance. All are compiled, all are statically typed. Performance differences are just too small (and much more dependent on circumstances) to be a reason for choosing one over another.
[1] Assuming the good performance result of GCC is not caused by a bug in GCC that makes the generated code do something other than what is expected.
BTW: Another reason why Java can be considered to be in the same performance league as C++ is its showing in some high-performance-computing contests. E.g. this one: http://sortbenchmark.org/ | |
|
|
|
| helios (10126) | ||||
*Though I suspect you'd gladly accept such a benchmark if it gave Java the edge. | ||||
|
|
||||
| rapidcoder (681) | |||||
For the creator of the compiler - yes, it is not negligible. It is probably a bug in one of the compilers (some optimisations not applied correctly). For someone who decides which language to pick, and for the final user of the product, it is negligible, unless the product is an AAA game. Most of the C/C++ software out there could be significantly sped up by better algorithms, or hand-coded assembly, etc. But almost nobody cares, except some CS students who have just picked up C or C++ and want to show how smart they are. I've yet to see software (in any language) that really uses all the power of my computer. What I really care about is that there are no slowdowns of a factor of 1000, e.g. like there are in the recent OpenOffice release (written in C++), or hangups like in recent Firefox (if the download list is too long, it freezes for seconds; ofc, also in C++), or crashes like in KDE, or delays like in Eclipse 3.6 (caused not by GC, but by sloppy coding). It just has to have acceptable performance, not maximum performance.
I've already shown exactly the same code running significantly faster on Java@Core2Duo. But no, I would not accept it as evidence. It would show only that there is still room for improvement in compiler technology, on both sides, but particularly in VMs, which are younger than static compilers. But these are compiler implementations, not languages. Excelsior shows that Java can be compiled statically to code as efficient as C++ can be. (Actually I wanted to say that 64-bit Java on a MacBook is performance-wise crappy and that is all - they have to work harder. But GCC several years ago was also crappy and produced extremely suboptimal machine code - yet no-one claimed they had to write in assembly instead of C++ because of that. See: http://stackoverflow.com/questions/1834607/64-bit-java-vm-runs-app-10x-slower).
The only evidence he produced is that some alpha version of a C++ compiler outperformed Java on some niche hardware platform. But on the most popular hardware/OS platform, provided the suboptimalities in his code are corrected, there is a tie. On the other hand, I can show you some niche hardware platforms where you cannot write C++ code (because there is no native API and no publicly available compiler), but Java runs perfectly. Would it be any evidence that "Java is better"? No.
First define what "C++ is faster" means. If it means that the best C++ compilers sometimes produce better code than some not-so-good Java VMs, then yes, I agree. However, the opposite is also true. Want to compare VC++ with Excelsior JET? | |||||
|
|
|||||
| helios (10126) | ||||||
| ||||||
|
|
||||||
| rapidcoder (681) | ||||
I meant hardware + software. Java on Mac is not supported and the 64-bit desktop is still a niche. OK, hardware has been capable of 64 bits for a very long time, yet most people in my neighbourhood use 32-bit OSes. 64-bit Java is optimised mostly for Linux and Solaris, for big-iron servers. No-one uses PowerBooks for such applications, and 32-bit is still sufficient for the desktop, even if one installs a memory hog like Vista (oh, yes, also written in C and C++ - you see, programs don't automatically get faster and snappier only because they were written in these languages - the much more important factors are programmers, time and budget).
I have nowhere said it was C++'s fault. It seems you read just what you want to read, and not what is really written. Most software has bugs and is suboptimal because of poor programming, not the platform. Just analyse the differences between various implementations of the same benchmark in the great language shootout. These things affect overall performance much more than a 70% better compiler. If they compiled OpenOffice with a 70% better compiler, the problem would not go away. It would hang for one second instead of two whenever I edit text in a text frame in Impress - still unacceptable and irritating. In this case, discussing whether one compiler can get you a 2x speedup or slowdown is an academic dispute.
I'm not invalidating __the benchmark__. I'm invalidating the conclusions he has drawn from his benchmark. Notice that I'm not claiming Java to be 30% faster than C++, as came out of my benchmark. I've seen lots of such benchmarks, and most of them, when made by both good Java and good C++ programmers, resulted in performance differences < 20% in __both__ directions. Especially when it comes to large practical applications, not some artificial microbenchmarks doing nothing. So any statements that Java/C# and other languages using a similar execution model are not suitable for high-performance applications are simply uninformed FUD. BTW: Something that sometimes produces wrong machine code is alpha-quality: http://archives.free.net.ph/message/20100705.062720.5d776623.pt-BR.html | ||||
|
|
||||
| taylorc8 (40) | |||
|
Programming is fun, do it all the time. Truthfully in my experience I've never used an application that I thought performed exceptionally well on my hardware--except for two video games, the first being Assassin's Creed II, and the second being Mass Effect 2. Both astounded me by how smooth they ran. That's my opinion, and I'm tired of these crap benchmarks. One last thing, in this code:
Am I mistaken, or is it actually acceptable to leak memory for a benchmark? | |||
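The snippet being asked about was lost from this page, so the following is only a sketch of how the question can be sidestepped, not a correction of the original code. It assumes the matrices were allocated with raw `new` and never freed; the class name `Matrix` and its interface are hypothetical. Backing the storage with `std::vector` means the memory is released automatically when the object goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical square-matrix wrapper. The std::vector member owns the
// storage, so no explicit delete[] is needed and nothing leaks.
class Matrix {
public:
    explicit Matrix(std::size_t n) : n_(n), data_(n * n, 0.0) {}

    // Element access with a debug-time bounds check (row-major layout).
    double& at(std::size_t row, std::size_t col) {
        assert(row < n_ && col < n_);
        return data_[row * n_ + col];
    }
    double at(std::size_t row, std::size_t col) const {
        assert(row < n_ && col < n_);
        return data_[row * n_ + col];
    }

    std::size_t size() const { return n_; }

private:
    std::size_t n_;
    std::vector<double> data_;
};
```

For a short-lived benchmark process the operating system reclaims the memory at exit either way, so a leak would not change the timing; the point of the wrapper is only that the leak is easy to avoid.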
|
|
|||
| simplas2002 (22) | |||
|
Yes, I agree it was a *niche system*. And I also agree that GCC is not the absolute best C++ compiler available. With this in mind, I found out that "...Java always initializes arrays when you create them..". That of course explains why the performance, although worse than C++, is still better than expected. On Linux you're right, results are worse. However, in C++ we can also use standard arrays and initialize them... Please see the C++ code similar to the Java one:
Let us now try a less niche system and 2 C++ compilers:
Intel Core(TM)2 Duo CPU T9600 @ 2.80 GHz, Linux Ubuntu 10.10 Maverick, Kernel Linux 2.6.35-23-generic-pae
g++: real 0m25.125s user 0m24.818s sys 0m0.288s
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 Copyright (C) 2010 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
icc: real 0m22.100s user 0m21.989s sys 0m0.092s
icc (ICC) 11.1 20100414 Copyright (C) 1985-2010 Intel Corporation. All rights reserved.
JAVA: Elapsed: 28.087s Elapsed: 27.169s Elapsed: 28.4s Elapsed: 27.781s Elapsed: 28.048s
java version "1.6.0_20" OpenJDK Runtime Environment (IcedTea6 1.9.1) (6b20-1.9.1-1ubuntu3) OpenJDK Server VM (build 17.0-b16, mixed mode)
The conclusion is that JAVA is slower than C++, whether using gcc 4.6, gcc 4.4.5 or icc 11.1, on Linux or Mac. **For this specific test** | |||
|
|
|||
| Grey Wolf (3172) | |
|
OMG! Please shut the fuck up! Edit: If you want to continue this futile discussion, take it to the Lounge. If you want a real challenge for your speed tests, implement a deterministic algorithm for primality and test the following number: 1645742825183467619487091143114742813111578265693364754500359 0347255103912018989569978272825902654731502101121217503326338 9300898343647261648503777405076868923505466907255695313192782 3038824414275347022148320732189275923 | |
|
|
|
| simplas2002 (22) | |
|
Not a real challenge. Not prime and it took 1 sec.: PrimeQ[1645742825183467619487091143114742813111578265693364754500359 0347255103912018989569978272825902654731502101121217503326338 9300898343647261648503777405076868923505466907255695313192782 3038824414275347022148320732189275923] false | |
|
|
|
| Grey Wolf (3172) | |
| Hmm...I don't see that you have implemented a deterministic algorithm for primality, all I see is a dubious use of Mathematica. | |
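To make the distinction concrete, here is a minimal sketch of a deterministic primality test, written as plain trial division over 64-bit integers. It is an illustration of what "deterministic" means in the challenge (the answer never depends on random witnesses), not a way to attack the number quoted above: that number has roughly 200 digits, so it would need arbitrary-precision arithmetic and a far better algorithm. The function name `is_prime_trial_division` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Deterministic primality test by trial division, valid for any 64-bit n.
// Runs in O(sqrt(n)) divisions, so it is only practical for small inputs.
bool is_prime_trial_division(std::uint64_t n) {
    if (n < 2) return false;
    if (n < 4) return true;  // 2 and 3 are prime
    if (n % 2 == 0 || n % 3 == 0) return false;
    // Every remaining prime candidate has the form 6k - 1 or 6k + 1.
    // The bound i <= n / i avoids overflow that i * i <= n could cause.
    for (std::uint64_t i = 5; i <= n / i; i += 6) {
        if (n % i == 0 || n % (i + 2) == 0) return false;
    }
    return true;
}
```

A computer algebra system's built-in test may combine probabilistic checks internally, which is presumably why the one-line Mathematica call was not accepted as an answer.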
|
|
|
| Bazzy (6281) | |||
| |||
|
|
|||
| Grey Wolf (3172) | |
| :0) | |
|
|
|
| rapidcoder (681) | ||
|
Nice. BTW:
Array initialization in this microbenchmark is negligible, because it is O(n^2), while the multiplication algorithm is O(n^3). Additionally, the statement is not generally true. Some Java compilers optimize array initialization away if they are sure it is safe. Sun's JVM probably does not (it could do much better in many cases). The performance differences in this code probably come from:
- loop unrolling
- vectorisation (SIMD instructions)
- array bounds checking (the JVM has to "think" hard here to eliminate it; a C++ compiler throws this responsibility at the programmer)
Finally, now that you have made more checks (though testing just a single JVM is still a flaw in this test; you should certainly check Excelsior JET), it seems that Java is not 2-3x slower for this test, but about 25% slower than icc and 12% slower than gcc 4.4.5. | ||
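The two numerical points in this post can be checked with a few lines of arithmetic. The sketch below assumes three n x n matrices are initialised (3n^2 writes) against n^3 multiply-adds in the product, and takes the wall-clock figures quoted earlier in the thread for g++ (25.125 s), icc (22.100 s) and the five Java runs. The function names are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Ratio of initialisation work (3 matrices of n*n elements) to the n^3
// multiply-adds of the naive product. Simplifies to 3 / n.
double init_to_multiply_ratio(double n) {
    assert(n > 0.0);
    return (3.0 * n * n) / (n * n * n);
}

// Arithmetic mean of a non-empty list of timings, in seconds.
double mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    return std::accumulate(xs.begin(), xs.end(), 0.0) /
           static_cast<double>(xs.size());
}

// How much slower `seconds` is than `reference_seconds`, in percent.
double percent_slower(double seconds, double reference_seconds) {
    assert(reference_seconds > 0.0);
    return (seconds / reference_seconds - 1.0) * 100.0;
}
```

For n = 2500, the largest size mentioned in the thread, the ratio is 3/2500 = 0.0012, so initialisation is about a tenth of a percent of the work. The five Java timings average 27.897 s, which is roughly 26% slower than icc and 11% slower than g++, in line with the rough figures given above. The C++ times are `real` times for the whole process while the Java figures are reported as "Elapsed", so the two may not measure quite the same interval.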
|
|
||