Re: [DotGNU]Could someone test this on all platforms ?


From: Rhys Weatherley
Subject: Re: [DotGNU]Could someone test this on all platforms ?
Date: Mon, 10 Feb 2003 00:20:27 +1000
User-agent: KMail/1.4.3

On Sunday 09 February 2003 09:58 pm, Paolo Molaro wrote:

> A quick google search turned up only pages that said the caffeinemark
> source code is not available. Do you have a URL handy?

The source is not available.  However, there are plenty of Java decompilers 
and disassemblers listed on Freshmeat.  That is what I used to determine what 
the CaffeineMark was doing.

> It seems caffeinemark has 9 or 11 tests, depending on the version, while
> pnetmark has only 5, so, even assuming caffeinemark is a valid
> benchmark, pnetmark is in no way comparable to it.

The other benchmarks relate to graphics, which weren't relevant at the time 
PNetMark was written.  They arguably still aren't: which toolkit would we 
test?  SWF?  Gtk#?  Qt#?  Wildly differing results can be expected with 
different toolkits.  Better to put graphical tests in a different bucket.

> Yes, some of the
> tests may require stuff not implemented by pnet (or mono, for that
> matter), but this simply means that you can't compare pnetmark with
> caffeinemark.

Comparing PNetMark/CLR scores to CaffeineMark/JVM scores will probably give 
bogus results; that's true.  But that doesn't negate the value of using a 
similar algorithm.

> There is one other thing that stands out in the pnetmark benchmarks: none
> of them check that the execution engine executed the code correctly. I think
> this is a requirement for a benchmark: a score is of no use if the code
> was executed incorrectly.

As I said before: feel free to submit patches if you detect a discrepancy.

> I never said it was slanted towards pnet; for all I know, a proper
> benchmark could give better results for pnet. The point is that,
> for example, the Magnification values need a rationale.

Go ask the CaffeineMark folks - I copied the magnification values directly 
from their code.  Presumably they had some justification for the values that 
they chose.  I certainly don't know what it is - I've been aware for quite 
some time that fudging the magnifications can be used to fudge the results, 
but I declined to play such tricks.
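
For readers who haven't seen how a magnification factor enters a score, here 
is a minimal sketch of a CaffeineMark-style scoring formula.  The names, the 
constant, and the exact formula are my own illustration, not code from 
PNetMark or CaffeineMark:

    // Sketch of a CaffeineMark-style score, assuming the score is
    // proportional to magnification * iterations / elapsed time.
    // The constant and names below are illustrative, not PNetMark's.
    using System;

    class ScoreSketch
    {
        const double Magnification = 3.0;  // made-up per-test factor

        static double Score(long iterations, double elapsedMillis)
        {
            // Inflating Magnification inflates the score for the same
            // measured time: that is how fudging it fudges the results.
            return Magnification * iterations / elapsedMillis;
        }

        static void Main()
        {
            Console.WriteLine(Score(100000, 250.0));
        }
    }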

> If someone does research on a representative corpus of code for the
> CLR and comes up with mostly the same magnification factors, I'll have
> no problem accepting the values. As things stand right now, there is no
> evidence that this research has been done.

Just today I ported the SciMark and Linpack benchmarks from Java to C# by 
doing nothing more than a syntax fixup (update coming soon).  Are you now 
going to insist that I write a PhD thesis to justify the results that are 
produced?  That's not my job.
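
For what it's worth, the "syntax fixup" in such ports is typically mechanical.  
The daxpy-style fragment below is my own illustration of the pattern, not a 
line from the actual port: the loop body is identical in Java and C#, and 
only the library calls are renamed.

    using System;

    class FixupSketch
    {
        static void Main()
        {
            // The Java original would differ only in the library names:
            //   System.out.println(...) becomes Console.WriteLine(...)
            //   Math.sqrt(...)          becomes Math.Sqrt(...)
            int n = 4;
            double a = 2.0;
            double[] x = { 1.0, 2.0, 3.0, 4.0 };
            double[] y = { 0.0, 0.0, 0.0, 0.0 };

            double sum = 0.0;
            for (int i = 0; i < n; i++)
            {
                y[i] += a * x[i];    // kernel: same text in both languages
                sum += y[i] * y[i];
            }
            Console.WriteLine("norm = " + Math.Sqrt(sum));
        }
    }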

Once again, if you detect a discrepancy in the port, then feel free to patch 
it.  But demanding that I provide proof of something that you know full well 
I cannot provide is a bit much.

> Almost on the same topic, consider the string benchmark: it
> only calls three functions:
>       StringBuilder::Append (string)
>       StringBuilder::ToString ()
>       String::IndexOf (string, int)
>
> I doubt anyone would consider those three functions representative
> of the performance of the CLR on strings.

*sigh* I ported the CaffeineMark.  I never claimed that it was an exhaustive 
test of all CLR capabilities.  The CaffeineMark is hardly an exhaustive test 
of all JVM capabilities.  Feel free to submit new benchmarks if you feel that 
the current ones are incomplete or unfair.

In any case, the String benchmark is not really testing strings at all.  It is 
testing memory allocation.  StringBuilder::Append hammers the garbage 
collector mercilessly, so the final result reflects allocation overhead 
instead of compute overhead.
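
To make that concrete, a loop exercising just those three calls might look 
like the sketch below.  This is my own minimal illustration, not the actual 
PNetMark code, and the iteration count is arbitrary:

    using System;
    using System.Text;

    class StringBenchSketch
    {
        static void Main()
        {
            // Repeated Append calls keep growing the builder's internal
            // buffer; that reallocation is where the GC pressure comes from.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10000; i++)
                sb.Append("some fragment ");

            // ToString allocates one more large string...
            string s = sb.ToString();

            // ...and IndexOf(string, int) then scans it.
            int pos = s.IndexOf("fragment", 0);
            Console.WriteLine(pos);
        }
    }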

I will be porting other benchmarks in the coming weeks.  Feel free to point 
out porting discrepancies.  But questions as to testing methodology and 
benchmark validity should be forwarded to the original authors.

Cheers,

Rhys.


