Performance testing

While developing it (and introducing small changes) I was continuously testing the performance of these 4 implementations. It was very disappointing that every run gave different results. Unfortunately my PC does many things (Skype, Steam, virus scan), so every test runs under slightly different conditions.
I could run those tests for a long time and take averages, but averages also have a disadvantage - one poor run can ruin the whole result.
I decided to run the tests for a long time (actually 2 days of continuous work) and take the best results. Assuming that in an ideal world I should run performance tests on a machine doing nothing else (100% of the CPU available to the test), taking the best results is as close as I can get, and one poor run (when Skype decided to take 70% of CPU for no reason) won't spoil the whole test.
Some 'bests' were a little bit surprising though, so I used 'averages' to confirm them.
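
The difference between 'best' and 'average' can be sketched in a few lines. This is only an illustration of the idea, not the harness used for the graphs on this page: the helper names (time_once, summarize, benchmark) and the sample timings are made up for the example.

```python
import statistics
import time


def time_once(func, *args):
    """Return the wall-clock seconds taken by a single call to func."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def summarize(samples):
    """Reduce a list of timings to the 'best' (minimum) and the mean."""
    return {"best": min(samples), "mean": statistics.mean(samples)}


def benchmark(func, *args, repeats=20):
    """Time func repeatedly and summarize the collected timings."""
    return summarize([time_once(func, *args) for _ in range(repeats)])


# Hypothetical timings in seconds: nine quiet runs and one disturbed run
# (for example, another process grabbing the CPU in the middle of a test).
samples = [0.100] * 9 + [0.400]
summary = summarize(samples)
```

With these made-up numbers the single disturbed run moves the mean to about 0.13 s, while the best time stays at 0.100 s. That is why the minimum is the more stable figure on a busy machine, and why the mean is still useful as a cross-check.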

Note: LZ4 itself comes in 2 flavours - 32-bit and 64-bit. One was supposed to be run on x86 and the other on x64 architecture. But I decided to implement both for both architectures. And I'm glad I did (see below).

So finally we have 4 different approaches (Mixed, C++/CLI, Unsafe, Safe), all of them in two flavours (32-bit and 64-bit), and all of them can be run on x86 and x64, giving 16 different combinations.
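
For reference, the 16 combinations can be enumerated as below. The label format (for example "Mixed 64bit/x86", meaning the 64-bit algorithm running in an x86 process) is only my assumption, chosen to match the way combinations are named in the lists further down.

```python
from itertools import product

implementations = ["Mixed", "C++/CLI", "Unsafe", "Safe"]
flavours = ["32bit", "64bit"]   # which variant of the algorithm is used
platforms = ["x86", "x64"]      # which architecture the process runs on

# 4 implementations x 2 flavours x 2 platforms
combinations = [
    f"{impl} {flavour}/{platform}"
    for impl, flavour, platform in product(implementations, flavours, platforms)
]
```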

compare-max-nolz4sharp.png

As a picture is worth 1024 words:

Compression

compare-max-encode.png

Things you probably expected:
  • Mixed Mode: is the best (32bit/x86 and 64bit/x64)
  • Mixed Mode: is a little bit better in 64bit mode (64bit/x64)
  • Safe: is the slowest one (no kidding)
However, there are things that surprised me:
  • Mixed Mode: (not a surprise to me, but requires explanation) The big dip on 64bit/x86 is probably due to the fact that BitScanForward64 is not available on x86, so it falls back to De Bruijn
  • C++/CLI: I don't know why C++/CLI is slower than my C# Unsafe, but it is (all the way across the graph) (did I improve the algorithm a little, or is the C++/CLI compiler much worse than the C# compiler?)
  • Safe: 32bit/x64 is (surprisingly) faster than 64bit/x64 (so after switching the platform to 64-bit it's worth staying with the 32-bit algorithm)
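
To show what a De Bruijn fallback for BitScanForward64 involves, here is a minimal sketch of the general technique. It is not taken from the LZ4 sources: the constant is a commonly published 64-bit De Bruijn sequence (not necessarily the one LZ4 uses), and the function names are made up for the example. The last helper reflects my understanding of why a compressor wants a bit scan at all (counting how many leading bytes of two blocks match), so treat it as an assumption too.

```python
MASK64 = (1 << 64) - 1

# A commonly published 64-bit De Bruijn sequence; every 6-bit window of it
# is unique, so it can serve as a perfect hash for single-bit values.
DEBRUIJN64 = 0x03F79D71B4CB0A89

# Build the lookup table: multiplying by (1 << bit) shifts the sequence left,
# leaving a distinct 6-bit pattern in the top bits for every bit position.
TABLE = [0] * 64
for bit in range(64):
    TABLE[(((1 << bit) * DEBRUIJN64) & MASK64) >> 58] = bit


def bit_scan_forward64(value):
    """Return the index (0..63) of the lowest set bit of a non-zero 64-bit value."""
    value &= MASK64
    if value == 0:
        raise ValueError("value must be non-zero")
    lowest = value & -value  # isolate the lowest set bit
    return TABLE[((lowest * DEBRUIJN64) & MASK64) >> 58]


def common_prefix_bytes(a, b):
    """Count the leading bytes shared by two different 8-byte blocks."""
    diff = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    # The lowest set bit of the XOR marks the first differing byte.
    return bit_scan_forward64(diff) >> 3
```

The hardware instruction does the scan directly, while this version needs a 64-bit multiplication plus a table lookup. A 64-bit multiplication is itself more expensive in a 32-bit process, which fits the dip seen for 64bit/x86.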

Decompression

compare-max-decode.png

This is something I wasn't expecting. Every single algorithm behaves a little bit differently.
  • Mixed Mode: 64bit/x64 is actually a little bit slower than 32bit/x86
  • C++/CLI: this time C++/CLI is better than Unsafe when it matters (32bit/x86 and 64bit/x64)
  • Unsafe: 32bit/x64 beats 64bit/x64 and 64bit/x86 beats 32bit/x86 (again: on a 64-bit platform the 32-bit algorithm is better, on a 32-bit platform the 64-bit algorithm is better - that's an anomaly)
  • Safe: on both platforms (x86 and x64) 64 bit decompression is faster than its 32-bit counterpart (so whatever platform you are on use 64 bit decompressor)

I confirmed all the anomalies with average values:
compare-avg.png


Last edited Jan 29, 2013 at 10:23 AM by Krashan, version 5
