Archive for June, 2008

Stay away from float/double in C#

Monday, June 30th, 2008

I have stayed away from floating point variables for almost two decades, mainly on performance grounds: back in the old days not every CPU even had a dedicated floating point unit. Today I decided to use floats in a couple of places and an odd issue quickly came up. Here is the test case:

float fGross=1.18f;
float fTax=0.18f;

Console.WriteLine("Gross: {0}, Tax: {1}, Net of tax: {2}", fGross, fTax, fGross - fTax);

Nothing fussy here: two fairly small numbers, the only operation is a subtraction, and we only use 2 digits after the decimal point, so a float should handle it, right? Wrong. The printout on screen will be:

Gross: 1.18, Tax: 0.18, Net of tax: 0.9999999

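rather than the expected result of “Net of tax: 1”. The cause is that neither 1.18 nor 0.18 has an exact binary representation, so each float holds only the nearest approximation and the subtraction exposes the difference.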

Using double variables, with the values correctly written as double literals (1.18d, not 1.18f, because a float literal assigned to a double carries the float’s error along with it), seems to work for this test case. However, this example shows that staying away from floats is not a bad idea: only use them if you absolutely have to, and if you do, don’t trust the results. The next time I use these data types will be in the next decade, if not later.
Some relevant discussion on this topic is here.
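
To make the test case above easier to poke at, here is a minimal sketch that repeats the subtraction in float and, as an assumption not made in the original test, in decimal, which stores base-10 digits exactly. The class and method names (NetOfTaxDemo, NetFloat, NetDecimal) are made up for illustration, and the exact digits printed for the float depend on the runtime version, so no output is quoted.

```csharp
using System;

static class NetOfTaxDemo
{
    // Subtraction in single precision; the cast forces the result back to float.
    public static float NetFloat(float gross, float tax)
    {
        return (float)(gross - tax);
    }

    // The same subtraction in decimal, which keeps 1.18 and 0.18 exact.
    public static decimal NetDecimal(decimal gross, decimal tax)
    {
        return gross - tax;
    }

    static void Main()
    {
        float netFloat = NetFloat(1.18f, 0.18f);
        decimal netDecimal = NetDecimal(1.18m, 0.18m);

        // "R" asks for a round-trippable string, exposing the stored value.
        Console.WriteLine("float:   {0}", netFloat.ToString("R"));
        Console.WriteLine("decimal: {0}", netDecimal);
        Console.WriteLine("float equals 1?   {0}", netFloat == 1.0f);
        Console.WriteLine("decimal equals 1? {0}", netDecimal == 1m);
    }
}
```

Comparing against 1 rather than reading the printed digits avoids being fooled by the default formatting, which can round an inexact value so that it looks right.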

Improving .NET garbage collection on multi-core setups: gcserver option

Saturday, June 7th, 2008

These days programmers have to deal with setups that contain multiple cores, so coding in a way that takes advantage of the extra parallel processors is becoming a matter of life and death. At Majestic-12 we have a small framework that allows us to parallelise long-running tasks, but from time to time we run into strange things, one of which happened again today. Our application was processing data on 8 cores, about 8 TB of data in fact, so heavy IO could be expected to make the processors wait for data to crunch. However, it turned out that the application was running slower than it should have been, and disk IO could not be to blame. Have a look at the CPU usage history below:

CPU usage with default settings in a .NET application

CPU usage was roughly 60-63%, well below what it should have been. To cut a long story short, it turned out that adding the following option to the .NET application configuration file helped:
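
The configuration fragment itself is not shown at this point, so what follows is only a sketch of what such a fragment typically looks like. It assumes the standard runtime section of an application configuration file (app.config), with gcServer switched on and gcConcurrent switched off, matching the settings described in the rest of this post.

```xml
<configuration>
  <runtime>
    <!-- assumed values: server GC on, concurrent GC off -->
    <gcServer enabled="true"/>
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>
```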

According to Microsoft, gcServer can help in multi-CPU setups, and it does: just have a look at the CPU usage of the same application with it enabled:

CPU usage with gcserver set to true

This brought CPU usage up to about 93-95% per CPU, which is about where it should have been in the first place when allowing for a fair amount of disk reads.

But what exactly happens here? It appears that the default garbage collection mode stops ALL threads while collecting garbage, which was effectively pausing processing on multiple cores. You can see another option above, gcConcurrent. In theory this should make garbage collection even faster, however I found it buggy in .NET 2.0, so I’d recommend being very careful when enabling this option. I keep it turned off.
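
Since a configuration setting is easy to get wrong without noticing, it can help to ask the runtime which mode it actually picked. A minimal sketch using System.Runtime.GCSettings.IsServerGC follows; the GcModeReport class and its Describe method are hypothetical names used only for this example.

```csharp
using System;
using System.Runtime;

static class GcModeReport
{
    // Builds a one-line description of the GC mode the runtime is using.
    public static string Describe()
    {
        string flavour = GCSettings.IsServerGC ? "server" : "workstation";
        return string.Format("GC mode: {0}, logical processors: {1}",
                             flavour, Environment.ProcessorCount);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Logging this line once at startup is a cheap way to confirm that the configuration file was picked up on the machine where the job really runs.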

Finally, since I started talking about memory and garbage collection in .NET: the biggest lesson of all was to reuse memory. This is the best way to avoid overheads in parallel processing as well as to save on actual memory usage, and it is the key to high performance processing in .NET (and really in any other language that uses garbage collection).
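
One way to reuse memory is to keep buffers in a small pool instead of allocating a new array for every chunk of data. The sketch below is an illustration under assumptions, not the framework mentioned earlier: BufferPool, Rent and Return are hypothetical names, and the 64 KB buffer size is arbitrary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of .NET.
sealed class BufferPool
{
    private readonly Stack<byte[]> _free = new Stack<byte[]>();
    private readonly int _size;

    public BufferPool(int size) { _size = size; }

    // Hands out a previously returned buffer if one is available,
    // otherwise allocates a new one.
    public byte[] Rent()
    {
        lock (_free)
        {
            return _free.Count > 0 ? _free.Pop() : new byte[_size];
        }
    }

    // Puts a buffer back so that a later Rent() can reuse it.
    // Contents are not cleared; callers must overwrite before reading.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != _size)
            throw new ArgumentException("buffer does not belong to this pool");
        lock (_free)
        {
            _free.Push(buffer);
        }
    }
}

static class BufferPoolDemo
{
    static void Main()
    {
        BufferPool pool = new BufferPool(64 * 1024);
        for (int chunk = 0; chunk < 1000; chunk++)
        {
            byte[] buffer = pool.Rent();
            try
            {
                buffer[0] = (byte)chunk; // stand-in for real processing
            }
            finally
            {
                pool.Return(buffer);
            }
        }
    }
}
```

In this single-threaded loop only the first Rent() allocates; every later iteration gets the same array back, so the garbage collector has nothing new to clean up. With several worker threads the pool grows to roughly one buffer per thread and then stops allocating.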