Microsoft Outlook 2003 “Compact” mail feature: a perfect example of a poor algorithm

Presumably “compacting” is meant to reduce fragmentation of the database and drop unused blocks, and thus avoid extra seeks. That’s a very good intention, but why such a poor implementation? It has been over 30 minutes now and my Outlook is still “compacting” a mailbox of just 1 GB: CPU usage is low, but there are lots and lots of disk seeks, which is the reason for such appallingly slow performance. This suggests that the algorithm behind this feature is very poor and should never have been used on an IO subsystem with poor random access performance, such as magnetic hard disks.

Our index building process deals with tens of terabytes of compressed data, so we designed it to be parallel and to avoid unnecessary disk seeks. It wasn’t easy. Defragmenting a small mailbox, by contrast, is a trivial matter that does not need advanced algorithms, yet it appears that whoever implemented that function at Microsoft did not know what they were doing. This compacting should have run in under 1 minute on my system. I can’t use email while it runs, so this function MUST be quick, because otherwise it wastes valuable business time.
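To make the point concrete, here is a minimal sketch of a seek-friendly compaction. It is not Outlook's actual algorithm or file format: the function name `compact` and the `live_blocks` list of (offset, length) pairs are assumptions made up for illustration. The idea is simply to read the live blocks in ascending offset order and append them to a new file, so that both the read and the write are close to sequential.

```python
def compact(src_path, dst_path, live_blocks, chunk_size=1 << 20):
    """Copy live blocks from src_path into dst_path, packed back to back.

    live_blocks is a hypothetical index: an iterable of (offset, length)
    pairs describing the regions of the source file still in use.
    Returns a dict mapping each old offset to its new offset.
    """
    remap = {}
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Sorting by offset means the read head only ever moves forward.
        for offset, length in sorted(live_blocks):
            remap[offset] = dst.tell()
            src.seek(offset)
            remaining = length
            while remaining > 0:
                chunk = src.read(min(chunk_size, remaining))
                if not chunk:
                    raise IOError("source file ended inside a live block")
                dst.write(chunk)  # appended, so the write is sequential too
                remaining -= len(chunk)
    return remap
```

With a layout like this, a 1 GB file is roughly one forward pass over the source plus one sequential write, after which the new file can replace the old one. Any index that points into the file would then be updated from the returned mapping.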

There is also no indication of progress during this operation. It seems obvious that any non-instant processing job should indicate the amount of work done and an ETA, so that the user can decide whether to cancel it or wait a bit longer.
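Reporting progress is not hard either. The sketch below is only an illustration, with a hypothetical `Progress` class that is not part of any Outlook API. It assumes the total amount of work is known up front (for example the file size in bytes) and estimates the ETA from the average rate so far.

```python
import time


class Progress:
    """Tracks work done and estimates time remaining (illustrative only)."""

    def __init__(self, total, clock=time.monotonic):
        self.total = total      # total units of work, e.g. bytes to process
        self.done = 0
        self.clock = clock      # injectable so the class is easy to test
        self.start = clock()

    def advance(self, n):
        """Record that n more units of work have been completed."""
        self.done += n

    def fraction(self):
        """Fraction of the work completed, between 0.0 and 1.0."""
        return self.done / self.total if self.total else 1.0

    def eta_seconds(self):
        """Estimated seconds remaining, or None if no estimate is possible yet."""
        elapsed = self.clock() - self.start
        if self.done == 0 or elapsed <= 0:
            return None
        rate = self.done / elapsed              # average units per second
        return (self.total - self.done) / rate
```

A long-running job would call `advance()` after each chunk and refresh the display from `fraction()` and `eta_seconds()`. An average-rate estimate is crude, but even that is enough for the user to decide between waiting and cancelling.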

By the time I finished this post it was still running, so I cancelled it and won’t use this feature again.
