Table compression

As I mentioned yesterday I was in London (in fact in the shadows of Westminster Abbey and the Houses of Parliament) today for the UK OUG BIRT SIG meeting. The venue was also shared by a UK OUG directors meeting so I got the opportunity to chat to Peter Robson over lunch. Oddly, we ended up talking about table compression... so that triggered the thought to write this.

Firstly, I am not going to write about the new 11g feature of Advanced Compression, party because I don't use Linux often and so do not have a suitable 11g box to play with, and partly because I see it as a mainly OLTP orientated feature - that perception may change when I start to play with the feature. I must also keep in the back of my mind that Advanced Compression is an added cost feature.

Oracle table compression does not work the same way as index compression: index compression is for multiple column indexes where repeating leading edge values are effectively replaced with the database equivalent of a ditto mark; on the other hand table compression is at database block level and reoccurring values in a block are replaced by tokens and a "look up" created in the block header. In a table block any reoccurring value is potentially compressible, it is not just the columns that form part of a key.

One of the main restrictions with compression is that you compress using a direct path write, that is CREATE TABLE AS, INSERT with the APPEND hint or one of the ALTER TABLE MOVE commands. That's not to say that a conventional insert or update can not be used on a compressed table, it can, but the row(s) involved won't be compressed.

But disk is cheap, so why bother compress? Well disk may be, but IO bandwidth isn't and what we are about here is moving as much data from disk as possible, the overhead in CPU time needed to unpack the data being less than than the saving from fetching fewer blocks in a table scan. It may be a different story for those whose main access path are indexed reads to single rows.

A final thought, with sensible data ordering it is quite possible to save substantial space on the disk (and mileage may vary here) I have had five-fold size reductions and have heard of 7-fold.