Exa-ctly

Sitting about half a flying day from San Francisco, OOW and, for that matter, my colleagues gives me the opportunity to mull over Larry's keynote address, which has been so widely reported online.

I would love to say my prediction of what was going to be announced was completely right. In the main it was, although some of my conclusions on how it might be done were a little adrift.

For strange people like me, people who see the world as moving large amounts of data around, it was exciting news. For me, data retrieval and storage are bulk processes and need to be achieved in a way that does not swamp the capacity of that weak link, IO bandwidth. Historically, we had the tools of right-sizing (that is, making sure we have the "correct" number and size of disks, the right number of controllers, the right disk connect technology, etc.) and reducing the data volume to transfer (through pre-aggregated summary tables, partitioning, indexing and, now, table compression). The missing component, and one used by some DW appliance vendors such as Netezza, is pushing parts of the query out to the data. In effect this is a technique to reduce the amount of information that has to be moved a distance, which is good for both IO and, ultimately, processing power: it is a lot easier to sort and manipulate a small data set. I say "moved a distance" because inevitably the same amount of data has to be read off disk and processed on the storage units; it is just that unneeded data does not migrate along a potentially slower link only to be discarded at the database. In a way this is not unlike the childhood card game "Happy Families": I ask the storage unit "Do you have Mr Bunn, the Baker?" and am given Mr Bunn, rather than being handed all the cards and told to look for myself.
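To make the "Happy Families" point concrete, here is a toy sketch in Python. It is not how any real storage product works; the `StorageUnit` class, its counters and the family names other than Bunn are all made up for illustration. It only shows the idea that the same number of rows is read off disk either way, while far fewer cross the link when the filter runs at the storage unit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class StorageUnit:
    """Hypothetical storage unit: holds rows and counts what it reads and ships."""
    rows: List[Row]
    rows_read: int = 0      # rows read off "disk" inside the unit
    rows_shipped: int = 0   # rows sent over the link to the database

    def scan_all(self) -> List[Row]:
        # No pushdown: every row is read and every row crosses the link.
        self.rows_read += len(self.rows)
        self.rows_shipped += len(self.rows)
        return list(self.rows)

    def scan_where(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Pushdown: every row is still read, but only matches cross the link.
        self.rows_read += len(self.rows)
        matches = [row for row in self.rows if predicate(row)]
        self.rows_shipped += len(matches)
        return matches


def build_deck() -> List[Row]:
    # Illustrative deck: 4 families x 4 members = 16 cards.
    families = ["Bunn", "Pots", "Dose", "Chip"]
    members = ["Mr", "Mrs", "Master", "Miss"]
    return [{"member": m, "family": f} for f in families for m in members]


def wanted(row: Row) -> bool:
    # "Do you have Mr Bunn, the Baker?"
    return row["member"] == "Mr" and row["family"] == "Bunn"


# Hand over all the cards and let the database look for itself.
plain = StorageUnit(build_deck())
found_plain = [row for row in plain.scan_all() if wanted(row)]

# Ask the storage unit the question and receive only the answer.
pushed = StorageUnit(build_deck())
found_pushed = pushed.scan_where(wanted)

print("no pushdown:", plain.rows_read, "read,", plain.rows_shipped, "shipped")
print("pushdown:   ", pushed.rows_read, "read,", pushed.rows_shipped, "shipped")
```

Both approaches find the same card and read all 16 rows, but the first ships 16 rows across the link and the second ships one.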

Of course there will be a lot of questions that come up, such as: does it work with star transformations? (I would guess yes, as the results of the bitmap combines can theoretically be pushed to the storage units.) Does it support Oracle OLAP? Maybe not, but then again, is that important if we can materialize a relational cube to do something similar?

For now I am excited but, as with anything that is hardware based, saddened that I can't just download it and give it a whirl.