Oracle Partitioning 2007

I see that Oracle have posted a series of white papers on the soon-to-arrive 11g release; the one on partitioning caught my eye.

But first, an aside: does anyone actually use slower storage for less-accessed data? The theory sounds good: put the once-in-a-blue-moon stuff on slower, cheaper disk and the "use all of the time" data on high-grade, expensive disk or even solid-state disk... but in practice don't people just slap the data onto their 100TB corporate SAN? When was the last time you saw someone buy a couple-of-hundred-GB JBOD array of SATA disks when they already have the space on a corporate NAS/SAN?

I have always liked Oracle's implementation of partitioning: it is flexible and transparent to queries, that is, we don't need special syntax to query partitioned data. I could do most of the things I wanted to in the 9.2 and 10g releases, but some things required a little creativity. This time around there will be some nice new features, and some should even appeal to the OLTP community. Significantly, the number of options for the partition key has increased, and for composite partitioning (subpartitions) almost any combination goes, though I am not sure that HASH-HASH would be a useful feature for many people. The new partitioning scheme types are:

  • Interval partitions: an auto-extending variant on range partitions; if new data arrives for a non-existent partition, the partition is created automatically. This could be a great feature for people who are new to partitioning and have not already developed code to create partitions as part of the data load process; however, if you are using some form of rolling data design in a data warehouse you will still need to write code to drop aged-out partitions (see the sketch after this list).
  • Ref partitions: here the child table inherits its partitioning strategy from a parent table, as referenced by a foreign key. This may not be quite as useful as it sounds in a DW situation where items can change parents (type 1 slowly changing dimensions), but it could be good in OLTP work where we have master/detail table pairs (such as order header / order line) and could partition both tables on order_date without having to denormalise the detail table to contain the order date just to act as a partition key (again, sketched below).
  • Virtual column partitioning: this allows functions on column data to define the partition key. For example, some organisations generate codes with semantic meaning; say the first three characters of a product code are the supplier code, then we could partition our product table by supplier by creating a virtual column on the first three characters of the product code and using it as the partition key (also sketched below).
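
To make the interval idea concrete, here is the sort of DDL I would expect it to involve; this is only a sketch against a made-up sales table, and the final 11g syntax may well differ:

    -- Hypothetical sales table: range partitioned by month on sale_date,
    -- with new monthly partitions created automatically as data arrives.
    CREATE TABLE sales
    ( sale_id    NUMBER
    , sale_date  DATE
    , amount     NUMBER
    )
    PARTITION BY RANGE (sale_date)
    INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
    ( PARTITION p_before_2007 VALUES LESS THAN (DATE '2007-01-01')
    );

    -- Inserting a row for a month with no existing partition should
    -- create that partition on the fly, with no pre-built DDL needed.
    INSERT INTO sales VALUES (1, DATE '2007-06-15', 99.50);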
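
For the order header / order line pair mentioned above, reference partitioning might look something like this; again just a sketch with made-up table names, the point being that the detail table inherits the parent's range partitions through the foreign key and needs no order_date column of its own:

    -- Parent: range partitioned on order_date.
    CREATE TABLE order_header
    ( order_id    NUMBER PRIMARY KEY
    , order_date  DATE NOT NULL
    )
    PARTITION BY RANGE (order_date)
    ( PARTITION p_2006 VALUES LESS THAN (DATE '2007-01-01')
    , PARTITION p_2007 VALUES LESS THAN (DATE '2008-01-01')
    );

    -- Child: partitioned by reference to the foreign key constraint,
    -- so each order line lands in the same partition as its header
    -- without carrying a denormalised order_date column.
    CREATE TABLE order_line
    ( line_id   NUMBER
    , order_id  NUMBER NOT NULL
    , qty       NUMBER
    , CONSTRAINT fk_line_header FOREIGN KEY (order_id)
        REFERENCES order_header (order_id)
    )
    PARTITION BY REFERENCE (fk_line_header);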
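
And for the supplier-prefix example, a virtual column sketch; the product table and the three-character supplier prefix are assumptions for illustration:

    -- The supplier code is derived from the product code by a virtual
    -- column, which can then be used as a (list) partition key.
    CREATE TABLE product
    ( product_code   VARCHAR2(20)
    , description    VARCHAR2(200)
    , supplier_code  VARCHAR2(3)
        GENERATED ALWAYS AS (SUBSTR(product_code, 1, 3)) VIRTUAL
    )
    PARTITION BY LIST (supplier_code)
    ( PARTITION p_abc   VALUES ('ABC')
    , PARTITION p_xyz   VALUES ('XYZ')
    , PARTITION p_other VALUES (DEFAULT)
    );
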
Second aside: I am not a great lover of hash partitions. Partitioning should be a tool to improve both manageability and performance, but to my mind hash partitions do little to boost performance in DW systems where we have many 'multi partition key' queries and end up hitting almost all of the hash partitions; although we might still get partition-wise joins and some parallelism gains, we just don't get the partition pruning.
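
To illustrate the partition-wise join point, the classic case is two tables hash partitioned on the join key with the same number of partitions; a sketch with made-up tables, and note that a query which does not filter on the hash key still has to visit every partition:

    -- Both tables hash partitioned on the join key (cust_id) with the
    -- same number of partitions, enabling a full partition-wise join.
    CREATE TABLE customers
    ( cust_id    NUMBER PRIMARY KEY
    , cust_name  VARCHAR2(100)
    )
    PARTITION BY HASH (cust_id) PARTITIONS 8;

    CREATE TABLE orders
    ( order_id  NUMBER
    , cust_id   NUMBER NOT NULL
    , amount    NUMBER
    )
    PARTITION BY HASH (cust_id) PARTITIONS 8;

    -- The join can proceed one matching partition pair at a time, in
    -- parallel, but nothing here is pruned away.
    SELECT /*+ PARALLEL(o, 8) PARALLEL(c, 8) */ c.cust_name, SUM(o.amount)
    FROM   orders o JOIN customers c ON c.cust_id = o.cust_id
    GROUP  BY c.cust_name;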