Not such good ideas

One of our consultants dropped by my desk to ask me some Oracle questions around data life cycle management. One of his customers is moving a store based system to a central system and like a lot of systems in the UK needs to keep 7 years of transactional data available. I explained all of the standard stuff: partitioning for ease of maintenance, use of lower spec disk (slower) for the old stuff you don't need often, table compression, indexing techniques.

He came back a couple days later to run past me the ideas from the customer's external consultant. The consultant proposed that the data was partitioned by geographic region. Although this idea sounds like it has merit, after all most queries will concern single stores, it rang alarm bells in my mind.

  • There is a need to keep a rolling 7 year history, old data needs to be removed, new added. Maintaining partitions through delete and insert operations is not efficient. Dropping the oldest partition and adding a new one is very efficient, especially if we partition exchange to populate
  • Not using time as part of the partition key prevents access to features such as readonly tablespaces
  • Although each store is today allocated to a region, who's to say that next month the business won't reorganise and drop a couple of regions and reallocate the stores affected to the remaining regions. It is not sensible let data move between partitions. Ideally, a partition key should be immutable and region clearly is not
It would seem that the consultant took the positive attributes of partitioning: ease of DBA management, improvement of ETL process and enhanced query performance through partition elimination and decided that the first two were unimportant. I think that manageability and batch is very important and query performance might be better addressed through indexing. But if partition elimination is desirable based on store I would go with either list or hash sub-partitions on store_id, and leave the region well alone