Added dimensions

June 8th, 2005 by Peter Scott

Some dimensions in a data warehouse are easy to understand… for a retail DW, stock might be composed of three dimensions: when, where and what; sales on the other hand, might also have a customer dimension (if customers are known, say, in a loyalty scheme or by delivery address) It would seem hard to come up with any more dimensions, but sometimes we can.

Take ‘product’. Now that’s simple… or is it?

A product can have many attributes: pack size, storage (frozen, ambient, etc), shelf-life, supplier, buyer, and product brand to name but a few. We could index some of these attributes and come up with queries to find total sales for June 1, 2005 for all products supplied by Acme Products. But in some data warehouses these attributes become dimensions in their own right. One data warehouse I manage has 4 product attributes that have been implemented as dimensions. At first this seems overkill, we could simply make a bitmap join index and query against supplier instead of product code for example. But this would loose the flexibility in roll-up that the implementing supplier as a dimension gives us. Take a real supplier, say Procter and Gamble. They produce a large number of products: laundry detergents, cosmetics, hygiene products, snack foods etc. Many, many products across a large number of market sectors. In a DW it would be quite reasonable to roll-up product to product category (say washing powders) and compare the sales of the products supplied by one supplier against those of another. Furthermore it could be reasonable to introduce a hierarchy to supplier. A company that makes pet food and chocolate bars might supply them through different companies. It would be quite reasonable to have a ‘master company’ tier in the supplier hierarchy that enabled buyers to negotiate ever larger discounts. Without implementing supplier as a dimension such queries are forced to access the lowest common level of data.

The fact that users want to simultaneously query product and supplier data and for different levels of roll-up is a compelling reason to convert a product attribute to a dimension in its own right.

Comments

  1. Mark Says:

    Pete, just thought I’d let you know that I read your postings on dimensional modelling, and I think they’re pretty good actually. Just incase you thought you might be talking to yourself.

    BTW - speaking of dimensional modelling, have you ever come across Ralph Kimball’s design tips section - http://www.ralphkimball.com/html/designtips.html

    Some good stuff on taking the dimensional model and extending it to accomodate additional forms of analysis, or unusual source data, whilst still keeping to the kimball ethos.

    cheers

    Mark

  2. Pete_S Says:

    Thanks, Mark.
    Statcounter tells me that people find my blog, but it can’t tell me opinions. It is good to get feedback on the content (and postive feedback is even better!)

    BTW, did you see the stuff I posted in May on dimensions?

    I will take a look a the URL you gave me