Added dimensions

Some dimensions in a data warehouse are easy to understand… for a retail DW, stock might be composed of three dimensions: when, where and what; sales on the other hand, might also have a customer dimension (if customers are known, say, in a loyalty scheme or by delivery address) It would seem hard to come up with any more dimensions, but sometimes we can.

Take ‘product’. Now that’s simple… or is it?

A product can have many attributes: pack size, storage (frozen, ambient, etc), shelf-life, supplier, buyer, and product brand to name but a few. We could index some of these attributes and come up with queries to find total sales for June 1, 2005 for all products supplied by Acme Products. But in some data warehouses these attributes become dimensions in their own right. One data warehouse I manage has 4 product attributes that have been implemented as dimensions. At first this seems overkill, we could simply make a bitmap join index and query against supplier instead of product code for example. But this would loose the flexibility in roll-up that the implementing supplier as a dimension gives us. Take a real supplier, say Procter and Gamble. They produce a large number of products: laundry detergents, cosmetics, hygiene products, snack foods etc. Many, many products across a large number of market sectors. In a DW it would be quite reasonable to roll-up product to product category (say washing powders) and compare the sales of the products supplied by one supplier against those of another. Furthermore it could be reasonable to introduce a hierarchy to supplier. A company that makes pet food and chocolate bars might supply them through different companies. It would be quite reasonable to have a ‘master company’ tier in the supplier hierarchy that enabled buyers to negotiate ever larger discounts. Without implementing supplier as a dimension such queries are forced to access the lowest common level of data.

The fact that users want to simultaneously query product and supplier data and for different levels of roll-up is a compelling reason to convert a product attribute to a dimension in its own right.