Too much of a good thing

There is a saying that you can't have too much of a good thing. I have my doubts; you can have too much chocolate, cream, beer, sun, or whatever. Perhaps a surfeit of something is a bad thing and as 'no longer good' the saying still holds true.

In data warehouses, having too many summaries is also a bad thing. The point of a summary is to reduce the I/O and computation needed to retrieve a result. But having summaries has two costs: space to store, and effort to maintain. If a summary is not used enough its existence is a waste.

Sometimes it is better not to have a summary that answers a high percentage of queries directly but to choose one that is slightly less aggregated, the extra work to complete the aggregation is often minimal and the lower level aggregate can potentially answer a lot more queries than the one that targeted 57% of queries. This is especially so for dimensional hierarchies that do not greatly reduce the number of members between levels; if you record sales by channel and you have two channels, web and store then not much effort is required to calculate total sales for the company from adding the two channel totals. In fact it is quite possible that both records would have been read in the same IO.

You can also have too many indexes but that is another story...