Dates and times comment
May 6th, 2008 by Peter Scott
A few days ago, Farooq posted some comments on an old post of mine. The final comment in our exchange concerned storing activities in our data warehouse by time (actual hour and minute). This is quite a common requirement, especially where the performance of customer service agents is being measured (as in Farooq’s question)
I would be tempted to split the event date and time into two separate dimensions a ‘date’ dimension with a granularity of day and a ‘time’ dimension that has sufficient resolution to capture the events of interest; there is no date component in the time dimension, for example events that occurred at 09:18:00 on May 6 2008 will share the same time dimension key as events that occurred at 09:18:00 on February 5 2008. This approach keeps the time dimension compact, we only need to key the possible time values that occur in a single day, and more importantly it opens up the possibility of simple time range queries aggregating over many days.

May 6th, 2008 at 11:17 am
I think that I’d be tempted to go for three columns for that — a date-without-time dimension, a time-without-date dimension (I wonder how that ought to be stored? Seconds since midnight?), and a degenerate date+time dimension (no separate dimension table). If you have a situation where you want to find/count/analyze all of the events within a certain time range (eg. 09:18:00 on February 5 2008 plus or minus 24 hours) you have some awkward hoops to jump through without that third element.
May 6th, 2008 at 12:19 pm
We normally model the time-only dimension (without date) as seconds since midnight…it seems to be the most usual thing to do since this ways you have unique IDs which follow more or less the same logics of date-only unique ID creation.
May 6th, 2008 at 3:59 pm
@David
I’d might go with that - The original post had an “updateable” fact - we would have columns for various status change date / times (so we can simply “subtract” two columns to get say, the interval between call open and call close) - and a degenerate dimensions for each of three or four status columns would certainly allow simple math… lots of columns on the fact table, but hey, disk is cheap!
May 7th, 2008 at 8:48 am
I’d suggest a read of Ralph Kimball’s design tip 51. He suggests a date-time stamp in the fact table rather than a separate time dimension. Seems to be a pragmatic approach.
http://www.kimballgroup.com/html/designtipsPDF/KimballDT51LatestThinking.pdf
May 7th, 2008 at 9:58 am
@Simon
A good link - and one that is in effect the same as David’s comment.
It is also good to go to the bottom of design tip where they mention that a time dimension [no date component]is very useful for organisations with a concept of time banding (which hours are busiest with our customer service agents etc)
May 8th, 2008 at 8:49 am
Peter,
fully agree with you. Apart from memory considerations and time range queries, I see two advantages over Kimball’s approach: a separate minute/second based time dimension eases handling of summer time changes where you might have double entries between 2 and 3 o’clock. And, you might implement levels like quarter hour, hour or day=total in this dimension thus enabling the load of preaggregated data.
May 9th, 2008 at 8:14 am
@Markus
Can you please elaborate on how the Time dimension would aid Summer Time change, thanks.
@Peter
I agree, IF you have the requirement for banding or other descriptive text about time then a time dimension is useful.
May 14th, 2008 at 12:19 pm
Just an FYI… Kimball used to recommend the seconds since midnight approach… but he changed over time to the TIMESTAMP column in the fact approach. I prefer the later… in my experience, end users have always hated the seconds since midnight approach, but the caveat concerning banding is a reasonable requirement that may require the older approach. The latest version of the Data Warehouse Lifecycle Toolkit explains the reasons for changing his default approach.
May 28th, 2008 at 9:03 pm
Thank you Peter for bringing this topic.
I implemented date-time in fact table. When I saw another article on “What Not to Do by Kimball” I thought my approach is not 100% perfect.
Mistake 12: Place text attributes in a fact table if you mean to use them as the basis of constraining and grouping (by Kimball)
http://www.intelligententerprise.com//011024/416warehouse1_1.jhtml