Over the last year, I've been speaking at conferences on one subject more than any other: Agile Data Warehousing with Exadata and OBIEE. Although I've been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take up. So I'll use the next few blog posts to make my case for what I like to call Extreme BI: an Agile approach to data warehousing using the combination of Extreme Performance and Extreme Metadata.
In a standard data warehouse implementation, whether we are walking in the Inmon or Kimball camps, some portion of our data model will be dimensional in nature: a star schema with facts and dimensions. So let me pose a question, which I think will lend itself well to diving into the Extreme BI discussion: Why do we build dimensional models? The first reason is simplicity. We want to model our reporting structures in a way that makes sense to the business user. The standard OLTP data model that takes two of the four walls in the conference room to display is just never going to make sense to your average business user. At the end of a logical modeling exercise, I expect the end-user to have a look at a completed dimensional model and say: "Yep... that's our business alright". The second reason we build dimensional models is performance. Denormalizing highly complex transactional models into simplified star schemas generally produces tremendous performance gains.
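To make the "facts and dimensions" idea concrete, here is a minimal sketch of a star schema using Python's built-in `sqlite3` module. The table names, columns, and sample rows (`fact_sales`, `dim_customer`, `dim_product`) are hypothetical and chosen only for illustration; a real warehouse model would be far larger and would not live in SQLite.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions.

    All table names, columns, and rows are hypothetical illustration data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            region        TEXT
        );
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category     TEXT
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            quantity     INTEGER,
            amount       REAL
        );
        INSERT INTO dim_customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
        INSERT INTO dim_product  VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
        INSERT INTO fact_sales   VALUES (1, 1, 10, 100.0), (1, 2, 5, 75.0), (2, 1, 2, 20.0);
    """)
    return conn


def sales_by_region(conn):
    """Answer a business question with a single fact-to-dimension join."""
    return conn.execute("""
        SELECT c.region, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_key = f.customer_key
        GROUP BY c.region
        ORDER BY c.region
    """).fetchall()


if __name__ == "__main__":
    print(sales_by_region(build_star_schema()))
```

The point of the sketch is the shape of the query: "sales by region" is one join from the fact table to one dimension, which is the kind of structure a business user can read and recognize.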
So my follow-up question: can the combination of Exadata and OBIEE, or Extreme BI, actually change the way we deliver projects? We've all seen the Exadata performance numbers that Oracle publishes, and I can tell you firsthand that the performance is impressive. Can this Extreme Performance, combined with the Extreme Metadata that OBIEE provides, give us a more compelling case for delivering data warehouses using Agile methodologies?
To start with, I'd like to paint a picture of what the typical waterfall data warehousing project looks like. The tasks we usually have to complete, in order, are the following:
- User interviews
- Construct requirements documents
- Create logical data model
- SQL prototyping of source transactional models
- Document source-to-target mappings
- ETL development
- Front-end development (analyses and dashboards)
- Performance tuning
- Iteration 1: Interviews and user requirements
- Iteration 2: Logical modeling
- Iteration 3: ETL Development
- Iteration 4: Front-end development
To apply the Agile Manifesto to data warehouse delivery and deliver in a true Agile spirit, we need the following key elements:
- User stories instead of requirements documents: a user asks for particular content through a narrative process, and includes in that story whatever process they currently use to generate that content.
- Time-boxed iterations: iterations always have a standard length, and we choose one or more user stories to complete in that iteration.
- Rework is part of the game: there aren't any missed requirements... only those that haven't been addressed yet.
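As a rough illustration of how these three elements fit together, the sketch below models a backlog of user stories and fills a fixed-length iteration from it. Everything here is an assumption made for the example: the `UserStory` fields, the day-based estimates, the ten-day time box, and the simple "take stories in backlog order if they fit" rule are not prescribed by the Agile Manifesto or by this post.

```python
from dataclasses import dataclass


@dataclass
class UserStory:
    """A narrative request for content (hypothetical structure)."""
    title: str
    narrative: str       # how the user describes the content and their current process
    estimate_days: int   # assumed sizing unit for this sketch
    done: bool = False


def plan_iteration(backlog, capacity_days):
    """Pick open stories, in backlog order, that fit in the fixed time box."""
    selected = []
    remaining = capacity_days
    for story in backlog:
        if not story.done and story.estimate_days <= remaining:
            selected.append(story)
            remaining -= story.estimate_days
    return selected


def add_rework(backlog, story, narrative, estimate_days):
    """Rework is not a missed requirement: it is simply a new story on the backlog."""
    backlog.append(UserStory(f"Rework: {story.title}", narrative, estimate_days))


backlog = [
    UserStory("Sales by region", "Built weekly by hand in a spreadsheet", 6),
    UserStory("Inventory aging", "Pulled from a legacy report", 5),
    UserStory("Returns by product", "Compiled from emailed extracts", 3),
]

iteration = plan_iteration(backlog, capacity_days=10)
print([s.title for s in iteration])
```

With these sample numbers, the second story does not fit in what is left of the time box, so it simply waits for a later iteration. The iteration length never stretches to accommodate it.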