Agile Data Warehousing with Exadata and OBIEE: Introduction
December 21st, 2011 by Stewart Bryson
Over the last year, I’ve been speaking at conferences on one subject more than any other: Agile Data Warehousing with Exadata and OBIEE. Although I’ve been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take up. So I’ll use the next few blog posts to make my case for what I like to call Extreme BI: an Agile approach to data warehousing using the combination of Extreme Performance and Extreme Metadata.
In a standard data warehouse implementation, whether we are walking in the Inmon or Kimball camps, some portion of our data model will be dimensional in nature; a star schema with facts and dimensions. So let me pose a question, which I think will lend itself well to diving into the Extreme BI discussion: Why do we build dimensional models? The first reason is simplicity. We want to model our reporting structures in a way that makes sense to the business user. The standard OLTP data model that takes two of the four walls in the conference room to display is just never going to make sense to your average business user. At the end of a logical modeling exercise, I expect the end-user to have a look at a completed dimensional model and say: “Yep… that’s our business alright”. The second reason we build dimensional models is for performance. Denormalizing highly complex transactional models into simplified star schemas generally produces tremendous performance gains.
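To make the simplicity argument a little more concrete, here is a minimal sketch of a star schema and the kind of question it answers. It uses Python’s built-in sqlite3 module purely for illustration, not Oracle or Exadata, and every table name, column name, and data value is a hypothetical example rather than anything from a real project.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            region        TEXT
        );
        CREATE TABLE dim_date (
            date_key       INTEGER PRIMARY KEY,
            calendar_year  INTEGER,
            calendar_month INTEGER
        );
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key     INTEGER REFERENCES dim_date(date_key),
            amount       REAL
        );
    """)
    # Made-up sample rows, only so the query below has something to return.
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(1, "Acme", "East"), (2, "Globex", "West")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20111201, 2011, 12), (20120101, 2012, 1)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 20111201, 100.0), (1, 20120101, 50.0), (2, 20111201, 75.0)],
    )


def sales_by_region_and_year(conn):
    """A typical dimensional query: join the fact to its dimensions, then aggregate."""
    return conn.execute("""
        SELECT c.region, d.calendar_year, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_key = f.customer_key
        JOIN dim_date d ON d.date_key = f.date_key
        GROUP BY c.region, d.calendar_year
        ORDER BY c.region, d.calendar_year
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = sales_by_region_and_year(conn)
for region, year, total in rows:
    print(region, year, total)
```

The point of the sketch is the shape of the query: one fact table in the middle, a single join out to each dimension, and business-friendly column names such as region and calendar_year. The same question asked of a fully normalized transactional model would typically need many more joins.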
So my follow-up question: can the combination of Exadata and OBIEE, or Extreme BI, actually change the way we deliver projects? We’ve all seen the Exadata performance numbers that Oracle publishes, and I can tell you first hand the performance is impressive. Can this Extreme Performance combined with the Extreme Metadata that OBIEE provides give us a more compelling case for delivering data warehouses using Agile methodologies?
To start with, I’d like to paint a picture of what the typical waterfall data warehousing project looks like. The tasks we usually have to complete, in order, are the following:
- User interviews
- Construct requirement documents
- Create logical data model
- SQL prototyping of source transactional models
- Document source-to-target mappings
- ETL development
- Front-end development (analyses and dashboards)
- Performance tuning
Raise your hand if this looks familiar. We have to go through all of these steps, which can take months, before end users see the fruits of our labor. To mitigate this, organizations attempt to deliver data warehouses using “Agile” methodologies. In my experience, this usually means simply repackaging the same waterfall project plan into “iterations” or “sprints” so that the project can be delivered iteratively. The process might look like the following:
- Iteration 1: Interviews and user requirements
- Iteration 2: Logical modeling
- Iteration 3: ETL Development
- Iteration 4: Front-end development
But this, ladies and gentlemen, is not Agile. To get an understanding of what lies at the heart of Agile development, we need to look no further than the Agile Manifesto, or the history of the Agile Movement. When examining the different methodologies, there is one major theme that permeates all of them: working software delivered iteratively. It’s not enough to simply deliver the same old waterfall methodology in “sprints” or “iterations”, because, at the end of those iterations, we don’t have any working software… software that end users can actually use to improve their job or help them make better decisions. In the example above, we still require four iterations before we get any usable content. It doesn’t matter if we’ve written some complex ETL to load a fact table if the end user doesn’t have a working dashboard to go along with it.
To apply the Agile Manifesto to data warehouse delivery, we need the following key elements if we are to deliver in a true Agile spirit:
- User stories instead of requirements documents: a user asks for particular content through a narrative process, and includes in that story whatever process they currently use to generate that content.
- Time-boxed iterations: iterations always have a standard length, and we choose one or more user stories to complete in that iteration.
- Rework is part of the game: there aren’t any missed requirements… only those that haven’t been addressed yet.
I’ve been conscious not to prescribe any distinct Agile methodology, though I can’t help using more Scrum-like concepts in this formulation. However, I think this list is generic enough to apply to most methodologies. Over the next few posts, I’ll discuss the necessary puzzle pieces to engage in Extreme BI, as well as how we might implement new subject area content in a single iteration. Additionally, I’ll discuss how these implementations might be reworked, or “refactored”, over several iterations to produce data warehouses that respond to user stories: what users want and when they want it.
Follow-up Posts
Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces
Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration
Agile Data Warehousing with Exadata and OBIEE: ETL Iteration


December 25th, 2011 at 12:00 am
Hey Stewart
It will be great to see how you apply Agile methodology to BI projects in the early phases. I have applied those kinds of methodologies in my projects (like ASD, Scrum, etc.), but frankly I used them after designing the DW (in the phase of designing the dashboards and creating the reports).
If you want to use Agile methodologies while designing the DW, I think this will require more effort and a lot of rework, because in most cases the customer doesn’t know exactly what he needs. Imagine you design the DW, perform your ETL, and generate some reports (reports are your deliverables), and then the customer asks you to add something to your star schema. You have to redesign and perform the ETL again, and this will affect your schedule, right? And it will not end here: after each iteration he may ask for some changes, and you can’t say no because those changes are in the scope of the project.
We will see how you handle those things in your next blogs :)
December 25th, 2011 at 3:32 pm
@Besher: Keep reading. You’ll see my approach. :-)
December 30th, 2011 at 7:18 pm
I’m currently working on a project whereby we are delivering a transaction system using OWB and managing the development with Scrum, Agile, standups.
It hasn’t worked; it’s a bit of a mess, to be honest.
This isn’t due to the Agile element but more due to the standard of the staff hiding behind processes.
These days when I hear managers trumpeting new processes I know they probably don’t know what they are doing.
I’m sure you’re different considering the organisation you work for, but in general the new buzzword-bingo process just maintains the same management for a further fiasco.
Tim
January 2nd, 2012 at 7:48 pm
@Tim
Agile is more than just a methodology, or set of processes… it’s a mindset. The old mindset of CYA (cover your assets) with documentation and finger-pointing when things don’t work has to stop. As technologists, we have to partner with the business and say “these are our user stories” and “these are our solutions”.
Agile is not right for every organization. When each division or group is still predominately concerned with turf wars, then Agile is the wrong choice.
June 15th, 2012 at 2:03 pm
Hi,
Very interesting approach, but is there a danger that too much modelling within the Oracle app layer turns the business rules into a black box, inaccessible to other BI tools or consumers? Also, what happens if I want to bypass the semantic layer and write SQL for an ad hoc query? I would need an in-depth understanding of the 3NF schema. Is there another blog entry to cover these off?
Thanks