Data Warehouse Healthchecks and OWB training

June 5th, 2007 by Jon Mead

Rather strangely, as Mark jets off to Greece for a week’s holiday, I am also off there for a few days’ work (he’s here next week working as well).

Looking back over the first few months of the company, a lot of the work we have been doing has been either Data Warehouse healthchecks or OWB training.

The healthchecks have been for a number of different companies, each with a different Data Warehouse setup. They have included a packaged Data Warehouse solution that was neither loading nor querying in a performant way; another Data Warehouse designed to integrate data from a number of different subsidiaries, each with different ERP systems holding their data in different structures; and others that were just not hitting the mark with load and query times.

The problems are generally:

  • Not loading in the required timeframe
  • There is no confidence in the quality of the data
  • Query response time is too slow

All these factors combine to mean the business is not able to answer the questions it wants from the Data Warehouse.

The process we run through needs to be tailored for each organisation; unfortunately there is no ‘one size fits all’ approach to this. However, one of our objectives over the next few months is to formalise this process and publish the methodology we use. This would go further than a database tuning exercise, and would encompass areas like:

  • Verification/understanding of the business requirements (do
    they want and need a Data Warehouse?)
  • Review the data model (is there a design for this piece of
    ‘software’?)
  • Review the ETL process (will it really load in 3 hours, do
    they really mean real-time?)
  • Data quality (data what?)
  • Review the database (do the Data Warehouse features used
    approximate to 7.3.4 or 10gR2?)
  • Aggregation strategy (how much work is the database doing
    to answer aggregated queries?)

We would then take this information, ruminate, and prepare a report and/or a plan for what could be done next.
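
As a rough illustration of how the review areas listed above could feed into such a report, here is a minimal Python sketch. The names (`Finding`, `REVIEW_AREAS`, `build_report`), the severity scale and the example findings are all hypothetical, made up for illustration; this is not a formal methodology, just one way the notes from a healthcheck might be organised.

```python
from dataclasses import dataclass

# Review areas taken from the list above; the labels are shortened for this sketch.
REVIEW_AREAS = [
    "Business requirements",
    "Data model",
    "ETL process",
    "Data quality",
    "Database",
    "Aggregation strategy",
]


@dataclass
class Finding:
    area: str      # one of REVIEW_AREAS
    note: str      # free-text observation from the healthcheck
    severity: int  # hypothetical scale: 1 = minor, 3 = serious


def build_report(findings):
    """Group findings by review area, most serious first, and flag unreviewed areas."""
    lines = []
    for area in REVIEW_AREAS:
        in_area = sorted(
            (f for f in findings if f.area == area),
            key=lambda f: f.severity,
            reverse=True,
        )
        if not in_area:
            lines.append(f"{area}: not yet reviewed")
            continue
        lines.append(f"{area}:")
        for f in in_area:
            lines.append(f"  [{f.severity}] {f.note}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented example findings, purely illustrative.
    example = [
        Finding("ETL process", "Nightly load overruns its window", 3),
        Finding("Database", "Largest fact table is not partitioned", 2),
    ]
    print(build_report(example))
```

Listing every area, even those with no findings yet, makes it obvious which parts of the healthcheck are still outstanding.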

So what have we found? Well, there may be a number of reasons why things are not well on the good ship Data Warehouse. Warehouses tend to grow organically, responding to user requirements, organisational changes and mergers/acquisitions when needed. This can dilute the design, stretch the ETL process and invalidate some of the fundamental design decisions taken about the Warehouse. The use of standardised design patterns and a high level data model can help rectify this. Although sometimes seen as an over-simplification, a high level model can provide a basis for putting the Data Warehouse back on track, and it can work as a medium that all the stakeholders can work round.

Another important issue is how the business works with the IT team. The development can be led from either the IT department or the business. If it is led from the IT side there is often an understanding of the kinds of requirements it is likely to meet, but there is also an aspect of the Field of Dreams ‘build it and they will come’. More often the requirements are driven by the business. This makes it easier to establish business sponsorship and ownership, and to be honest the project is more likely to succeed. One of the problems is establishing expectations for business input; it is not just a case of the business providing some requirements and then walking away. One method that has proved helpful is to use contract-like documents such as Acceptance Criteria or a Service Level Agreement to set expectations across the whole project.

The bottom line is that Data Warehouse projects must be run in the same way as other Software Development projects. If there are no clear requirements the project may well drift and a lot of rework could be required. If there is no strong configuration management process then you can’t be sure of what is being released, and when. If no design is employed, you cannot guarantee the user requirements are met, or that your ETL or query processes are efficient. If there is insufficient testing then users will not have confidence in the data. If you follow standard Software Development techniques, whether traditional or agile, you will have a better chance of success.

So what next? As I mentioned above, we are looking to further formalise the process and enhance our methodology. I am also submitting a couple of papers for the UKOUG, one talking around some of the Data Warehousing issues mentioned above and another focusing on some OWB best practices and advanced techniques.

So to conclude, does anyone know why you can’t use laptops and iPods on aeroplanes during take-off and landing? Will it make them crash, should we be worried, are they not telling us something?

Comments

  1. Anon Says:

    Surely if iPods and laptops were really a problem (i.e. caused the plane to crash) they’d be banned from hand luggage, on the basis that statistically, someone at some time will leave theirs on, and cause the plane to crash? And based on that same probability, surely there’d have been a crash?

    And again, surely it wouldn’t be too difficult a task to run some tests to establish whether there is in fact an issue? I would have thought the test would be straightforward:

    1. Get in plane
    2. Turn on ipod
    3. Take off
    4. Check if plane crashed.

    Seriously, given how many logical holes there are in the airline’s argument, why do they still persist with this? That’s the real mystery.

  2. Brighton Werewolves « Pete-s random notes Says:

    […] of Brighton, Jon Mead writes a few notes on data warehouse healthchecks. I completely agree with his approach - it is the same […]

  3. Jean-Pierre Dijcks Says:

    Well, do you watch mythbusters? They tried the cell phone thing and tried to figure out what was going on. They did I think measure something… I think they concluded minimal issues, but they did call out the social aspect… imagine the guy who screams in his cell phone at 33000 feet in a tiny metal tube with no way to shut him up? Or the iPod that is so loud I can’t hear the movie? Ah well, enough about that, I guess we are still doing “better safe than sorry” and I guess I like it that way…

  4. Matt Topper Says:

    The real reason for them making you have your ipod and computer stored away during takeoff and landing is because those are the most dangerous times on a plane. If an engine fails during take off and you land in the water they don’t want to have to try and yell at you over your music to hear the instructions. Seconds do matter in those situations. If you are flying above 10,000 (the reason they say it’s 10 minutes before take off and landing) they let you use your devices because either you have a lot of time to put things away and prepare for an emergency landing (i.e. someone on board is having a baby or heart attack) or the wings blew off and you might as well enjoy your last minutes with your ipod before falling to Earth.

    That’s at least the explanation one of my pilot friends gave me. It does make sense though.

  5. Jon Mead Says:

    JP - Agree with phone calls and possibly iPods, but I would like to sort out my (albeit offline) inbox during take-off and landing

    Matt - that seems the most sensible answer I have heard, although it does in itself raise a few questions: should they also ban passengers from sleeping during this period, should they allow laptops with internet access so you can Google ‘What to do if the wings fall off…’