Smoke Testing with OWB

June 22nd, 2007 by Jon Mead

I have been putting together some thoughts on Agile Development techniques for a paper for the BIWA event later in the year. I have seen a couple of articles on Agile development in the Oracle space, notably on the Amis blog, so I thought I would add my tuppence worth.

One of the principles of Agile Development is continuous and/or frequent delivery. To do this you must be able to test frequently, and for that not to take all of your time it really should be automated. Smoke testing is a way of achieving this. I first came across the term reading Steve McConnell (a very good author, and an excellent read on software engineering and projects), and first used it with OWB when I was working with a very good Project Manager, Gerry Williams, which enabled us to develop a very portable and robust ETL suite. Steve McConnell’s definition is as follows:

Every file is compiled, linked, and combined into an executable program every day, and the program is then put through a “smoke test,” a relatively simple check to see whether the product “smokes” when it runs.

Why do we want to do this? It seems like a lot of overhead and possibly a lot of unnecessary work. ETL code is typically not executed frequently enough during the development process, especially with realistic sets of data. This can lead to problems with the quality and robustness of the code, and to a whole load of data quality issues. Something I read recently (sorry, I can’t remember where) says that 50% of ETL projects overrun due to data quality issues. If we only see the data we are loading once the system is live, then there is a high risk of problems. So the answer is that we want our code to work and keep on working once it goes live. Admittedly no great revelation, but you’d be surprised…

So how can we use this in OWB? First we need to develop within a framework. We need to execute the code before it is all built, so we start with the shell of our ETL process and gradually fill in the blanks. One of the ideas of Smoke Testing is Don’t Break the Build. There is nothing worse than a project team coding and coding and coding to meet deadlines, and then, when all the code gets put together, it takes another week to integrate it. I will write another posting detailing how to put such a framework together in OWB, but it is based on using a number of process flows executing template mappings, gradually developing these mappings, and providing a mechanism to selectively execute various areas of the process.
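To make the "shell first, fill in the blanks later" idea a bit more concrete, here is a minimal sketch in plain Python rather than OWB. Everything in it is an assumption for illustration: the `ProcessFlow` class, the area names and the mapping names are made up, and a real framework would use OWB process flows and mappings instead. It only shows the shape: every mapping exists as a runnable template from day one, real logic replaces templates one at a time, and a selector lets you run just some areas.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


def template_mapping(name: str) -> Callable[[], str]:
    """Return a placeholder mapping that runs cleanly but loads nothing."""
    def run() -> str:
        return f"{name}: template (no rows loaded)"
    return run


@dataclass
class ProcessFlow:
    """One area of the ETL process (area and mapping names are hypothetical)."""
    area: str
    mappings: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def add_template(self, name: str) -> None:
        self.mappings[name] = template_mapping(name)

    def implement(self, name: str, mapping: Callable[[], str]) -> None:
        # Swap a template for real logic without changing the flow's shape.
        if name not in self.mappings:
            raise KeyError(f"{name} is not part of flow {self.area}")
        self.mappings[name] = mapping


def run_flows(flows: List[ProcessFlow],
              areas: Optional[Set[str]] = None) -> List[str]:
    """Run every flow in order, or only the selected areas."""
    results = []
    for flow in flows:
        if areas is not None and flow.area not in areas:
            continue
        for run in flow.mappings.values():
            results.append(f"{flow.area}/{run()}")
    return results


staging = ProcessFlow("staging")
staging.add_template("stg_customer")
staging.add_template("stg_orders")
dimensions = ProcessFlow("dimensions")
dimensions.add_template("dim_customer")

# Fill in one blank; everything else still runs as a template.
staging.implement("stg_customer", lambda: "stg_customer: loaded 3 rows")

for line in run_flows([staging, dimensions]):
    print(line)
```

Calling `run_flows([staging, dimensions], areas={"dimensions"})` runs only the dimensions area, which is the kind of selective execution described above.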

Second, we need an automated way to promote and execute the code we are building on a daily basis. What we are looking to do is replicate Ant for OWB. This means automation and scripting, and is centred on OMBPlus. We can use a script that literally destroys (cleans) the existing environment, recreates the repository (including the latest delta of the OWB components) and the target users, deploys the code, grants permissions and executes the ETL process. If we build from scratch each time then we protect ourselves from any kind of teething (read: permission) problems when promoting the code. We know that when we install the code in a new environment it will work.
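The order of those steps matters, so a sketch of the driver may help. This is plain Python and only an outline under stated assumptions: the step names are my own labels for the stages listed above, and each action is a placeholder where a real build would call out to an OMBPlus or SQL script (none of which is shown here). The point is simply that the build runs from scratch in a fixed order and stops at the first step that "smokes".

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]

# Hypothetical labels for the stages described in the text.
STEP_NAMES = [
    "clean_environment",
    "recreate_repository",
    "import_latest_delta",
    "recreate_target_users",
    "deploy_code",
    "grant_permissions",
    "execute_etl",
]


def run_build(steps: List[Step]) -> List[Tuple[str, str]]:
    """Run steps in order; stop at the first failure and mark the rest skipped."""
    report = []
    failed = False
    for name, action in steps:
        if failed:
            report.append((name, "skipped"))
            continue
        try:
            action()
            report.append((name, "ok"))
        except Exception as exc:
            report.append((name, f"FAILED: {exc}"))
            failed = True
    return report


def placeholder(log: List[str], name: str) -> Callable[[], None]:
    """Stand-in for a real step; it only records that it was called."""
    def action() -> None:
        log.append(name)
    return action


log: List[str] = []
steps = [(name, placeholder(log, name)) for name in STEP_NAMES]
report = run_build(steps)
for name, status in report:
    print(f"{name}: {status}")
```

Stopping at the first failure is a design choice: there is little value in executing the ETL against an environment that failed to deploy, and the skipped steps make it obvious in the morning where the build broke.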

Once we have these components we can automate the whole process. For example, developers can put any completed work into a predetermined collection. This is then automatically promoted, built and executed every night. By the time you come back into work the next morning you will know a lot about your ETL code and the data it has run against.
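What you "know the next morning" could be as simple as a one-line status per mapping. The sketch below is plain Python and purely illustrative: the `MappingResult` fields and the three status labels are assumptions of mine, not OWB's audit schema, and the figures are invented sample values. It flags mappings that errored and mappings that ran but loaded fewer rows than they selected, which is often the first hint of a data quality issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MappingResult:
    # Hypothetical fields; a real run would read these from the ETL audit data.
    mapping: str
    rows_selected: int
    rows_loaded: int
    errors: int


def status(result: MappingResult) -> str:
    """Classify one mapping's overnight run."""
    if result.errors > 0:
        return "SMOKE"
    if result.rows_loaded < result.rows_selected:
        return "CHECK DATA"
    return "OK"


def morning_report(results: List[MappingResult]) -> List[str]:
    """One line per mapping, worst news first."""
    order = {"SMOKE": 0, "CHECK DATA": 1, "OK": 2}
    ranked = sorted(results, key=lambda r: order[status(r)])
    return [
        f"{status(r):<10} {r.mapping}: "
        f"{r.rows_loaded}/{r.rows_selected} rows loaded, {r.errors} errors"
        for r in ranked
    ]


# Invented sample values for illustration only.
last_night = [
    MappingResult("stg_customer", 1000, 1000, 0),
    MappingResult("stg_orders", 5000, 4875, 0),
    MappingResult("dim_customer", 1000, 0, 1),
]

for line in morning_report(last_night):
    print(line)
```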

This can be used as a regression testing tool, to examine data quality (the business can be involved in the process with a review of the data every morning), or just to ensure the quality of the code. The real payoff is the automation: once the framework exists there is no extra effort required to meet this principle of continuous software delivery.

Comments

  1. Norbert Says:

    Hi Mark,
    great what you wrote, I still have the same problems in my project
    and I have also thought about this way of working.
    The first scripting runs, but it’s hard to do this job under pressure in a project. So I will look forward to your next posting.
    Regards
    Norbert

  2. Tim Berry Says:

    Sorry Jon but there is a major error on your posting and far be it for me to complain but…

    Smoke testing was a major overhead at [customer name removed - Mark] and needs careful consideration before being used in anger. The frequency of smoke testing should be a compromise between new development and data releases and new deliveries required.
    At [customer name removed - Mark] the daily smoke tests did cause major headaches for the project as they seemed to be assumed by the project management as more ETL processing being released ready for the next data release. When they were really just system unit tests.

    If you can see them as system unit tests it is a good technique for early integration testing – but it is no more than that.

    Oh and I think you’ll know what the obvious error is!
    Tim

  3. Jon Mead Says:

    Tim,

    I take your point about smoke tests being used as system unit tests; this did create more problems as they were seen as an external deliverable. However, during subsequent phases of the project they were further automated and remained ‘internal’, which relieved the problem.

    Jon

  4. Jon Mead Says:

    Norbert,

    The scripting is definitely an overhead, but it can be re-used between OWB projects. Even without the smoke testing aspect it makes a good framework to develop round, as it forces you to do a lot of the design up front.

    Jon

  5. Tim Berry Says:

    that was quick. I have a way forward utilising decision tables within this migration project to direct development towards a set based approach. They have been in the cursor based insert world for too long!

  6. Gerald Williams Says:

    Tim – ‘When they were really just system unit tests’ is correct. The issue at the customer was that the Programme Management were trying to overlap many phases to such an extent that the system tests were also being ‘hijacked’ for data production – not a good idea when the team were heavily into data mapping and ‘figuring it out’. The bigger picture had a lot of politics associated etc etc

    However when it came to the point of ‘World – the time has come to push the button’ (to steal a Chemical Brothers line) the data migration was slick.

    Smoke Tests are a superb tool for system testing, but other projects jumping on their output too early (data in this case) is not wise

  7. Irfan Rasul Says:

    I think this is a good description of the process and benefit of smoke testing, if I have any comments they would really be the following and I think you may be addressing these in a subsequent posting so sorry if it is at a lower level than required.

    I would agree with Norbert’s point that this is quite a large overhead up front. We benefited from building on the existing components for our project, which probably disguised how much effort overall was put into its creation. I guess if you were doing this from scratch, you would need dedicated resources and time in the plan to get the framework up and running, something that may cause some PMs with time/budget constraints to use it as a ‘contingency’ for other slippages and therefore slowly erode the budget.

    As for whether it is worth the effort, I would say a definite yes. The benefits we gained from the smoke test saved a huge amount of fix/maintenance time which would otherwise have crept into the project, and this rework would probably have had no time or budget in the plan. It is difficult to say how much time we saved, or whether it balances against the cost of development; next time it would be useful to keep metrics of the time spent on building the framework. The hardest part of this is justification: how do we quantify how much value it adds?

    I also think that the ‘evolutionary’ approach to the build of the framework was quite important for us. In the early days of the project, at the end of the day all the components were run via simple process flows; these subsequently went on to be incorporated into the comprehensive smoke test incorporating scripts etc. The OWB/Unix smoke tests, although developed independently, did not prove too difficult to integrate once complete, so could be used again. I guess we effectively made the OWB components ‘pluggable’.

    Another thing that I think benefits the smoke test is the presence of ‘controls’. By this I am referring to things that remained constant, e.g. running with a static set of data for code tests, but when checking data quality ensuring that the code remained unchanged from a successful execution, i.e. only change one part of the smoke test so that you know what caused the ‘smoke’!

    The versatility of having a smoke test environment also enabled the performance tuning exercise to become a less labour intensive task than may have been the case otherwise as we had an environment which was available for execution with a control set of data and code. In addition when the time came for productionisation (if there is such a word!), I believe as a direct consequence of smoke testing, the majority of risks were mitigated by the assurance of the process that had been achieved via smoke test. (We had experienced and remedied most of the situations that may have affected the execution).

    In fact, there were benefits all the way for us, so I would certainly make the case for it on future projects.

    Irfan.

  8. Tim Berry Says:

    Twas a painful experience for me that I would not like to repeat. There are many reasons for this that I wouldn’t blame smoke tests for. However there are a number of paths that are defined that cause too many phases to come together into a bottleneck that can cause delay and a stressful environment.

    Within the migration project I am on now we are following a field level mapping route that could lead us into a cursor based insert development. So to mitigate against this I am building denormalised views for analysis purposes in the hope that we can use some decision tables to progress set based inserts and move away from normalised processing.

    That said, there is still pressure to use the traditional method that has been implemented in the past. So again, whilst we push to use modern technology we don’t want to change our thinking.

    However I am trying to bridge the gap and provide a smaller learning curve so that a new mechanism is accepted.
    Maybe adapting defined processes to suit the requirements at the time is the way forward. I just feel that in I.T. these days that simple money & tools need consideration before applying to the problem at hand. More often than not we dive head first into the solution without considering the problem for long enough.

  9. Gerald Williams Says:

    Hi Irfan, long time no see. Yes – in a nutshell – the up-front work saved a shed load of work downstream and gave the team and the management the confidence that the system would work come the big bang – usually you get one chance at a data migration and you have to be highly confident you are ready to push the button – especially when you have a few thousand users waiting to use the new system with the new data on Monday morning after a parallel roll out of applications to desktops spread out over many geographical locations etc etc and the cash burn rate is high!

    The pressure to deliver something ‘early’ needs to be pushed back by the PM (possibly a career limiting exercise in some environments :-)) and the framework needs to be accounted for properly in plans and expectations given to the Project Board. Smoke testing will eventually ensure you get high quality and potentially earlier delivery.

    So Smoke testing – definitely worth it – but there is an overhead associated with it that – if compounded with trying to meet too many objectives such as data provision to other dependent projects too early in the cycle – can be a painful experience. Would I do it again…yes, but I would limit Smoke Testing to its primary purpose – ensuring the daily build is not broken – during the early stages, and only then would I incrementally increase the amount of data, transactions or whatever to satisfy other dependent projects’ requirements.
