Resuming your ETL process in OWB
May 31st, 2008 by Jon Mead
If your ETL process fails, for whatever reason, you have two options: to restart the process from the beginning, or to resume from the last successfully completed task.
Restarting from beginning
It is reasonable easy to restart the process in OWB, you would have to wait until your top level process flow has completed and then if your target tables have loading type of TRUNCATE/INSERT, or you use the WB_TRUNCATE public transformation to achieve the same effect, then you can restart. This restart is however from the beginning of the process flow and so may involve you re-running hours of ETL that has already ran correctly. If the duration of your ETL process is under an hour this could be acceptable, however if your ETL process is more like 6 hours, and only has an 8 hour window this is more of a problem.
Resuming the process
OWB has no built in functionality to resume process flows. One approach we have used is to use a transformation to test whether a mapping has already executed for a run before executing it. This is done in a process flow by using a transformation to test whether the mapping should be executed. Hence each mapping needs the following activities:
- A Transformation to test the mapping
- The mapping itself
- An OR activity to collect the results
Each mapping looks like the one below.

The result of the transformation should be tested in the transition and the required execution path chosen. In the example above, the transformation returns 0 (zero) if the mapping is to be run, or any positive integer if it is not. Any other value is treated as an error and will cause the process flow to fail. This can be implemented by using a complex condition in the transition.

The OR activity is then required to allow execution to continue, whichever execution path reaches it first will cause execution to continue, as the transformation uses mutually exclusive conditions this will mean in reality that only one execution path can reach it.
Controlling the execution with an ID
In the example above the transformation can only determine if the the mapping has already run by supplementing the logging that is provided by OWB. The transformation must be aware of the logical execution that is being run. In a real world example the logical run would relate to one data load into the Data Warehouse. The execution ID identifies a logical run and is set at the beginning of a run and can be passed between each process flow, sub process flow and/or mapping as a parameter.
In the example above the each mapping has been configured to log to a table its start time, the execution ID and, if the mapping completes successfully, the end time. This can be achieved by pre and post mapping process. An example of this is shown below.

Using this technique transformation activity can always establish whether a mapping has been executed successfully for this particular run, this is helped by setting the post mapping process to only run ‘ON SUCCESS’ which can be configured from within a mapping, hence you could ensure the end time would only be populated if the mapping completed successfully.
Number of errors
If you are executing your mapping in set-based mode the mapping activity parameter MAX_NO_OF_ERRORS is set to 0 (zero). If you leave it with the default of 50, it seems that if the mapping errors then this only counts as one error, so the post mapping process executes as one is less than 50, and for that reason it thinks the ON SUCCESS condition has been met. Set the parameter as follows:

Forcing mappings to execute
There may be certain instances where you want a mapping to execute regardless, or you want to re-execute the mappings during the course of a run. In the first instance mappings can you added to process flows without the transformation and OR activity and then they will just execute as per normal. In the second instance the you can add another parameter to the transformation to force the following mapping to execute, regardless if it already has or not.

Automation using OMB Plus
Finally it is worth looking at the overhead this adds to the development process, you now have to add three activities to a process flow and configure transitions, whereas before you simply had to add a single mapping. One way round this we have used is to use OMB Plus scripts to either generate the process flows for a list of mappings, or to add all three activities to a process flow.


June 2nd, 2008 at 10:57 am
A good reminder about setting the number of errors to zero for production OWB maps. I always think that “expected” exceptions should be coded for in the mapping and anything else that causes an error should result in the mapping reporting the failure as it not an expected condition.
I will be writing a posting shortly on using external scheduler programs to run OWB mappings in the next few days – there are a lot of similar points.
June 2nd, 2008 at 3:23 pm
Jon,
Have you looked at the “Notifications” feature in the Enterprise ETL Option for OWB10gR2/11g? They allow a process flow to send a notification, via email, to someone to tell them that a process flow has failed. The notification can then be replied to (yes|no) with the process flow then branching depending on the result, in theory restarting the process at the point it failed.
I haven’t tested it out yet, and of course it requires the paid for Enterprise ETL Option. To me though it theoretically offered a way for failed process flows to restart at the point of failure.
regards, Mark
p.s. Borkur, you’re next…
June 2nd, 2008 at 3:44 pm
We struggled with this, and came up with a slightly modified approach to yours.
Similar to you, we do use our own supplemental logging of each datamart run including start times and final status of each mapping done through pre- and post-mapping procedures.
We then created a “run_map(mapname)” function which checked to determine if this mapping needed to be run this time, and then executed it using dynamic SQL if needed. The function passed back the return value from the mapping’s main() function, so that the using return value as status would work seamlessly in the process flows.
We then used the function call exclusively in our process flows to avoid cluttering them up with the complex processes that you required. We just labelled each call to our function according to the mapping to be called to make for a clear visual design.
This also had several interesting benefits such as not needing to re-synch a dropped and recreated mapping in the process flow as it would be referenced by name at runtime – not by ID at design-time, and it provided the capacity to significantly alter our conditional run logic by touching just one location. This proved quite effecient when we decided to add an option to force the full re-run of all mappings if neccessary. We changed that logic in the function only, and it propogated seamlessly across our project.
Cheers,
Mike
June 3rd, 2008 at 3:10 pm
I wish oracle would release information about how to really use OWB vs pushing new features that no one really uses.
Cudos I really like the info you are presenting here.
June 6th, 2008 at 8:41 am
@Mike – Your approach sounds good, I had seen a posting like that on the OWB forums. The syncing issue is/can be a problem – we managed to circumnavigate it by generating most of the process flows using OMB, and also we built a sync script in OMB.
Jon