Real-world Feedback On OWB Performance Tuning

Earlier this year I was asked to take a look at how we might use 10046 trace files to help carry out OWB performance tuning, the idea being that by automatically generating trace files, we could identify why certain mappings and process flows were taking too long to run. A reader was interested in the technique at the time and tried it out on an OWB implementation he was working on. The other day he sent me some feedback on how the approach went, which I thought would be worth sharing as a bit of "real-world experience" in carrying out OWB performance tuning.

"A bit of follow-up that I've been meaning to write regarding the statistics gathering/event tracing in mappings that Mark wrote up on his blog a few months back.

Further consideration has shown that this was not the correct approach for this project, for the following reasons. We undertook a strategy to run a daily smoke test (http://www.stevemcconnell.com/bp04.htm) off the full process based around different test packs. This enabled us to test volumes, specific cases etc. With event 10046 tracing turned on daily (the full process executed 300 SQL scripts and ~100 mappings), we found the following:

(1) There was too much information to review.

(2) We kept running out of disk space because of the gigabytes of trace files, which actually stopped the jobs running as the database errored.

The approach now taken is to take timings against volumes for each of the jobs run, and then extrapolate the estimated run time for full volume. We then have a daily/weekly meeting to review the top 10 cases and, on the basis of this, focus any performance and tuning activity as required. I believe this to be a more practical approach, and it also follows to a degree the approach described by C. Millsap et al. in the Optimizing Oracle Performance book, i.e. focus your tuning efforts to provide maximum benefit from the resource (developer time) you assign to them. Depending on the mapping to be tuned, we will then determine whether event 10046 tracing is required; it may be that a logic rewrite or a peer review will solve the issue.

I do not mean to devalue the work we did earlier in the year, but like everything, theory and reality are often different, and sometimes the old-fashioned approach of looking at the query/explain plan and realising no indexes are being used is enough.

The greatest asset has been the daily smoke tests: these have given us a historic record of timings, so if we do get unexpected variances in run times we can use these figures to determine whether it's related to database configuration and utilisation, or whether it relates to code releases (all releases are logged and versioned).

Mark - may be worth updating the blog - I can write something if you like, or just quote the above if you think it's appropriate; I think it would be useful to share though."
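The extrapolation-and-rank approach he describes can be sketched roughly as follows. This is a minimal illustration of the idea, not his implementation: it assumes a simple linear relationship between row volume and run time, and all the job names and figures are made up for the example.

```python
# Extrapolate full-volume run times from smoke-test timings, then rank the
# jobs so a review meeting can focus on the worst offenders first.

def extrapolate(jobs, full_volume):
    """jobs: list of (name, test_rows, test_seconds) tuples.
    Returns (name, estimated_seconds) pairs, worst first."""
    estimates = []
    for name, test_rows, test_seconds in jobs:
        rate = test_seconds / test_rows          # seconds per row in the test pack
        estimates.append((name, rate * full_volume))
    return sorted(estimates, key=lambda e: e[1], reverse=True)

# Hypothetical smoke-test timings against a small test pack
jobs = [
    ("LOAD_CUSTOMERS", 10_000, 120.0),   # 12 ms/row
    ("LOAD_ORDERS",    50_000, 250.0),   #  5 ms/row
    ("LOAD_PRODUCTS",   5_000,  10.0),   #  2 ms/row
]

# Top-10 list for the daily/weekly review meeting
for name, seconds in extrapolate(jobs, full_volume=1_000_000)[:10]:
    print(f"{name}: ~{seconds / 3600:.1f} h estimated at full volume")
```

In practice the timings would come from the OWB runtime audit tables rather than a hard-coded list, and a straight-line extrapolation is only a first approximation (sorts and hash joins can degrade worse than linearly), but as a triage device for picking the top 10 it does the job.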

Thanks for the feedback. I think what he's saying here is that you shouldn't trace all mappings by default, as the hit on performance and on disk usage is too great. If, though, you concentrate tracing on just those mappings that are affecting "response time" - determined, as he says, through the smoke test approach he used - then it can still be a useful technique, although you shouldn't forget the simple approach of just generating an intermediate mapping and then running an explain plan on it.
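The other half of his setup - using the historic record of smoke-test timings to spot unexpected variances - could look something like this. Again a sketch under my own assumptions (the threshold, the data structures and the function name are all illustrative): flag any job whose latest run deviates by more than a couple of standard deviations from its historic mean, then check whether database config or a logged code release explains it.

```python
# Flag unexpected run-time variances against the historic record kept by
# the daily smoke tests. Threshold and data shapes are illustrative.

from statistics import mean, stdev

def flag_variances(history, today, threshold=2.0):
    """history: {job_name: [past run times in seconds]}
    today:   {job_name: latest run time in seconds}
    Returns {job_name: (latest, historic_mean)} for jobs whose latest run
    deviates more than `threshold` standard deviations from the mean."""
    flagged = {}
    for job, seconds in today.items():
        past = history.get(job, [])
        if len(past) < 2:
            continue  # not enough history to judge this job
        mu, sigma = mean(past), stdev(past)
        if sigma > 0 and abs(seconds - mu) > threshold * sigma:
            flagged[job] = (seconds, mu)
    return flagged
```

The value here is less in the statistics than in the discipline: because the smoke test runs daily and releases are logged and versioned, a flagged job can be correlated directly with whatever changed that day.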