Bye-bye Oracle 9.2

Well the detailed planning for a customer's data warehouse upgrade paid off. We kept mainly to plan until the customer tried using their query tool. We then spent what seemed an age, late at night, tracking down a problem which almost had the customer deciding to revert to the pre-upgrade backup. Eventually this was tracked down to someone on the customer site deciding to pretty-up a database descriptor and not realising that the name actually had to match the one in the query-server's TNSNAMES.ORA file. Of course, this took a long time to track as nobody was supposed (or needed) to change this. Eventually we found the problem clearly flagged in the application's SQLNET.LOG file. One quick change later and away we went in to a full test of user queries and especially those that triggered optimiser bugs when we did a 9.2 patch-set release last time around.

Two niggles that the customer reported, the order of items in list-of-values drop-downs in the query tool has changed and the scheduled job that collects optimiser statistics clashes with the timing of our hand crafted statistics collection code. The first of these I have no sympathy with at all. Oracle (and I don't think any other vendor has either) has never guaranteed row order on a select query that does not include an ORDER BY; it was just that Oracle 9.2 and earlier may have used a sort operation in the query that is no longer deemed necessary by the 10gR2 optimiser. The customer just needs to the right thing - include an ORDER BY if they want an ordered list; we could give them an ordered view but that is more work for everyone; we have to create the view, grant access, test it and document it, they have to amend their query tool to remove the access to the original table and replace it with a view.

The statistics collection problem is relatively simple to fix, we just just disable Oracle's scheduled job. Our customer runs a chain of stores. The stores close at night, send their sales figures to head office and then to on the data warehouse to arrive between 00:00 and 04:00; the last thing I want to do in my heavy batch load window is to analyze the 'random' selection of tables that the database thinks needs attention at that time. I would prefer to use the scripted approach we developed a while back that collects the right form of statistics (based on our knowledge of data distribution) at a time we know to be correct.