Collaborate'09, and Optimizing the Performance of the BI Apps Data Warehouse

My second session for Collaborate'09 was entitled "Optimizing the Performance of the Oracle BI Applications Data Warehouse", and focused on applying Oracle data warehouse optimizations to the platform-independent data warehouse data model that ships with the BI Applications. This data warehouse features a large number of fact and dimension tables that use bitmap indexes, b-tree indexes and not a lot else to hold your data and optimize queries. My session looked at what happens if you use segment compression, partitioning and materialized views with your Oracle BI Applications data warehouse, and more importantly how do you make sure you can still manage it all using the DAC Console.

To show how the process works, I took a sample table (W_SALES_INVOICE_LINE_F) and its corresponding aggregate table (W_SALES_INVOICE_LINE_A), ran some benchmark tests to record how long they normally took to load, and how much disk space they took up, and then recreated them using regular compression.

Pic1-1

Using compression brought the size of the fact table down from around 190MB to around 40MB, a significant saving in disk space that will also benefit queries that require full table scans. The load time went down a bit as well, from around 900 seconds to 880 seconds, nothing significant but also not an increase in time.

The next scenario we looked at was introducing partitioning to the fact table. Partitioning has a number of benefits including manageability (you can offline partitions, back them up and mark them as read-only), query performance (partition elimination) and ETL (partition exchange, partition change tracking refresh of summaries) but is not supported "out of the box" with the DAC Console. What you can do though is upgrade the DAC that ships with the BI Applications to the 10.1.3.4.1 version thats downloadable from OTN, and make use of a new feature called "Actions" that allows you to override the normal creation process for indexes and tables, and allows you to call SQL and PL/SQL routines before and after a DAC task to, for example, drop and recreate the table using partitioning.

Pic2

Used in conjunction with index actions that we then use to get the DAC to create some bitmap indexes as LOCAL, we can efficiently use partitioning and still keep management of the ETL process wholly within the DAC environment.

Pic3

Performing some more benchmark tests, the size of the partitioned and compressed table was marginally greater than the unpartitioned, compressed table (44MB compared to 43MB), but queries that could benefit from partition elimination had a significantly improved response time. So far, so good.

The last scenario we went through involved replacing the summary table associated with the fact table, with a materialized view that was fast refreshable. The BI Applications currently creates aggregates as regular tables that it loads and refreshes just like any other table. Oracle, however, as a feature called materialized views that have the potential to be far quicker to refresh through logs, or partition change tracking, that keep track of what data has changed in the underlying table.

Adding materialized views to the BI Apps data warehouse involves a bit more work, as you first have to extract the SQL query used to populate the original aggregate table from the relevant Informatica mapping, and then amend this so that it can be turned into a fast refresh materialized view.

Pic4

Then, you get the DAC to create this materialized view and its logs instead of the original aggregate table through a number of new task actions, that you then tell the DAC to execute when you load the aggregate table using the FULL mode.

Pic5

Once the materialized view is created, you then need to create additional task actions that call DBMS_MVIEW.REFRESH to fast refresh the materialized view, taking care to disable any table truncation steps that might give you an ORA-32320 otherwise. Adding this optimization brought the time for the aggregate refresh down from a minute or so to a few seconds, and overall the time required to load the two tables fell by about 55% whilst the disk space fell by about 70%.

So there you have it. The full presentation and white paper are now available for download on our website, including step-by-step instructions on how to use these new DAC actions. Tomorrow is the last day of the conference, and I'm starting off at 8.30 delivering a session with Mike Durran on behalf of Michael Armstrong-Smith, who had to cancel his travel plans at the last minute, and after that it's off to the airport to fly back to the UK. In the meantime you can catch up on my conference observations using twitter (@markrittman), and of course straight afterwards we start preparing for our own conference, the Rittman Mead BI Forum in Brighton.