Materialized View and Partitioning "Gotchas"

April 18th, 2009 by Mark Rittman

Materialized Views and Partitioning are two key Oracle features when working with data warehouses. Using them together though can sometimes cause unexpected problems when you need to refresh them, as we found on a recent project. Here’s what happened, reproduced using the SH Sample Schema.

Our initial requirement was thus: the client had a fact table that was partitioned, one partition per month, and they wanted to create some materialized views that summarized sales across various dimensions. To take an example, the following fact table summarizes sales across products, the bottom level of the product dimension hierarchy.

CREATE MATERIALIZED VIEW LOG ON products WITH SEQUENCE, ROWID
(prod_id, prod_name, prod_desc, prod_subcategory, prod_subcategory_desc,
prod_category, prod_category_desc, prod_weight_class, prod_unit_of_measure,
 prod_pack_size, supplier_id, prod_status, prod_list_price, prod_min_price)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON sales
WITH SEQUENCE, ROWID
(prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON times
WITH SEQUENCE, ROWID
(time_id, calendar_quarter_desc)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW product_sales_mv
BUILD IMMEDIATE
REFRESH FAST
ENABLE QUERY REWRITE
AS SELECT p.prod_name,
t.calendar_quarter_desc
SUM(s.amount_sold) AS dollar_sales, COUNT(*) AS cnt, COUNT(s.amount_sold) AS cnt_amt
FROM sales s, products p, times t
WHERE s.prod_id = p.prod_id
and s.time_id = t.time_id
GROUP BY p.prod_name, t.calendar_quarter_desc;

I then run a check on the materialized view using DBMS_MVIEW.EXPLAIN_MVIEW, which tells me that the materialized view is indeed fast refreshable.

exec DBMS_MVIEW.EXPLAIN_MVIEW('product_sales_mv');

SELECT capability_name, possible
FROM   mv_capabilities_table;
CAPABILITY_NAME                POSSIBLE
------------------------------ --------
PCT                            Y
REFRESH_COMPLETE               Y
REFRESH_FAST                   Y
REWRITE                        Y
PCT_TABLE                      Y
PCT_TABLE                      N
PCT_TABLE                      N
REFRESH_FAST_AFTER_INSERT      Y
REFRESH_FAST_AFTER_ONETAB_DML  Y
REFRESH_FAST_AFTER_ANY_DML     Y
REFRESH_FAST_PCT               Y
REWRITE_FULL_TEXT_MATCH        Y
REWRITE_PARTIAL_TEXT_MATCH     Y
REWRITE_GENERAL                Y
REWRITE_PCT                    N
PCT_TABLE_REWRITE              N

16 rows selected

So far, so good. Let’s try a fast refresh and see if it works.

EXEC DBMS_MVIEW.REFRESH('product_sales_mv','F');

anonymous block completed.

Great. But then, the client reminds me that the fact table that the materialized view is based on is in fact loaded using partition exchange. A typical load of a quarter’sr data would look like this:

ALTER TABLE sales EXCHANGE PARTITION sales_q3_2003
WITH TABLE sales_staging;

alter table sales succeeded

Now I go to refresh the materialized view again.

EXEC DBMS_MVIEW.REFRESH('product_sales_mv','F');
Error starting at line 1 in command:
exec dbms_mview.refresh('product_sales_mv','F');
Error report:
ORA-32313: REFRESH FAST of "SH"."PRODUCT_SALES_MV" unsupported after PMOPs
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2545
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2751
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2720
ORA-06512: at line 1
32313. 00000 -  "REFRESH FAST of \"%s\".\"%s\" unsupported after PMOPs"
*Cause:    A Partition Maintenance Operation (PMOP) has been performed on a
           detail table, and the specified materialized view does not support
           fast refersh after PMOPs.
*Action:   Use REFRESH COMPLETE.  Note
           materialized view does not support fast refresh after PMOPs using
           the DBMS_MVIEW.EXPLAIN_MVIEW() API.

Oh-oh. What’s gone on here then? Well it turns out that you can’t refresh a fast-refresh materialized view when you perform a partition maintenance operation (PMOP) on the underlying table, which if you think about it is obvious because the associated materialized view logs won’t contain details of what data has changed.

UPDATE: It looks like the above error might be due to a bug that’s fixed in 11.1.0.7 and 10.2.0.5. See “Bug 6342477 – Materialized View refresh error (ORA-32313) after exchange partition” on Metalink for more details. But the error should still happen when you carry out any other partition maintenance operation on the fact table, such as adding, merging or dropping partitions.

FURTHER UPDATE: Strangely enough, on further testing, none of us can actually reproduce using the script above the error that I came up with. Now I actually hit this error on a customer site, and then reproduced it on my laptop, but we’ve tested it again on 11.1.0.6, 11.1.0.7, 10.2.0.4 and 10.2.0.5, and we’re finding now that any partition maintenance option seems to be allowable when we then go on to perform a fast refresh. Very strange, and completely contrary to what I found before. So treat the above comments about PMOPs and Fast Refresh with caution, as it may have been down to a different issue than the one I raised above.

So if we can’t fast refresh our materialized view (UPDATE: but see note above), how about we try and refresh it using partition change tracking (PCT)? PCT has the advantage of not needing materialized view logs to track what data has changed, and instead uses an internal mechanism to track what partitions in the source table have changed, which it then uses to work out what rows to update in the materialized view, which doesn’t even have to be partitioned itself. I therefore create a new version of the materialized view thats FORCE refresh and uses the DBMS_MVIEW.PMARKER function to add a reference to the source partition for each materialized view row, like this:

DROP MATERIALIZED VIEW product_sales_mv;

DROP MATERIALIZED VIEW LOG ON products;
DROP MATERIALIZED VIEW LOG ON sales;
DROP MATERIALIZED VIEW LOG ON times;

CREATE MATERIALIZED VIEW product_sales_mv
BUILD IMMEDIATE
REFRESH FORCE
ENABLE QUERY REWRITE
AS SELECT p.prod_name,
          t.calendar_quarter_desc
          SUM(s.amount_sold) AS dollar_sales,
          COUNT(*) AS cnt,
          COUNT(s.amount_sold) AS cnt_amt,
          DBMS_MVIEW.PMARKER(s.rowid)
FROM      sales s,
          products p,
          times t
WHERE     s.prod_id = p.prod_id
and       s.time_id = t.time_id
GROUP BY  p.prod_name, t.calendar_quarter_desc, DBMS_MVIEW.PMARKER(s.rowid);

After checking back with DBMS_MVIEW.EXPLAIN_MVIEW to verify that PCT refresh is possible, I then reload the SALES table using partition exchange.

ALTER TABLE sales EXCHANGE PARTITION sales_q2_2003
WITH TABLE sales_staging;

and then refresh it using PCT refresh.

EXEC DBMS_MVIEW.REFRESH('product_sales_mv','P');

anonymous block completed.

Good stuff, and we now don’t even need the overhead of materialized view logs as well. But the unexpected “gotcha” with this comes when we start to update the dimension tables as well.

UPDATE products
SET     prod_name = 'Y Box Ultra'
WHERE prod_id = 16;

commit;

EXEC DBMS_MVIEW.REFRESH('product_sales_mv','P');

Error starting at line 1 in command:
EXEC DBMS_MVIEW.REFRESH('product_sales_mv','P');
Error report:
ORA-12052: cannot fast refresh materialized view SH.PRODUCT_SALES_MV
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2545
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2751
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2720
ORA-06512: at line 1
12052. 00000 -  "cannot fast refresh materialized view %s.%s"
*Cause:    Either ROWIDs of certain tables were missing in the definition or
           the inner table of an outer join did not have UNIQUE constraints on
           join columns.
*Action:   Specify the FORCE or COMPLETE option. If this error is got
           during creation, the materialized view definition may have be
           changed. Refer to the documentation on materialized views.

So what’s happened here? Well if you check in the Data Warehouse Guide in the product documentation, it turns out that PCT refresh isn’t possible when DML has happened on the non-partitioned table, which again is sort of obvious when you think about it as there are no partitions to track on those tables.

Now I had a chat with Stewart Bryson, our Technical Director over in the USA, about this as he’s one of our materialized view experts, and he confirmed this as being his experience as well. So we’re in the situation now where:

  • Fast refresh works on materialized views, except where the underlying fact table is loaded using partition exchange loading
  • Partition Change Tracking is however possible and doesn’t mind when the underlying table has partition maintenance operations applied to it,
  • however PCT isn’t allowed when the MV has joins to other tables in it which are subsequently updated through DML operations

The solution we chose in the end was to create our materialized views just against the fact table, and include just the keys to the dimension tables plus the measures we wanted to aggregate. Whilst this didn’t allow us to aggregate right up the dimension hierarchies, it did allow us to summarize sales by product, day, customer and so on, which if we combined with a set of CREATE DIMENSION statements together with the ability of the optimizer to “join back” the materialized view to the dimension tables to retrieve the columns we were filtering, actually still gave us very good performance with the benefit of our MV now being partition change tracking refreshable.

So, a few things to catch you out there. Everything within the solution worked fine until we tried to refresh the materialized view, when we found that certain load options for the fact table and dimensions weren’t compatible with the refresh choices we’d made for the materialized view.

Comments

  1. Peter Scott Says:

    Hi Mark – this was one of the inhibitors to PCT refresh I had with one of my old customers – we even looked at the option of doing two refreshes one after the dimension updates and a PCT one after fact load. In the end I wrote my own partition aware refresh code to track changed partitions and truncate / append insert the new fact. David Aldridge (my new neighbour!) wrote about similar techniques too

  2. Stewart Bryson Says:

    Mark: good stuff. This is the approach I usually take as well. The join back to the dimensions still gives good performance.

    Another option I have done in the past is to create the materialized view on a prebuilt table, and manage the refresh of the aggregate myself. So basically, before the load, I drop the materialized view, which leaves the underlying table intact. Then, I load the new aggregate values into the underlying table, which usually means truncating the current partition and re-inserting the entire contents for that range. Since I can programatically determine which partitions are affected, I can determine which partitions to refresh. After that, I again issue the CREATE MATERIALIZED VIEW… ON PREBUILT TABLE statement.

  3. Peter Scott Says:

    @Stewart – rather than drop and recreate after refresh there is also the option to ALTER MATERIALIZED VIEW xxx CONSIDER FRESH.

  4. Stewart Bryson Says:

    Agreed Peter… much better approach. I believe when I took this approach before, I was unaware the materialized view could be truncated and reinserted while it was still a materialized view. I’m going to steal this approach, if you don’t mind. :)

  5. VGOPAL Says:

    Hi Mark,
    It seems this approach is not working if we multiple sales records for a product in one quarter. It is grouping at the row_id level rather than at the quarter level. can you help to get the total sales in this scenario?

  6. Peter Scott Says:

    @VGOPAL
    I am not sure that I understand your point – if you are creating a MV at quarter level you would not include the source row_id in the aggregation. So in the simplest case for a SALES fact summary MV that just has two dimensions (product and date) you would group by PRODUCT and QUARTER – there will be only one row per product per quarter (assuming there was a sale in the quarter for that product)

  7. Shveta Narang Says:

    Hi Mark,

    I have a problem with one of my materialized views. My base table is range partitioned and so i have similarly partitioned my materialized view. It is allowing PCT refresh. I have created materialized view with REFRESH FAST constraint.

    If I use atomic_refresh=FALSe and also set session rule ALTER SESSION SET “_mv_refresh_costing” = ‘rule_pt_pd_fa_co’;,

    and if there is a change/alter table truncate in single partition on base table, while refreshing it goes and truncate all the partitions ..

    can you please help what could be causing this ?

Write a comment





Website Design & Build: tymedia.co.uk