Cube, Rollup, Grouping Sets and Grouping IDs

If you've kept up with the various new data warehousing features in Oracle over the past few years, you might have noticed extensions to the GROUP BY clause such as ROLLUP, CUBE, GROUPING SETS and so on. If like me you've glanced at these features and mentally noted them as being potentially useful, you might have been put off by the fact that they look fiendishly complicated to use. Like me you might also have questioned whether SQL query tools such as OBIEE would ever be able to use them, given that not many query tools generate SQL that uses these extensions. So what are they actually used for, can they ever benefit "regular" query tools such as OBIEE, and in what circumstances can they be used to make aggregation in your data warehouse more efficient?

If you think of the data in your relational star schema as being a "logical cube", with its measures potentially being available across all levels of aggregation, the actual data stored in your relational star schema is often a small fraction of this, being based on the detail-level data extracted from your source systems. If you wanted to report on data at a higher level of aggregation than this stored, detail-level view, you would create a SELECT statement that listed out the columns to aggregate by, the measures to aggregate along with aggregation functions, an optional filter to restrict down the column values that aggregates are calculated for, and a GROUP BY clause to tell Oracle what columns to group (aggregate) the data on. This would then tell Oracle to display these specific aggregations in your report, with the aggregated data being dynamically calculated at run-time.

To take an example run against the SH schema on an Oracle 11g (11.1) database, the following SELECT statement aggregates amount_sold by channel, month and country ISO code and returns results for a selection of these column values:

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7          sales s,
  8          customers cu,
  9          times t,
 10          channels ch,
 11          countries co
 12  WHERE
 13          s.time_id=t.time_id AND
 14          s.cust_id=cu.cust_id AND
 15          cu.country_id = co.country_id AND
 16          s.channel_id = ch.channel_id AND
 17          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18          t.calendar_month_desc IN ('2001-09', '2000-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          ch.channel_desc,
 22          t.calendar_month_desc,
 23          co.country_iso_code;

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD


Direct Sales 2000-10 US 682296.59
Internet 2000-10 US 137054.29
Internet 2001-09 GB 36806.73
Direct Sales 2000-10 GB 91925.43
Internet 2001-09 US 299621.96
Direct Sales 2001-09 US 621197.94
Internet 2000-10 GB 14539.14
Direct Sales 2001-09 GB 92865.04

8 rows selected.

So far so good. But what if you wanted to generate subtotals for the columns that you are filtering on, plus perhaps a grand total for all columns? You could put the results of the first GROUP BY into a temporary table and then aggregate the results again, or you could potentially base one SELECT statement on the results of another; or, you could use the ROLLUP extension to GROUP BY to calculate these subtotals and grand totals automatically.

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7          sales s,
  8          customers cu,
  9          times t,
 10          channels ch,
 11          countries co
 12  WHERE
 13          s.time_id=t.time_id AND
 14          s.cust_id=cu.cust_id AND
 15          cu.country_id = co.country_id AND
 16          s.channel_id = ch.channel_id AND
 17          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18          t.calendar_month_desc IN ('2001-09', '2000-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          ROLLUP(
 22                  ch.channel_desc,
 23                  t.calendar_month_desc,
 24                  co.country_iso_code);

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD


Internet 2000-10 GB 14539.14
Internet 2000-10 US 137054.29
Internet 2000-10 151593.43
Internet 2001-09 GB 36806.73
Internet 2001-09 US 299621.96
Internet 2001-09 336428.69
Internet 488022.12
Direct Sales 2000-10 GB 91925.43
Direct Sales 2000-10 US 682296.59
Direct Sales 2000-10 774222.02
Direct Sales 2001-09 GB 92865.04
Direct Sales 2001-09 US 621197.94
Direct Sales 2001-09 714062.98
Direct Sales 1488285
1976307.12

15 rows selected.

This is called "full rollup" and generates subtotals for all the columns listed in the ROLLUP() clause. You can also use the ROLLUP() clause to perform partial rollups, with columns being added to the GROUP BY clause and excluded from the ROLLUP() clause to stop them from being subtotalled.

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7          sales s,
  8          customers cu,
  9          times t,
 10          channels ch,
 11          countries co
 12  WHERE
 13          s.time_id=t.time_id AND
 14          s.cust_id=cu.cust_id AND
 15          cu.country_id = co.country_id AND
 16          s.channel_id = ch.channel_id AND
 17          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18          t.calendar_month_desc IN ('2001-09', '2000-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          ch.channel_desc,
 22          ROLLUP(
 23                  t.calendar_month_desc,
 24                  co.country_iso_code);

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD


Internet 2000-10 GB 14539.14
Internet 2000-10 US 137054.29
Internet 2000-10 151593.43
Internet 2001-09 GB 36806.73
Internet 2001-09 US 299621.96
Internet 2001-09 336428.69
Internet 488022.12
Direct Sales 2000-10 GB 91925.43
Direct Sales 2000-10 US 682296.59
Direct Sales 2000-10 774222.02
Direct Sales 2001-09 GB 92865.04
Direct Sales 2001-09 US 621197.94
Direct Sales 2001-09 714062.98
Direct Sales 1488285

14 rows selected.

Notice how this partial rollup returns one less row, as it excludes the subtotal of amount sold by the internet channel and the direct sales channel combined , i.e. it excludes the channel subtotal.

A more typical use of ROLLUP would be to produce subtotals and grand totals across a single dimension hierarchy, rolling up sales for example by month, quarter and year within a single SELECT statement. Note how in the results below the amount sold gets aggregated by month first, then by all the months in each quarter, and then across all months and quarters in the year.

SQL> SELECT
  2     t.calendar_month_desc "MONTH",
  3     t.calendar_quarter_desc "QUARTER",
  4          t.calendar_year "YEAR",
  5     SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7     sales s,
  8     customers cu,
  9     times t,
 10     channels ch,
 11     countries co
 12  WHERE
 13     s.time_id=t.time_id AND
 14     s.cust_id=cu.cust_id AND
 15     cu.country_id = co.country_id AND
 16     s.channel_id = ch.channel_id AND
 17     ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18     t.calendar_year = 1999 AND
 19     co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21     ROLLUP(
 22             t.calendar_year,
 23             t.calendar_quarter_desc,
 24             t.calendar_month_desc);

MONTH QUARTER YEAR SUM_AMOUNT_SOLD


1999-01 1999-01 1999 974627.95
1999-02 1999-01 1999 1089255.92
1999-03 1999-01 1999 754026.7
1999-01 1999 2817910.57
1999-04 1999-02 1999 708060.57
1999-05 1999-02 1999 818055.52
1999-06 1999-02 1999 729677.52
1999-02 1999 2255793.61
1999-07 1999-03 1999 893452.47
1999-08 1999-03 1999 883460.92
1999-09 1999-03 1999 923577.01
1999-03 1999 2700490.4
1999-10 1999-04 1999 715831.36
1999-11 1999-04 1999 742248.42
1999-12 1999-04 1999 841572.17
1999-04 1999 2299651.95
1999 10073846.5
10073846.5

18 rows selected.

To describe the ROLLUP feature in more detail then, the full rollup returns rows representing the regular aggregations you'd get from a GROUP BY, subtotals for each column in the ROLLUP column list with the last column listed repeating most frequently, and a grand total for all columns in the ROLLUP list. Partial ROLLUP does the same but excludes the colum listed in the GROUP BY clause but excluded from the ROLLUP clause.

So ROLLUP is useful for creating subtotals and totals for the columns that your aggregating by in a SELECT statement, with NULLs in the listed column values indicating that the aggregated value is a subtotal or grand total. Not too difficult a concept to understand, but where then does the CUBE extension to GROUP BY come in?

What the CUBE extension to GROUP BY does it to calculate every possible aggregate permutation of the columns listed in the CUBE clause. This is particularly useful when you want to get hold of a whole set of aggregations in a single go, perhaps to pre-populate a summary table or as the input into a further reporting process. To take the example I used earlier on where I first GROUPed by channel, country and month, and then selected the same data and ran it through the ROLLUP extension, doing the same but with the CUBE extension to GROUP BY brings back the following results set:

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7          sales s,
  8          customers cu,
  9          times t,
 10          channels ch,
 11          countries co
 12  WHERE
 13          s.time_id=t.time_id AND
 14          s.cust_id=cu.cust_id AND
 15          cu.country_id = co.country_id AND
 16          s.channel_id = ch.channel_id AND
 17          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          CUBE(
 22                  ch.channel_desc,
 23                  t.calendar_month_desc,
 24                  co.country_iso_code);

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD


                                  2117845.22
                          GB       243978.97
                          US      1873866.25
                 2001-09          1050491.67
                 2001-09  GB       129671.77
                 2001-09  US        920819.9
                 2001-10          1067353.55
                 2001-10  GB        114307.2
                 2001-10  US       953046.35

Internet 761766
Internet GB 75817.49
Internet US 685948.51
Internet 2001-09 336428.69
Internet 2001-09 GB 36806.73
Internet 2001-09 US 299621.96
Internet 2001-10 425337.31
Internet 2001-10 GB 39010.76
Internet 2001-10 US 386326.55
Direct Sales 1356079.22
Direct Sales GB 168161.48
Direct Sales US 1187917.74
Direct Sales 2001-09 714062.98
Direct Sales 2001-09 GB 92865.04
Direct Sales 2001-09 US 621197.94
Direct Sales 2001-10 642016.24
Direct Sales 2001-10 GB 75296.44
Direct Sales 2001-10 US 566719.8

27 rows selected.

So you can see in the SQL output that there's a lot more NULL values in the dimension columns, indicating the totals and subtotals generated by adding CUBE to the GROUP BY clause. So ROLLUP will create subtotals and totals for dimension values individually, whereas CUBE will create further aggregate permutations by grouping in this case two of the columns together and then calculating the subtotal against just those.

Where we are then is that ROLLUP effectively creates a certain set of subtotals, CUBE goes one further and generates even more permutations. ROLLUP is typically used to aggregate up a hierarchy whilst CUBE is used to generate all the aggregations for a selection of columns. If you're particularly keen on this and want to generate multiple ROLLUPs per SELECT statement, you can even create concatenated ROLLUPs in Oracle 10g and higher.

SQL> SELECT
  2          ch.channel_total,
  3          ch.channel_class,
  4          co.country_region,
  5          co.country_iso_code,
  6          SUM(s.amount_sold) sum_amount_sold
  7  FROM
  8          sales s,
  9          customers cu,
 10          times t,
 11          channels ch,
 12          countries co
 13  WHERE
 14          s.time_id=t.time_id AND
 15          s.cust_id=cu.cust_id AND
 16          cu.country_id = co.country_id AND
 17          s.channel_id = ch.channel_id AND
 18          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          ROLLUP(
 22                  ch.channel_total,
 23                  ch.channel_class),
 24          ROLLUP(
 25                  co.country_region,
 26                  co.country_iso_code);

CHANNEL_TOTAL CHANNEL_CLASS COUNTRY_REGION CO SUM_AMOUNT_SOLD


                               Europe               GB       321244.43
                               Europe                        321244.43
                               Americas             US      2603472.57
                               Americas                     2603472.57
                                                               2924717

Channel total Europe GB 321244.43
Channel total Europe 321244.43
Channel total Americas US 2603472.57
Channel total Americas 2603472.57
Channel total 2924717
Channel total Direct Europe GB 168161.48
Channel total Direct Europe 168161.48
Channel total Direct Americas US 1187917.74
Channel total Direct Americas 1187917.74
Channel total Direct 1356079.22
Channel total Others Europe GB 77265.46
Channel total Others Europe 77265.46
Channel total Others Americas US 729606.32
Channel total Others Americas 729606.32
Channel total Others 806871.78
Channel total Indirect Europe GB 75817.49
Channel total Indirect Europe 75817.49
Channel total Indirect Americas US 685948.51
Channel total Indirect Americas 685948.51
Channel total Indirect 761766

25 rows selected.

In the same vein, you can include all dimensions and all levels within a CUBE clause and calculate every possible aggregation in a logical dataset, although the output from this is likely to get pretty large if you cube has several dimension with complex hierearchies.

One way that you can make the output from ROLLUP and CUBE queries a bit more readable, is to use the GROUPING function to return a "1", which you can then transform using a further DECODE function, to replace the NULLs that CUBE and ROLLUP return.

SQL> SELECT
  2     DECODE(GROUPING(ch.channel_desc),1,'All Channels',ch.channel_desc),
  3     DECODE(GROUPING(t.calendar_month_desc),1,'All Months',t.calendar_month_desc),
  4     DECODE(GROUPING(co.country_iso_code),1,'All Countries',co.country_iso_code),
  5     SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7     sales s,
  8     customers cu,
  9     times t,
 10     channels ch,
 11     countries co
 12  WHERE
 13     s.time_id=t.time_id AND
 14     s.cust_id=cu.cust_id AND
 15     cu.country_id = co.country_id AND
 16     s.channel_id = ch.channel_id AND
 17     ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18     t.calendar_month_desc IN ('2001-09', '2000-10') AND
 19     co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21     ROLLUP(
 22             ch.channel_desc,
 23             t.calendar_month_desc,
 24             co.country_iso_code);

DECODE(GROUPING(CH.C DECODE(GRO DECODE(GROUPI SUM_AMOUNT_SOLD


Internet 2000-10 GB 14539.14
Internet 2000-10 US 137054.29
Internet 2000-10 All Countries 151593.43
Internet 2001-09 GB 36806.73
Internet 2001-09 US 299621.96
Internet 2001-09 All Countries 336428.69
Internet All Months All Countries 488022.12
Direct Sales 2000-10 GB 91925.43
Direct Sales 2000-10 US 682296.59
Direct Sales 2000-10 All Countries 774222.02
Direct Sales 2001-09 GB 92865.04
Direct Sales 2001-09 US 621197.94
Direct Sales 2001-09 All Countries 714062.98
Direct Sales All Months All Countries 1488285
All Channels All Months All Countries 1976307.12

15 rows selected.

Still with me? Good. The final part of this puzzle is around three more extensions called GROUPING_ID, GROUPING SETS and GROUP_ID.

GROUPING_ID is something you can attach to a SELECT column list to generate a predictable "bit vector ID" for each aggregation produced by a ROLLUP or CUBE, like this:

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold,
  6          GROUPING_ID(
  7                  ch.channel_desc,
  8                  t.calendar_month_desc,
  9                  co.country_iso_code) grouping_id
 10  FROM
 11          sales s,
 12          customers cu,
 13          times t,
 14          channels ch,
 15          countries co
 16  WHERE
 17          s.time_id=t.time_id AND
 18          s.cust_id=cu.cust_id AND
 19          cu.country_id = co.country_id AND
 20          s.channel_id = ch.channel_id AND
 21          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 22          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 23          co.country_iso_code IN ('GB', 'US')
 24  GROUP BY
 25          ROLLUP(
 26                  ch.channel_desc,
 27                  t.calendar_month_desc,
 28                  co.country_iso_code);

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD GROUPING_ID


Internet 2001-09 GB 36806.73 0
Internet 2001-09 US 299621.96 0
Internet 2001-09 336428.69 1
Internet 2001-10 GB 39010.76 0
Internet 2001-10 US 386326.55 0
Internet 2001-10 425337.31 1
Internet 761766 3
Direct Sales 2001-09 GB 92865.04 0
Direct Sales 2001-09 US 621197.94 0
Direct Sales 2001-09 714062.98 1
Direct Sales 2001-10 GB 75296.44 0
Direct Sales 2001-10 US 566719.8 0
Direct Sales 2001-10 642016.24 1
Direct Sales 1356079.22 3
2117845.22 7

15 rows selected.

The GROUPING ID is based on a bit vector algorithm, that means you can predict what value GROUPING_ID will be generated for each aggregation. To take the example above, where we are rolling up channel, month and country, the values of GROUPING_ID would come out as follows:

Aggregation                               Bit Vector    GROUPING_ID
-----------------                         ------------- -----------
Channel, Month, Country                   0 0 0         0
Channel, Month, All Countries             0 0 1         1
Channel, All Months, All Countries        0 1 1         3
All Channels, All Months, All Countries   1 1 1         7

Now once you know what values will get generated for GROUPING_ID, you can then add a HAVING clause to your GROUP BY and just retrieve the aggregation you want from your ROLLUP, like this:

SQL> SELECT
  2          ch.channel_desc,
  3          t.calendar_month_desc,
  4          co.country_iso_code,
  5          SUM(s.amount_sold) sum_amount_sold,
  6          GROUPING_ID(
  7                  ch.channel_desc,
  8                  t.calendar_month_desc,
  9                  co.country_iso_code) grouping_id
 10  FROM
 11          sales s,
 12          customers cu,
 13          times t,
 14          channels ch,
 15          countries co
 16  WHERE
 17          s.time_id=t.time_id AND
 18          s.cust_id=cu.cust_id AND
 19          cu.country_id = co.country_id AND
 20          s.channel_id = ch.channel_id AND
 21          ch.channel_desc IN ('Direct Sales', 'Internet') AND
 22          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 23          co.country_iso_code IN ('GB', 'US')
 24  GROUP BY
 25          ROLLUP(
 26                  ch.channel_desc,
 27                  t.calendar_month_desc,
 28                  co.country_iso_code)
 29  HAVING
 30          GROUPING_ID(
 31                  ch.channel_desc,
 32                  t.calendar_month_desc,
 33                  co.country_iso_code) = 7
 34  ;

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD GROUPING_ID


                                  2117845.22           7

Do you see where this is going now? It's all well and good being able to generate multiple levels of aggregation in a query, but typically you only want one of them, and using CUBE, ROLLUP and GROUPING_ID allows you to zero-in on just the aggregation you are looking for, which is interesting if you consider the fact that with this vectoring now possible, you could potentially store a whole bunch of aggregates in one materialized view, retrieving the one you want at runtime by the use of this vector..

GROUPING SETS is another way of generating multiple aggregates, this time allowing you to generate sets of arbitrary aggregations (including aggregations created using ROLLUP, for example) with each set of aggregations then combined into a single result set using UNION ALL.

SQL> SELECT
  2     ch.channel_desc,
  3     t.calendar_month_desc,
  4     co.country_iso_code,
  5     SUM(s.amount_sold) sum_amount_sold
  6  FROM
  7     sales s,
  8     customers cu,
  9     times t,
 10     channels ch,
 11     countries co
 12  WHERE
 13     s.time_id=t.time_id AND
 14     s.cust_id=cu.cust_id AND
 15     cu.country_id = co.country_id AND
 16     s.channel_id = ch.channel_id AND
 17     ch.channel_desc IN ('Direct Sales', 'Internet') AND
 18     t.calendar_month_desc IN ('2001-09', '2001-10') AND
 19     co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21     GROUPING SETS(
 22             (t.calendar_month_desc, co.country_iso_code),
 23             (ch.channel_desc, co.country_iso_code))
 24  ;

CHANNEL_DESC CALENDAR CO SUM_AMOUNT_SOLD


                 2001-09  GB       129671.77
                 2001-09  US        920819.9
                 2001-10  US       953046.35
                 2001-10  GB        114307.2

Direct Sales GB 168161.48
Internet GB 75817.49
Internet US 685948.51
Direct Sales US 1187917.74

8 rows selected.

So I think we've established that these extensions to the GROUP BY clause can be pretty nifty in terms of generating lots of aggregates, and lots of hierarchical aggregations, from a single SELECT statement. This is all well and good though if your query tool can generate SQL in this format, and apart from Discover which I seem to recall generating SQL using GROUPING SETS in the past, the vast majority of query tools are going to carry on generating their SQL using the good old GROUP BY clause. Is there any way, then, that you can use query rewrite to get these GROUP BYs to rewrite to use a materialized view created using some of these extensions? Well you can actually, as long as you use the GROUPING_ID function I mentioned earlier so that query rewrite can vector in on the required aggregate.

To show how this works, let's create a materialized view that uses the ROLLUP clause:

SQL> CREATE MATERIALIZED VIEW mv_sales_cube1
  2  ENABLE QUERY REWRITE AS
  3  SELECT
  4          ch.channel_total,
  5          ch.channel_class,
  6          co.country_region,
  7          co.country_iso_code,
  8          SUM(s.amount_sold) sum_amount_sold,
  9          GROUPING_ID(
 10                  ch.channel_total,
 11                  ch.channel_class,
 12                  co.country_region,
 13                  co.country_iso_code) grouping_id
 14  FROM
 15          sales s,
 16          customers cu,
 17          times t,
 18          channels ch,
 19          countries co
 20  WHERE
 21          s.time_id=t.time_id AND
 22          s.cust_id=cu.cust_id AND
 23          cu.country_id = co.country_id AND
 24          s.channel_id = ch.channel_id AND
 25          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 26          co.country_iso_code IN ('GB', 'US')
 27  GROUP BY
 28          ROLLUP(
 29                  ch.channel_total,
 30                  ch.channel_class),
 31          ROLLUP(
 32                  co.country_region,
 33                  co.country_iso_code);

Materialized view created.

So this materialized view contains two sets of rolled-up aggregations that'll give me totals and subtotals across channels, channel classes, regions and countries, the benefit of this being that we've stored multiple levels of aggregation in a relational cube, similar to the way that multi-dimensional servers like Essbase store their data in a pre-aggregated form.

If we run a SELECT that uses a corresponding ROLLUP clause, the query rewrites as you would expect.

SQL> SELECT
  2          ch.channel_total,
  3          ch.channel_class,
  4          co.country_region,
  5          co.country_iso_code,
  6          SUM(s.amount_sold) sum_amount_sold
  7  FROM
  8          sales s,
  9          customers cu,
 10          times t,
 11          channels ch,
 12          countries co
 13  WHERE
 14          s.time_id=t.time_id AND
 15          s.cust_id=cu.cust_id AND
 16          cu.country_id = co.country_id AND
 17          s.channel_id = ch.channel_id AND
 18          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 19          co.country_iso_code IN ('GB', 'US')
 20  GROUP BY
 21          ROLLUP(
 22                  ch.channel_total,
 23                  ch.channel_class),
 24          ROLLUP(
 25                  co.country_region,
 26                  co.country_iso_code);

CHANNEL_TOTAL CHANNEL_CLASS COUNTRY_REGION CO SUM_AMOUNT_SOLD


                               Europe               GB       321244.43
                               Europe                        321244.43
                               Americas             US      2603472.57
                               Americas                     2603472.57
                                                               2924717

Channel total Europe GB 321244.43
Channel total Europe 321244.43
Channel total Americas US 2603472.57
Channel total Americas 2603472.57
Channel total 2924717
Channel total Direct Europe GB 168161.48
Channel total Direct Europe 168161.48
Channel total Direct Americas US 1187917.74
Channel total Direct Americas 1187917.74
Channel total Direct 1356079.22
Channel total Others Europe GB 77265.46
Channel total Others Europe 77265.46
Channel total Others Americas US 729606.32
Channel total Others Americas 729606.32
Channel total Others 806871.78
Channel total Indirect Europe GB 75817.49
Channel total Indirect Europe 75817.49
Channel total Indirect Americas US 685948.51
Channel total Indirect Americas 685948.51
Channel total Indirect 761766

25 rows selected.

Elapsed: 00:00:00.12
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

...

Plan hash value: 2249608591


| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

| 0 | SELECT STATEMENT | | | | 3 (100)| |
| 1 | MAT_VIEW REWRITE ACCESS FULL| MV_SALES_CUBE1 | 25 | 1225 | 3 (0)| 00:00:01 |

What's cool though is that if we then put together a regular SQL statement using just GROUP BY, this rewrites as well, as the GROUPING_ID I've stored in the materialized view allows query rewrite to retrieve the correct aggregation from the stored summary.

SQL> SELECT
  2          ch.channel_class,
  3          SUM(s.amount_sold) sum_amount_sold
  4  FROM
  5          sales s,
  6          customers cu,
  7          times t,
  8          channels ch,
  9          countries co
 10  WHERE
 11          s.time_id=t.time_id AND
 12          s.cust_id=cu.cust_id AND
 13          cu.country_id = co.country_id AND
 14          s.channel_id = ch.channel_id AND
 15          t.calendar_month_desc IN ('2001-09', '2001-10') AND
 16          co.country_iso_code IN ('GB', 'US')
 17  GROUP BY
 18          ch.channel_class;

CHANNEL_CLASS SUM_AMOUNT_SOLD


Direct 1356079.22
Others 806871.78
Indirect 761766

Elapsed: 00:00:00.23
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

...

Plan hash value: 1714973320


| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

| 0 | SELECT STATEMENT | | | | 4 (100)| |
| 1 | HASH GROUP BY | | 3 | 114 | 4 (25)| 00:00:01 |
|* 2 | MAT_VIEW REWRITE ACCESS FULL| MV_SALES_CUBE1 | 3 | 114 | 3 (0)| 00:00:01 |

Predicate Information (identified by operation id):

2 - filter("MV_SALES_CUBE1"."GROUPING_ID"=3)

Not bad, and it answers the question that I've always had as to when we'd ever get to use these extensions to the GROUP BY clause in a real-world situation. If your query tool doesn't support these extensions to GROUP BY, you can add the GROUPING_ID function to your materialized view and regular GROUP BYs should rewrite to use the summary. If you're lucky enough to have a query tool that generates SQL using these extensions, then again through the use of GROUPING_ID and HAVING you can potentially extract all different levels of aggregation using these extensions. Again, as usual, add feedback to this posting if you've got any further tips to add to this.