So How Do Star Transformations Actually Work?

Although like most Oracle data warehousing developers, I'm aware of what a star transformation is, and I can usually spot one in an execution plan. Earlier in December though I was on a customer engagement and one of the client's developers asked some deceptively simple questions about star transformations, which on the face of it I should have known but in reality made me think a bit. The developer had a web and transactional background, knew about join types such as nested loops and hash joins, but was wondering:

  • Firstly, if with nested loop joins, Oracle starts at the smallest dimension table, joins it to the fact table, and then progressively joins the results of this to the other dimension tables, why is a star transformation, that joins all dimension tables to the fact table at once, more efficient?
  • Secondly, what exactly goes on during a star transformation?;
  • Thirdly, how do you interpret a star transformation execution plan?,
  • And fourthly, when did Oracle star adding messages to execution plans to tell you a star transformation took place?
A good set of questions, and to put some answers together I've dusted down the SH Sales History schema to put some examples together.

To start off, I create a copy of the SH.CUSTOMERS and SH.PRODUCTS tables, and two copies of the SH.SALES table. I then add primary key constraints to the two dimension tables. foreign key constraints to the fact tables, and then create bitmap indexes on one of the fact tables followed by b-tree indexes on the other fact table. Finally, I gather statistics on the tables and indexes so that I'm then ready to start the tests. The version of Oracle is 11.1.0.6 on Windows XP SP2.

SQL> set echo on
SQL> set autotrace traceonly
SQL> set pages 400
SQL> set lines 500

SQL> -- first of all, create copies of the customers, products and sales tables

SQL> -- with indexes and constraints applied as required

SQL> drop table sales_star;

Table dropped.

SQL> drop table customers_star;

Table dropped.

SQL> drop table products_star;

Table dropped.

SQL> -- first, the tables

SQL> create table sales_star
  2  as
  3  select * from sh.sales;
Table created.

SQL> create table customers_star
  2  as
  3  select * from sh.customers;
Table created.

SQL> create table products_star
  2  as
  3  select * from sh.products;
Table created.

SQL> -- then the required constraints

SQL> alter table customers_star add constraint cust_star_pk primary key (cust_id);
Table altered.

SQL> alter table products_star add constraint prod_star_pk primary key (prod_id);
Table altered.

SQL> alter table sales_star add constraint sales_prod_star_fk foreign key (prod_id) references products_star;
Table altered.

SQL> alter table sales_star add constraint sales_cust_star_fk foreign key (cust_id) references customers_star;
Table altered.

SQL> -- then the bitmap indexes on the fact table foreign key columns

SQL> create bitmap index sales_star_cust_bix on sales_star(cust_id);
Index created.

SQL> create bitmap index sales_star_prod_bix on sales_star(prod_id);
Index created.

SQL> -- then the bitmap indexes on the dimension attributes

SQL> create bitmap index customers_star_gender_bix on customers_star(cust_gender);
Index created.

SQL> create bitmap index customers_star_city_bix on customers_star(cust_city);
Index created.

SQL> create bitmap index products_star_subcategory_bix on products_star(prod_subcategory_desc);
Index created.

SQL> -- finally, gather table and index statistics

SQL> analyze table sales_star compute statistics for table for all indexes for all indexed columns;
Table analyzed.

SQL> analyze table customers_star compute statistics for table for all indexes for all indexed columns;
Table analyzed.

SQL> analyze table products_star compute statistics for table for all indexes for all indexed columns;
Table analyzed.
Ok, so we're ready to roll. If I first run an ALTER SESSION command to turn off star transformations, and I then run a typical star query against the fact table with regular b-tree indexes on the dimension key columns, my execution plan looks like this:
SQL> alter session set star_transformation_enabled='false';

Session altered.

SQL> -- run a sample query, with star transformations disabled for the b-tree indexed fact table

SQL> select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
  2  from   sales_no_star s, products_star p, customers_star c
  3  where  s.prod_id = p.prod_id
  4  and    s.cust_id = c.cust_id
  5  and    p.prod_subcategory_desc = 'Memory'
  6  and    c.cust_city = 'Oxford'
  7  and    c.cust_gender = 'F'
  8  group by p.prod_subcategory_desc, c.cust_gender;

Execution Plan
----------------------------------------------------------

Plan hash value: 3223803986
------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                               |     1 |    50 |   282   (1)| 00:00:04 |
|   1 |  SORT GROUP BY NOSORT            |                               |     1 |    50 |   282   (1)| 00:00:04 |
|*  2 |   HASH JOIN                      |                               |   104 |  5200 |   282   (1)| 00:00:04 |
|   3 |    TABLE ACCESS BY INDEX ROWID   | CUSTOMERS_STAR                |    29 |   406 |     9   (0)| 00:00:01 |
|   4 |     BITMAP CONVERSION TO ROWIDS  |                               |       |       |            |          |
|   5 |      BITMAP AND                  |                               |       |       |            |          |
|*  6 |       BITMAP INDEX SINGLE VALUE  | CUSTOMERS_STAR_CITY_BIX       |       |       |            |          |
|*  7 |       BITMAP INDEX SINGLE VALUE  | CUSTOMERS_STAR_GENDER_BIX     |       |       |            |          |
|   8 |    NESTED LOOPS                  |                               |       |       |            |          |
|   9 |     NESTED LOOPS                 |                               | 25523 |   897K|   272   (0)| 00:00:04 |
|  10 |      TABLE ACCESS BY INDEX ROWID | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|  11 |       BITMAP CONVERSION TO ROWIDS|                               |       |       |            |          |
|* 12 |        BITMAP INDEX SINGLE VALUE | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|* 13 |      INDEX RANGE SCAN            | SALES_NO_STAR_PROD_BIX        | 12762 |       |    26   (0)| 00:00:01 |
|  14 |     TABLE ACCESS BY INDEX ROWID  | SALES_NO_STAR                 | 12762 |   249K|   135   (0)| 00:00:02 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("S"."CUST_ID"="C"."CUST_ID")
   6 - access("C"."CUST_CITY"='Oxford')
   7 - access("C"."CUST_GENDER"='F')
  12 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  13 - access("S"."PROD_ID"="P"."PROD_ID")

Statistics
----------------------------------------------------------

        139  recursive calls
          0  db block gets
        511  consistent gets
          0  physical reads
          0  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
So if we read through the execution plan, starting at the right-most, top-most line (line 10), we can see the following steps happening:
  • The PRODUCTS_STAR table is accessed via a bitmap index on the subcategory column, with two rows being estimated as being returned (steps 10,11 and 12)
  • The filtered rows from the PRODUCTS_STAR table are then nested loop joined to the SALES_NO_STAR table, with the PRODUCTS_STAR table driving (steps 8,9, 13 and 14)
  • Around 25,000 rows are returned from this join and are then hash joined to the CUSTOMERS_STAR table (steps 2 to 7), returning around 211 rows
  • This data is then aggregated to give us the final results.
So the overall costs of this execution plan is around 282, around 897k or so of data is being accessed, two joins take place with the first one joining a two row dataset with a 12000 row dataset to return 25000 rows, with the second one joining this 25000 row dataset to a 58 row dataset. So that's a lot of rows moving around.

A variation on this is when we still have star transformations disabled, but we change the b-tree indexes on the fact table to bitmap indexes, in preparation for running a full star transformation. In this case, the execution plan looks a little different, it's still not a star transformation but what it looks like is something not unlike an old-fashioned star join, where the two dimension tables are cartesian joined together before the results are then nested loop joined to the fact table.

SQL> -- now run it for the bitmap indexed fact table

SQL> select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
  2  from   sales_star s, products_star p, customers_star c
  3  where  s.prod_id = p.prod_id
  4  and    s.cust_id = c.cust_id
  5  and    p.prod_subcategory_desc = 'Memory'
  6  and    c.cust_city = 'Oxford'
  7  and    c.cust_gender = 'F'
  8  group by p.prod_subcategory_desc, c.cust_gender;

Execution Plan
----------------------------------------------------------

Plan hash value: 3116684012
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                               |     1 |    50 |   205   (0)| 00:00:03 |
|   1 |  SORT GROUP BY NOSORT             |                               |     1 |    50 |   205   (0)| 00:00:03 |
|   2 |   NESTED LOOPS                    |                               |       |       |            |          |
|   3 |    NESTED LOOPS                   |                               |   104 |  5200 |   205   (0)| 00:00:03 |
|   4 |     MERGE JOIN CARTESIAN          |                               |    58 |  1740 |    23   (5)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID  | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|   6 |       BITMAP CONVERSION TO ROWIDS |                               |       |       |            |          |
|*  7 |        BITMAP INDEX SINGLE VALUE  | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|   8 |      BUFFER SORT                  |                               |    29 |   406 |    21   (0)| 00:00:01 |
|   9 |       TABLE ACCESS BY INDEX ROWID | CUSTOMERS_STAR                |    29 |   406 |    23   (5)| 00:00:01 |
|  10 |        BITMAP CONVERSION TO ROWIDS|                               |       |       |            |          |
|  11 |         BITMAP AND                |                               |       |       |            |          |
|* 12 |          BITMAP INDEX SINGLE VALUE| CUSTOMERS_STAR_CITY_BIX       |       |       |            |          |
|* 13 |          BITMAP INDEX SINGLE VALUE| CUSTOMERS_STAR_GENDER_BIX     |       |       |            |          |
|  14 |     BITMAP CONVERSION TO ROWIDS   |                               |       |       |            |          |
|  15 |      BITMAP AND                   |                               |       |       |            |          |
|* 16 |       BITMAP INDEX SINGLE VALUE   | SALES_STAR_CUST_BIX           |       |       |            |          |
|* 17 |       BITMAP INDEX SINGLE VALUE   | SALES_STAR_PROD_BIX           |       |       |            |          |
|  18 |    TABLE ACCESS BY INDEX ROWID    | SALES_STAR                    |     2 |    40 |   205   (0)| 00:00:03 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   7 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  12 - access("C"."CUST_CITY"='Oxford')
  13 - access("C"."CUST_GENDER"='F')
  16 - access("S"."CUST_ID"="C"."CUST_ID")
  17 - access("S"."PROD_ID"="P"."PROD_ID")

Statistics
----------------------------------------------------------

        139  recursive calls
          0  db block gets
        353  consistent gets
         36  physical reads
          0  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
Now if you look at the rows, bytes and cost columns in the execution plan, the cost of the plan is about 39% lower than the previous one, with the amount of bytes being scanned looking considerably less than the previous example. Step 4 in the execution plan shows a MERGE JOIN CARTESIAN that to me looks like it's cartesian joining the two dimension tables. Turning to Page 258 of Jonathan Lewis' Cost-Based Oracle Fundamentals I can see this is the old behavior of Oracle before star transformations were available, and could well be used when multi-column indexes (or apparently, multiple bitmap indexes) are available on a fact table and the cartesian product of the dimension tables isn't too large. Whichever way, it's one other way of combining fact and dimension tables when star transformations aren't available, and it could well be what you see if you've set things up for star transformations but not actually turned them on (see this other article from the blog where we can also see them occurring), but they're not really what we're after.

Now if we turn on star transformations, here's the execution plan that we'll get:

SQL> -- now run the same query again, this time with star transformations enabled

SQL> alter session set star_transformation_enabled='true';
Session altered.

SQL> select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
  2  from   sales_star s, products_star p, customers_star c
  3  where  s.prod_id = p.prod_id
  4  and    s.cust_id = c.cust_id
  5  and    p.prod_subcategory_desc = 'Memory'
  6  and    c.cust_city = 'Oxford'
  7  and    c.cust_gender = 'F'
  8  group by p.prod_subcategory_desc, c.cust_gender;

Execution Plan
----------------------------------------------------------
Plan hash value: 2728756496

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                               |     1 |    41 |    27   (4)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION            |                               |       |       |            |          |
|   2 |   LOAD AS SELECT                      | SYS_TEMP_0FD9D660C_56C9C5     |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID        | CUSTOMERS_STAR                |    29 |   406 |     9   (0)| 00:00:01 |
|   4 |     BITMAP CONVERSION TO ROWIDS       |                               |       |       |            |          |
|   5 |      BITMAP AND                       |                               |       |       |            |          |
|*  6 |       BITMAP INDEX SINGLE VALUE       | CUSTOMERS_STAR_CITY_BIX       |       |       |            |          |
|*  7 |       BITMAP INDEX SINGLE VALUE       | CUSTOMERS_STAR_GENDER_BIX     |       |       |            |          |
|   8 |   HASH GROUP BY                       |                               |     1 |    41 |    18   (6)| 00:00:01 |
|*  9 |    HASH JOIN                          |                               |     1 |    41 |    18   (6)| 00:00:01 |
|* 10 |     HASH JOIN                         |                               |     1 |    36 |    15   (0)| 00:00:01 |
|  11 |      TABLE ACCESS BY INDEX ROWID      | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|  12 |       BITMAP CONVERSION TO ROWIDS     |                               |       |       |            |          |
|* 13 |        BITMAP INDEX SINGLE VALUE      | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|  14 |      TABLE ACCESS BY INDEX ROWID      | SALES_STAR                    |    13 |   260 |    13   (0)| 00:00:01 |
|  15 |       BITMAP CONVERSION TO ROWIDS     |                               |       |       |            |          |
|  16 |        BITMAP AND                     |                               |       |       |            |          |
|  17 |         BITMAP MERGE                  |                               |       |       |            |          |
|  18 |          BITMAP KEY ITERATION         |                               |       |       |            |          |
|  19 |           TABLE ACCESS FULL           | SYS_TEMP_0FD9D660C_56C9C5     |     1 |    13 |     2   (0)| 00:00:01 |
|* 20 |           BITMAP INDEX RANGE SCAN     | SALES_STAR_CUST_BIX           |       |       |            |          |
|  21 |         BITMAP MERGE                  |                               |       |       |            |          |
|  22 |          BITMAP KEY ITERATION         |                               |       |       |            |          |
|  23 |           TABLE ACCESS BY INDEX ROWID | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|  24 |            BITMAP CONVERSION TO ROWIDS|                               |       |       |            |          |
|* 25 |             BITMAP INDEX SINGLE VALUE | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|* 26 |           BITMAP INDEX RANGE SCAN     | SALES_STAR_PROD_BIX           |       |       |            |          |
|  27 |     TABLE ACCESS FULL                 | SYS_TEMP_0FD9D660C_56C9C5     |    29 |   145 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("C"."CUST_CITY"='Oxford')
   7 - access("C"."CUST_GENDER"='F')
   9 - access("S"."CUST_ID"="C0")
  10 - access("S"."PROD_ID"="P"."PROD_ID")
  13 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  20 - access("S"."CUST_ID"="C0")
  25 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  26 - access("S"."PROD_ID"="P"."PROD_ID")

Note
-----
   - star transformation used for this statement

Statistics
----------------------------------------------------------
        127  recursive calls
         10  db block gets
        221  consistent gets
          1  physical reads
       1464  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
So what's happening here then? Well for a start, the cost of the plan has gone down to 41, from the 205 and 282 shown by the previous two execution plans. The largest number of rows we ever process is 29, with a maximum of 406 bytes of data compared to 897k of data in on of the other execution plans. Just looking at this plan compared to the other two, it certainly looks a lot more efficient to the other two, but what exactly is going on when the star transformation happens? Well, one complicating factor in the execution plan is the presence of a temporary table, so I'll set the star transformation parameter to "temp_disable" to remove this, and the execution plan should be a bit easier to read:
SQL> -- show the execution plan with temporary table disabled

SQL> alter session set star_transformation_enabled='temp_disable';
Session altered.

SQL> select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
  2  from   sales_star s, products_star p, customers_star c
  3  where  s.prod_id = p.prod_id
  4  and    s.cust_id = c.cust_id
  5  and    p.prod_subcategory_desc = 'Memory'
  6  and    c.cust_city = 'Oxford'
  7  and    c.cust_gender = 'F'
  8  group by p.prod_subcategory_desc, c.cust_gender;

Execution Plan
----------------------------------------------------------
Plan hash value: 2728442436

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                               |     1 |    50 |    25   (4)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT                |                               |     1 |    50 |    25   (4)| 00:00:01 |
|*  2 |   HASH JOIN                          |                               |     1 |    50 |    25   (4)| 00:00:01 |
|*  3 |    HASH JOIN                         |                               |     1 |    36 |    15   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID      | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                               |       |       |            |          |
|*  6 |       BITMAP INDEX SINGLE VALUE      | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|   7 |     TABLE ACCESS BY INDEX ROWID      | SALES_STAR                    |    13 |   260 |    13   (0)| 00:00:01 |
|   8 |      BITMAP CONVERSION TO ROWIDS     |                               |       |       |            |          |
|   9 |       BITMAP AND                     |                               |       |       |            |          |
|  10 |        BITMAP MERGE                  |                               |       |       |            |          |
|  11 |         BITMAP KEY ITERATION         |                               |       |       |            |          |
|  12 |          TABLE ACCESS BY INDEX ROWID | CUSTOMERS_STAR                |    29 |   406 |     9   (0)| 00:00:01 |
|  13 |           BITMAP CONVERSION TO ROWIDS|                               |       |       |            |          |
|  14 |            BITMAP AND                |                               |       |       |            |          |
|* 15 |             BITMAP INDEX SINGLE VALUE| CUSTOMERS_STAR_CITY_BIX       |       |       |            |          |
|* 16 |             BITMAP INDEX SINGLE VALUE| CUSTOMERS_STAR_GENDER_BIX     |       |       |            |          |
|* 17 |          BITMAP INDEX RANGE SCAN     | SALES_STAR_CUST_BIX           |       |       |            |          |
|  18 |        BITMAP MERGE                  |                               |       |       |            |          |
|  19 |         BITMAP KEY ITERATION         |                               |       |       |            |          |
|  20 |          TABLE ACCESS BY INDEX ROWID | PRODUCTS_STAR                 |     2 |    32 |     2   (0)| 00:00:01 |
|  21 |           BITMAP CONVERSION TO ROWIDS|                               |       |       |            |          |
|* 22 |            BITMAP INDEX SINGLE VALUE | PRODUCTS_STAR_SUBCATEGORY_BIX |       |       |            |          |
|* 23 |          BITMAP INDEX RANGE SCAN     | SALES_STAR_PROD_BIX           |       |       |            |          |
|  24 |    TABLE ACCESS BY INDEX ROWID       | CUSTOMERS_STAR                |    29 |   406 |     9   (0)| 00:00:01 |
|  25 |     BITMAP CONVERSION TO ROWIDS      |                               |       |       |            |          |
|  26 |      BITMAP AND                      |                               |       |       |            |          |
|* 27 |       BITMAP INDEX SINGLE VALUE      | CUSTOMERS_STAR_CITY_BIX       |       |       |            |          |
|* 28 |       BITMAP INDEX SINGLE VALUE      | CUSTOMERS_STAR_GENDER_BIX     |       |       |            |          |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("S"."CUST_ID"="C"."CUST_ID")
   3 - access("S"."PROD_ID"="P"."PROD_ID")
   6 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  15 - access("C"."CUST_CITY"='Oxford')
  16 - access("C"."CUST_GENDER"='F')
  17 - access("S"."CUST_ID"="C"."CUST_ID")
  22 - access("P"."PROD_SUBCATEGORY_DESC"='Memory')
  23 - access("S"."PROD_ID"="P"."PROD_ID")
  27 - access("C"."CUST_CITY"='Oxford')
  28 - access("C"."CUST_GENDER"='F')

Note
-----
   - star transformation used for this statement

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        276  consistent gets
          0  physical reads
          0  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
OK, so that's a bit clearer. So what's going on then? Well, the first thing to understand about a star transformation is that it's not a new kind of join, it's more of a join strategy. When Oracle is allowed to consider star transformations and your tables are set up correctly (bitmap indexes on the fact table dimension key columns, constraints declared between the tables, cost-based optimizer used and so on), what happens is that your query is rewritten from, in my example, something like this:
select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
from   sales_star s, products_star p, customers_star c
where  s.prod_id = p.prod_id
and    s.cust_id = c.cust_id
and    p.prod_subcategory_desc = 'Memory'
and    c.cust_city = 'Oxford'
and    c.cust_gender = 'F'
group by p.prod_subcategory_desc, c.cust_gender;
to this:
select sum(quantity_sold), p.prod_subcategory_desc, c.cust_gender
from   sales_star s, products_star p, customers_star c
where  s.prod_id in (select pr.prod_id
                     from products pr
                     where pr.prod_subcategory_desc = 'Memory')
and    s.cust_id in (select cu.cust_id
                     from customers cu
                     where cu.cust_city = 'Oxford'
                     and cu.cust_gender = 'F')
and    s.prod_id = p.prod_id
and    s.cust_id = c.cust_id
group by p.prod_subcategory_desc, c.cust_gender;
Now this functionally returns the same data as the previous version of the query, but crucially it allows the cost-based optimizer to execute the query in the following way:
  1. Firstly, the dimension key values from the dimension tables are located for the filters specified, and these are then used to locate the relevant fact table rows (per dimension) using bitmap indexes on the fact table dimension columns. In the fourth execution plan, this happens in steps 10 through to 23.

Startransform1

2. Secondly, these fact table bitmap index references are then logically ANDed to locate the particular fact table rows that satisfy all dimension filters. This happens in steps 8 and 9 of the fourth execution plan.

Startransform2

3. Next, the remaining fact table rows are joined back, using a hash join, to the dimension tables in order to retrieve the requested dimension column values. This happens in steps 2 through to 6, and 24 through to 28 in the fourth execution plan.

Startransform3

4. Finally, the returned data is grouped and aggregated in step 1 of the fourth execution plan.

Startransform4

The third execution plan, generated when star_transformation_enabled = "true", uses a global temporary table to help part of the plan execution, leading to a 20% reduction again in the cost of the plan. As this approach writes the temporary table to disk you might want to disable this if disk I/O is an issue.

To my mind, the crucial thing to be aware of in the star transformation join strategy is the way that the bitmap indexes on the fact table are used. By creating these subqueries in the rewriten version of the SQL statement, Oracle can identify which rows in the individual fact table bitmap indexes are of interest, and the nature of bitmap indexes means that they can then be logically ANDed, OR'd and so on very quickly to identify exactly which fact table rows are of interest to us, just Oracle does when it filters a dimension table on more than one bitmap indexed column. Once the fact table has been filtered in this way it's then much cheaper to join the result to the individual dimension tables, and this is why a star transformation can be much more efficient than a series of nested loop joins between the fact table and the dimension tables - you end up filtering the fact table much earlier on and therefore end up shifting much less data around to answer your query.

So, to me that answers the questions "what does a star transformation actually do?" (it rewrites your query so that the fact table can be filtered using a logical AND of the bitmap indexed dimension key columns), and it also explains what's going on in a star transformation explain plan compared to a regular explain plan. One question it didn't answer though was when Oracle started adding the "star transformation used for this statement" message to explain plans - looking back at my previous examples, I started seeing it added to explain plans back from Oracle 10gR1, I'm not sure if it appeared in 9iR2 or earlier. I'm also not sure how accurate it is - I'm sure I've seen explain plans in 10gR2 that looked very much like a star transformation had taken place, but this message didn't appear as part of the plan output (I used DBMS_XPLAN.DISPLAY if I remember correctly). If anyone knows any more about this feature and when it started appearing, then let me know.