Bitmap Index Examples

Between assignments (mostly, OBIEE and BI Applications work) and a family holiday in the Austrian Alps I have been helping Mark with some of the examples he will be demonstrating as part of his 11g Data Warehousing Seminar. I must confess that I, too, will be using the examples in some of my data warehouse courses.

One of the demonstrations I have worked on for is the use of BITMAP indexes in simple (single table) queries. For this demonstration I used the standard SH schema and a couple of the provided bitmap indexes on the CUSTOMERS table; that is the GENDER (CUSTOMERS_GENDER_BIX)  and YEAR OF BIRTH (CUSTOMERS_YOB_BIX) indexes.

Firstly, let's look at the data distributions:

select  cust_gender, count(*) from customers
group by cust_gender;

C   COUNT(*)
- ----------
F      18325
M      37175

There are about 66% M customers and 33% F, and no customers of any other (or NULL) gender.
A similar query shows us that there are 75 distinct years of birth (between 1913 and 1990) and for 1917 there are 37 customers

So what happens when we query the customers table on gender = M

SQL> explain plan for select * from customers where cust_gender = 'M';
select * from table(dbms_xplan.display);

Explained.
SQL>
PLAN_TABLE_OUTPUT
-------------------------------Plan hash value: 2008213504
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 27750 |  4877K|   407   (1)
|*  1 |  TABLE ACCESS FULL| CUSTOMERS | 27750 |  4877K|   407   (1)
---------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CUST_GENDER"='M')

We do a full table scan and do not use the index.

But when we choose a more selective query - the people born in 1917 (which hits about 0.07% of the table) we see this:

SQL> explain plan for select * from customers where
CUST_YEAR_OF_BIRTH = 1917;

select * from table(dbms_xplan.display);

Explained.

SQL>
PLAN_TABLE_OUTPUT
--------------------------Plan hash value: 3460183038
---------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)
---------------------------------------------------
|   0 | SELECT STATEMENT             |                   |   740 |   130K|   146   (0)
|   1 |  TABLE ACCESS BY INDEX ROWID | CUSTOMERS         |   740 |   130K|   146   (0)
|   2 |   BITMAP CONVERSION TO ROWIDS|                   |       |       |
|*  3 |    BITMAP INDEX SINGLE VALUE | CUSTOMERS_YOB_BIX |       |       |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("CUST_YEAR_OF_BIRTH"=1917)

This time we are using the bitmap index.

So, is there any merit to using a bitmap index on a column with just two distinct values? After all, we do less work by table scanning the whole table.

The answer to this is yes. One of the great things about bitmap indexes is that they can be combined with other bitmap indexes on the same table by a bitwise 'AND' operation; that is we effectively create a multi-column index on-the-fly

SQL> explain plan for select * from customers where
CUST_YEAR_OF_BIRTH = 1917 and cust_gender = 'M';
select * from table(dbms_xplan.display);
Explained.
SQL>
PLAN_TABLE_OUTPUT
---------------------------------------------------Plan hash value: 4226934075
---------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------
|   0 | SELECT STATEMENT             |                      |   370 | 66600 |    82   (2)
|   1 |  TABLE ACCESS BY INDEX ROWID | CUSTOMERS            |   370 | 66600 |    82   (2)
|   2 |   BITMAP CONVERSION TO ROWIDS|                      |       |
|   3 |    BITMAP AND                |                      |       |
|*  4 |     BITMAP INDEX SINGLE VALUE| CUSTOMERS_YOB_BIX    |       |
|*  5 |     BITMAP INDEX SINGLE VALUE| CUSTOMERS_GENDER_BIX |       |       |           ---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("CUST_YEAR_OF_BIRTH"=1917)
5 - access("CUST_GENDER"='M')

Here you can see both indexes being combined  to access a small subset of the customer table.

Hopefully, this has been a clear example of why a low cardinality column may not be a good bitmap index choice in isolation but when combined with other bitmap indexes can be very beneficial.