Transcend and Index Maintenance

Before joining up with the guys across the pond, I ran my own consulting company based in Atlanta called Transcendent Data. With each new data warehouse I built, I realized I was engineering the same processes over and over again with each client. So I developed a framework of best-practices for ETL development, encompassing such things as auditing and logging, index and constraint maintenance, complex load scenarios, slowly-changing dimensions (SCD's), and other things. The product is called Transcend, and now we've decided to start offering it here at Rittman Mead, so I'll be blogging about some of the functionality from time-to-time. It's written entirely in PL/SQL, object-relational types, and Java stored procedures, so it installs completely within the database, supporting versions 10gR2 and forward.

Though the most requested aspect of Transcend is it's support for loading SCD's in set-based mode, including combinations of Type 1 and Type 2 attributes in the same table, this aspect depends on a lot of core features in the product, so it's prudent that I demonstrate some of these core features first, and then build up to the more advanced features in future postings. So the first thing I'd like to demonstrate is Transcend's support for index maintenance in an ETL context. I'll also use some of Transcend's other features in setting up the test case.

First, I'll build a table just like the SH.SALES table, but I'll create it in another schema. I'll include the partitioning information, the indexes, and all the rows. The combination of parameters below dictates which table properties to include in the cloning process:

SQL> BEGIN
  2     trans_etl.build_table(
  3                            p_table          => 'sales_fact',
  4                            p_owner          => 'target',
  5                            p_source_table   => 'sales',
  6                            p_source_owner   => 'sh',
  7                            p_tablespace     => 'users',
  8                            p_partitioning   => 'yes',
  9                            p_rows           => 'yes',
 10                            p_indexes        => 'yes',
 11                            p_constraints    => 'no',
 12                            p_statistics     => 'transfer'
 13                          );
 14  END;
 15  /
Table TARGET.SALES_FACT created
Number of records inserted into TARGET.SALES_FACT: 918843
Statistics from SH.SALES transfered to TARGET.SALES_FACT
Index SALES_FACT_CHANNEL_BIX built
Index SALES_FACT_CUST_BIX built
Index SALES_FACT_PROD_BIX built
Index SALES_FACT_PROMO_BIX built
Index SALES_FACT_TIME_BIX built
5 index creation processes executed for TARGET.SALES_FACT

PL/SQL procedure successfully completed.

Elapsed: 00:00:14.92
SQL>

I also need a version of the table without the rows, the indexes, or the partitioning information. Then, I'll insert only the rows from the SH.SALES table from the year 1998:

SQL> BEGIN
  2     trans_etl.build_table(
  3                            p_table          => 'sales_stg',
  4                            p_owner          => 'target',
  5                            p_source_table   => 'sales',
  6                            p_source_owner   => 'sh',
  7                            p_tablespace     => 'users',
  8                            p_partitioning   => 'no',
  9                            p_rows           => 'no',
 10                            p_indexes        => 'no',
 11                            p_constraints    => 'no',
 12                            p_statistics     => 'transfer'
 13                          );
 14  END;
 15  /
Table TARGET.SALES_STG created
Statistics from SH.SALES transfered to TARGET.SALES_STG

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.77
SQL> insert into target.sales_stg select * from sh.sales where to_char(time_id,'yyyy') = '1998';

178834 rows created.

Elapsed: 00:00:00.95
SQL> commit;

Commit complete.

Elapsed: 00:00:00.04
SQL>

With these two tables set up, I'll demonstrate some of the options Transcend provides for index maintenance. First is the ability to mark indexes on a particular table unusable by a variety of the attributes. First, I'll mark all bitmaps unusable, and then I'll rebuild them:

SQL> BEGIN
  2     trans_etl.unusable_indexes(
  3                                 p_table          => 'sales_fact',
  4                                 p_owner          => 'target',
  5                                 p_index_type     => 'bitmap'
  6                          );
  7  END;
  8  /
5 indexes and 0 local index partitions affected on table TARGET.SALES_FACT

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.07
SQL>
SQL> BEGIN
  2     trans_etl.usable_indexes(
  3                               p_table          => 'sales_fact',
  4                               p_owner          => 'target'
  5                          );
  6  END;
  7  /
Rebuild processes for unusable indexes on 28 partitions of table TARGET.SALES_FACT executed
No matching unusable global indexes found

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.77
SQL>
SQL>

Now, I'll mark indexes unusable that match a particular regular expression:

SQL> BEGIN
  2     trans_etl.unusable_indexes(
  3                                 p_table          => 'sales_fact',
  4                                 p_owner          => 'target',
  5                                 p_index_regexp   => 'prod'
  6                          );
  7  END;
  8  /
1 index and 0 local index partitions affected on table TARGET.SALES_FACT

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.25
SQL>
SQL> BEGIN
  2     trans_etl.usable_indexes(
  3                               p_table          => 'sales_fact',
  4                               p_owner          => 'target'
  5                          );
  6  END;
  7  /
Rebuild processes for unusable indexes on 28 partitions of table TARGET.SALES_FACT executed
No matching unusable global indexes found

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.69
SQL>
SQL>

Now, I just want to mark bitmaps unusable for a particular partition: SALES_Q4_2003. I also want to change up the rebuild of the indexes. Instead of rebuilding the indexes one after another... I'd like to have them all rebuild at the same time. Thankfully, Transcend supports this, by sending the rebuild statements to the Oracle Scheduler, DBMS_SCHEDULER. The TRANS_ETL package will wait for the conclusion of all rebuild processes before continuing. All of this is done by simply passing a value of 'yes' to the P_CONCURRENT parameter.

SQL> BEGIN
  2     trans_etl.unusable_indexes(
  3                                 p_table          => 'sales_fact',
  4                                 p_owner          => 'target',
  5                                 p_partname       => 'sales_q4_2003'
  6                          );
  7  END;
  8  /
0 indexes and 5 local index partitions affected on table TARGET.SALES_FACT

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.67
SQL>
SQL> BEGIN
  2     trans_etl.usable_indexes(
  3                               p_table          => 'sales_fact',
  4                               p_owner          => 'target',
  5                               p_concurrent     => 'yes'
  6                          );
  7  END;
  8  /
Oracle scheduler job USABLE_INDEXES61 created
Oracle scheduler job USABLE_INDEXES61 enabled
Rebuild processes for unusable indexes on 1 partition of table TARGET.SALES_FACT submitted to the Oracle scheduler
No matching unusable global indexes found

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.19
SQL>
SQL>

Finally, the most complicated bit in index maintenance... and the reason I built the staging table with only the rows from 1998 in it. When loading a fact table, we rarely want to affect all the local index partitions on that table. As a matter of fact, we usually want to mark unusable only a very small number of the local index partitions, and this is usually dependent on which rows we are loading into the fact table.

Transcend supports this notion by allowing the specification of a particular table or view, using P_SOURCE_OWNER and P_SOURCE_OBJECT, to determines which partitions to mark as unusable on the target table. Remember that the SALES_STG table contains only the rows from the SH.SALES table for 1998. First, I'll turn up the logging level for Transcend slightly so we can see the actual DDL being generated, and then I'll affect the local index partitions on SALES_FACT that correspond to the rows in SALES_STG:

SQL> BEGIN
  2
  3     trans_adm.set_module_conf(
  4                                p_logging_level=> 3
  5                              );
  6
  7  END;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> BEGIN
  2
  3     trans_etl.unusable_indexes(
  4                                 p_table          => 'sales_fact',
  5                                 p_owner          => 'target',
  6                                 p_source_object   => 'sales_stg',
  7                                 p_source_owner   => 'target'
  8                               );
  9
 10  END;
 11  /
SQL: alter index TARGET.SALES_FACT_CHANNEL_BIX modify partition SALES_Q1_1998 unusable
SQL: alter index TARGET.SALES_FACT_CHANNEL_BIX modify partition SALES_Q2_1998 unusable
SQL: alter index TARGET.SALES_FACT_CHANNEL_BIX modify partition SALES_Q3_1998 unusable
SQL: alter index TARGET.SALES_FACT_CHANNEL_BIX modify partition SALES_Q4_1998 unusable
SQL: alter index TARGET.SALES_FACT_CUST_BIX modify partition SALES_Q1_1998 unusable
SQL: alter index TARGET.SALES_FACT_CUST_BIX modify partition SALES_Q2_1998 unusable
SQL: alter index TARGET.SALES_FACT_CUST_BIX modify partition SALES_Q3_1998 unusable
SQL: alter index TARGET.SALES_FACT_CUST_BIX modify partition SALES_Q4_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROD_BIX modify partition SALES_Q1_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROD_BIX modify partition SALES_Q2_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROD_BIX modify partition SALES_Q3_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROD_BIX modify partition SALES_Q4_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROMO_BIX modify partition SALES_Q1_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROMO_BIX modify partition SALES_Q2_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROMO_BIX modify partition SALES_Q3_1998 unusable
SQL: alter index TARGET.SALES_FACT_PROMO_BIX modify partition SALES_Q4_1998 unusable
SQL: alter index TARGET.SALES_FACT_TIME_BIX modify partition SALES_Q1_1998 unusable
SQL: alter index TARGET.SALES_FACT_TIME_BIX modify partition SALES_Q2_1998 unusable
SQL: alter index TARGET.SALES_FACT_TIME_BIX modify partition SALES_Q3_1998 unusable
SQL: alter index TARGET.SALES_FACT_TIME_BIX modify partition SALES_Q4_1998 unusable
0 indexes and 20 local index partitions affected on table TARGET.SALES_FACT

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.65
SQL>
SQL> BEGIN
  2
  3     trans_etl.usable_indexes(
  4                               p_table          => 'sales_fact',
  5                               p_owner          => 'target',
  6                               p_concurrent     => 'yes'
  7                          );
  8  END;
  9  /
SQL: alter table TARGET.SALES_FACT modify partition SALES_Q1_1998 rebuild unusable local indexes
Oracle scheduler job USABLE_INDEXES62 created
Oracle scheduler job USABLE_INDEXES62 enabled
SQL: alter table TARGET.SALES_FACT modify partition SALES_Q2_1998 rebuild unusable local indexes
Oracle scheduler job USABLE_INDEXES63 created
Oracle scheduler job USABLE_INDEXES63 enabled
SQL: alter table TARGET.SALES_FACT modify partition SALES_Q3_1998 rebuild unusable local indexes
Oracle scheduler job USABLE_INDEXES64 created
Oracle scheduler job USABLE_INDEXES64 enabled
SQL: alter table TARGET.SALES_FACT modify partition SALES_Q4_1998 rebuild unusable local indexes
Oracle scheduler job USABLE_INDEXES65 created
Oracle scheduler job USABLE_INDEXES65 enabled
Rebuild processes for unusable indexes on 4 partitions of table TARGET.SALES_FACT submitted to the Oracle scheduler
No matching unusable global indexes found

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.88
SQL>

That's pretty much it for unusable and usable index functionality. Transcend also supports cloning indexes and dropping indexes, with many of the same parameters mentioned above.