Using Data Pump and Transportable Tablespaces with OWB10gR2 “Paris”

April 10th, 2006 by Mark Rittman

One of the new features in Oracle Warehouse Builder 10gR2 "Paris" is
"Transportable Modules". Transportable Modules are source modules that get their
data from remote databases, with the data being brought into your warehouse
environment using either

transportable tablespaces
if your source and target databases are Oracle 9i
or higher, or

data pump
, if both source and target databases are 10g or higher. Note that
both source and target database have to be the same version (different releases
within the same version are OK though.)

The idea behind transportable modules is that they allow you to rapidly
replicate data between your source databases and warehouse environment, without
having to use flat files or manually set up FTP scripts. When you transport data
from an Oracle 9i schema, Oracle Warehouse Builder 10gR2 uses transportable
tablespaces to move your data, and when you’re on 10g or higher, you can either
still use transportable tablespaces, or you can use data pump, the successor to
imp/exp. So how does this work?

In this example, we’ve got a schema in a remote database called HRDATA that
contains a number of tables. We want to transport these tables into the
warehouse environment so that we can include them in the warehouse load. We
could just export them to flat files and copy them across, or even create a
database link, but in this case we want to replicate the schema into our
warehouse database and work on it from there. As both databases are 10g (the
source database is 10.1.0.2, the target is 10.2.0.1) we’ll use data pump to move
the data across.

Going into Oracle Warehouse Builder 10gR2, the first step then is to
right-click on the Transportable Modules node and select "New…"

Next, we name the module and provide a description:

The next step is to provide OWB with details on how to connect to the remote
database. The user that we connect as has to have the EXP_FULL_DATABASE and
ALTER_TABLESPACE roles – I connected as SYSTEM which had the former by default,
but I had to explicitly grant it the latter.

Note also the FTP username and password that it requires. The Transportable
Modules feature works by moving files around in the background via FTP, or at
least it does when you choose to move your data via transportable tablespaces -
when it uses data pump I expect it could just connect directly from source to
target, therefore not needing FTP, but I’m not sure if OWB implements it using
this quicker method. Also, when I first tried this, I connected as the schema
owner that contained the data, but as you have to grant EXP_FULL_DATABASE and
ALTER TABLESPACE, I thought it safer to leave the source schema with it’s
default permissions, and connect using SYSTEM instead.

Anyway, once you’ve entered the details for the source database, you can do
the same for the target. Again, I connected as SYSTEM here, as the transportable
module routine goes and creates the schema you’re going to load into anyway,
which will have the same name as the schema you’re loading from. Also, even
though I used the SYSTEM account, I had to grant a bunch of roles to the account
before I could use it as the target location:

SQL> grant connect to system;

Grant succeeded.

SQL> grant resource to system;

Grant succeeded.

SQL> grant imp_full_database to system;

Grant succeeded.

SQL> grant execute_catalog_role to system with admin option;

Grant succeeded.

SQL> grant create materialized view to system with admin option;

Grant succeeded.

SQL> grant alter tablespace to system;

Grant succeeded.

Note that the module isn’t going to load your source tables into the SYSTEM
schema, it’s just going to connect as SYSTEM when it brings data into your
target database; when you set the configuration options later on, you can define
the specific schema in your warehouse environment that you want to load data in
to.

Once you’ve set this up, the wizard looks like this:

Now that you’ve defined the source and target connections, the next step is
to pick the objects that you want to transport. The next dialog that comes up
gives you a listing of the tablespaces in the source database, and you can then
go and pick either a whole schema, or individual tables, to bring across. The
fact that your initial listing is of tablespaces is because the primary use for
this module is to facilitate transportable tablespaces, but it works just as
well if you’re going to use datapump, as we are.

Once you’ve completed all the details, OWB then goes away and
reads the source table metadata, so that you can work with it in your project.

At this point, there’s no source data in your target warehouse,
all you’ve done is define the sources and targets and brought across the table
metadata. The next step then is to deploy the module, but before you do that,
you need to set some configuration options, by right-clicking on the module
you’ve just created.

As we’re not going to use transportable tablespaces and use data
pump instead, I untick the "Transport Tablespaces" tickbox, and change "What To
Deploy" to "Tables Only" as I only want to bring across tables of data, not all
the other objects in the schema.

The second part to the configuration options lets you define
what local schema the replicated data will go into; in my case, I want to
redirect the data into a new schema called "HRDATA_SRC" which OWB will create
for me when I deploy the module.

You can of course specify a schema that’s already created, and
if I had a general "STAGING" schema I wanted to load data into, I could specify
it here. For what I’m doing though, I want to create a separate schema to hold
the replicated data, so I give it the name, password and default tablespace, set
the degree of parallelism for data pump, and press OK.

Now that the module has been defined, you can then view the
table metadata in the Design Center, like this:

… then finally, use the Control Center Manager to deploy the
module and copy the tables across.

Now, when I connect to the warehouse environment, I can use
SQL*Plus to check that the data is there.

SQL> conn src_hrdata/password@ora10g
Connected.
SQL> select table_name
  2  from   user_tables
  3  /

TABLE_NAME
------------------------------
EMPLOYEES
JOBS
JOB_HISTORY
COUNTRIES
REGIONS
DEPARTMENTS
LOCATIONS

7 rows selected.

SQL> select *
  2  from   job_history
  3  /

EMPLOYEE_ID START_DAT END_DATE  JOB_ID                         DEPARTMENT_ID
----------- --------- --------- ------------------------------ -------------
        102 13-JAN-93 24-JUL-98 IT_PROG                                   60
        101 21-SEP-89 27-OCT-93 AC_ACCOUNT                               110
        101 28-OCT-93 15-MAR-97 AC_MGR                                   110
        201 17-FEB-96 19-DEC-99 MK_REP                                    20
        114 24-MAR-98 31-DEC-99 ST_CLERK                                  50
        122 01-JAN-99 31-DEC-99 ST_CLERK                                  50
        200 17-SEP-87 17-JUN-93 AD_ASST                                   90
        176 24-MAR-98 31-DEC-98 SA_REP                                    80
        176 01-JAN-99 31-DEC-99 SA_MAN                                    80
        200 01-JUL-94 31-DEC-98 AC_ACCOUNT                                90

10 rows selected.

Just to recap then, the Transportable Module in OWB "Paris" lets
you bring in to your warehouse database, either via transportable tablespaces or
data pump, data from a remote Oracle database. The reason you’d use this, as
opposed to connecting via a database link, is that transportable tablespaces and
data pump are far quicker at bringing across large amount of data, and the data
then sits in your warehouse database, rather than you having to keep going back
and forth to the remote database.

In a few days time, I’m going to follow this up by taking a look at setting up Change Data
Capture with OWB "Paris".

Comments

  1. Nicholas Goodman Says:

    I know I’m not really “doing the Oracle thing” anymore, but ahhhh heck.
    I had kicked the tires extensively on this feature. “ETL at the speed of copy” holds a lot of promise and is just really really cool. I also liked how it could switch between Data Pump and Transportable Tablespaces.
    At one point I started drafting an article to submit to OTN until … I hit what I believe is just a bizarre stipulation.
    You can not “EXECUTE” the refresh process from OWB, Process Flows, etc. The ONLY WAY to “re-transport” the module is by REDEPLOYING it. Errrr….
    Setting up the Transportable Module is like setting up a mapping (SOURCE TAB -> DBLINK -> STAGE TAB). Now, once you have the metadata setup to REFRESH THE DATA you should just be able to EXECUTE the object. That is NOT the case with Trans Modules unfortunately. The only time data moves from source to warehouse is when you are using OMB or Control center to deploy your objects.
    In other words, if Oracle had designed mappings like they did Transportable modules you’d have to redeploy your mapping everytime you wanted to wanted to move data.
    Oracle can improve on this. Make Transportable Modules executable so that one can “execute the refresh.” Until then, someone has to sit at the Control Center at 2am to move it (or write OMBPlus).

  2. Andrew Gordon Says:

    Very nice post. Thanks.
    In our case we are moving data from Oracle OLTP into SybaseIQ. And we also wanted to avoid using flat files. We used our own scripts in Pro*C first but it was too slow for us. Our architect got Fastreader wisdomforce tool which allowed us extract data into the pipe and then other process readed from the pipe and pushed data into Sybase IQ. The strong feature of fastreader is extreme high speed while data extracted from Oracle into text csv format that SybaseIQ could read. DataPump is fast too, but data is extracted into binary only.

  3. Paul C Says:

    Wonder if anyone ou there can help ;-
    I’m trying this guide step by step, all passwords eneterd, dblinks precreated, but I get this error when I deploy

    None
    INFORMATIONAL
    ### find or create a useable dblink to source.
    None
    INFORMATIONAL
    ### step 11 begin: importing tts …
    None
    INFORMATIONAL
    ### step10 end: skipped.
    None
    INFORMATIONAL
    ### failed dblink creation.
    TARGET_MODULE
    Create Error RPE-01036: Failed in creating a database link to source database.

    any help appreciated
    PC

Website Design & Build: tymedia.co.uk