High quality Business Intelligence is key in decision making for any successful organisation and for an increasing number of businesses this means being able to access real-time data. At Rittman Mead we are seeing a big upturn in interest in technologies that will help our customers to integrate real-time data into their BI systems. In a world where streaming is revolutionising the way we consume data, log-based Change Data Capture (CDC) comes into the picture as a way of providing access to real-time transactional data for reporting and analysis. Most databases support CDC nowadays; in Oracle for example the redo logs are a potential source of CDC data. Integration tools that support CDC are coming to the fore and one of these is HVR.
HVR is a real-time data integration solution that replicates data to and from many different data management technologies, both on-premise and cloud-based, including Oracle, SAP and Amazon. The HVR GUI makes it easy to connect together multiple databases, replicate entire schemas, initialise the source and target systems and then keep them in sync.
HVR have recently added Oracle’s Autonomous Data Warehouse (ADW) to their list of supported technologies and so in this blog we are stepping through the process of configuring an HVR Channel to replicate data from an Oracle database to an instance of Oracle ADW.
Before setting up replication you have to install HVR itself. This is simple enough, a fairly manual CLI job with a couple of files to create and save in the correct directories. Firewalls also needs to allow all HVR connections. HVR needs a database schema in which to store the repository configuration data and so we created a schema in the source Oracle database. It also needs some additional access rights on the Oracle source database.
The first step is to register the newly created Hub in the HVR GUI. The GUI can be run on any machine that is able to connect to the server on which the HVR hub is installed. We tested two GUI instances, one running on a Windows machine and one on a MAC. Both were easy to install and configure.
The database connection details entered here are for the HVR hub database, where metadata about the hub configuration is stored.
Next we need to define our source and target. In both cases the connection between the HVR and the data uses standard Oracle database connectivity. The source connection is to a database on the same server as the HVR hub and the target connection uses a TNS connection pointing at the remote ADW instance.
Defining the source database involves right clicking on Location Configuration and selecting New Location:
Configuring the target involves the same steps:
You can see from the screenshot that we are using one of the Oracle-supplied tnsnames entries to connect to ADW and also that we are using a separate Oracle client install to connect to ADW. Some actions within HVR use the Oracle Call Interface and require a more recent version of the Oracle client than provided by our 12c database server install.
Next up is creating the “channel”. A channel channel in HVR groups together the source and target locations and allows the relationship between the two to be defined and maintained. Configuring a new channel involves naming it, defining source and target locations and then identifying the tables in the source that contain the data to be replicated.
The channel name is defined by right clicking on Channel Definitions and selecting New Channel.
We then open the new channel and right click on Location Groups and select New Group to configure the group to contain source locations:
The source location is the location we defined in step 2 above. We then right click on the newly created group and select New Action, Capture to define the role of the group in the channel:
The Capture action defines that data will be read from the locations in this group.
A second Location Group is needed for the for the target. This time we defined the target group to have the Integrate action so that data will be written to the locations in this group.
The final step in defining the channel is to identify the tables we want to replicate. This can be done using the Table Explore menu option when you right-click on Tables.:
With the channel defined we can start synchronising the data between the two systems. We are starting with an empty database schema in our ADW target so we use the HVR Refresh action to first create the target tables in ADW and to populate them with the current contents of the source tables. As the Refresh action proceeds we can monitor progress:
Now with the two systems in sync we can start the process of real-time data integration using the HVR Initialise action. This creates two new jobs in the HVR Scheduler which then need to be started:
One more thing to do of course: test that the channel is working and replications is happening in real-time. We applied a series of inserts, updates and deletes to the source system and monitored the log files for the two scheduled jobs to see the activity captured from the redo logs on the source:
and then applied as new transactions on the target:
The HVR Compare action allows us to confirm that the source and target are still in sync.
Clearly the scenario we are testing here is a simple one. HVR can do much more - supporting one-to-many, many-to-many and also bi-directional replication configurations. Nonetheless we were impressed with how easy it was to install and configure HVR and also with the simplicity of executing actions and monitoring the channel through the GUI. We dipped in to using the command line interface when executing some of the longer running jobs and this was straightforward too.