Deploying ODI 11g Agents for High-Availability and Load-Balancing

A couple of weeks ago I was with one of our clients, who were planning their migration from home-grown ETL to Oracle Data Integrator 11g. One of their requirements was for the agents that ODI uses to be deployed in a high-availability environment, so that for example:

  • More than one agent is running at any one time, to spread the load and ensure that at least one is always available to run ETL jobs
  • If an agent is scheduled to run an ETL job but isn't available, another one can run the job instead
  • Their Oracle RAC database is fully used, so that if one database node goes down, the other can still provide the repository and be the target for ETL jobs

ODI for some time has had a basic load-balancing capability, so that for example you can register two or more agents as part of a load-balancing group, like this:

[Image: agents registered as part of a load-balancing group]

But load-balancing isn't high-availability, and if you submit an agent execution request to a standalone agent that's part of such a load-balancing group, but that agent isn't actually running, then the request will fail. In addition, if a particular standalone agent has a scheduled ETL job associated with it, but the agent is down when it's time to run the job, the job just won't run as there's no in-built method for another standalone agent to be aware of this scheduled job.
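The difference between the two behaviours can be sketched as a toy model. This is only an illustration and not ODI's actual implementation: the `Agent` class, the `submit` and `submit_with_failover` functions, and the agent and job names are all hypothetical. The first function mimics a request sent to one specific agent, which fails if that agent is down; the second shows the failover a load-balancing group alone does not give you.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    name: str
    running: bool = True
    sessions: int = 0  # sessions currently executing on this agent


class AgentUnavailable(Exception):
    """Raised when a request reaches an agent that is not running."""


def submit(agent: Agent, job: str) -> str:
    """Send a job to one specific agent; fails if that agent is down."""
    if not agent.running:
        raise AgentUnavailable(f"{agent.name} is not running, {job} not started")
    agent.sessions += 1
    return agent.name


def submit_with_failover(agents: List[Agent], job: str) -> str:
    """Try the least-loaded agent first, then fall back to the others."""
    for agent in sorted(agents, key=lambda a: a.sessions):
        try:
            return submit(agent, job)
        except AgentUnavailable:
            continue  # this agent is down, try the next one
    raise AgentUnavailable(f"no agent available for {job}")


# Hypothetical two-agent group in which the first agent is down.
agents = [Agent("AGENT_1", running=False), Agent("AGENT_2")]
```

Calling `submit(agents[0], "LOAD_DW")` raises `AgentUnavailable`, which is the standalone-agent situation described above, whereas `submit_with_failover(agents, "LOAD_DW")` skips the stopped agent and runs the job on the other one.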

Where this also gets interesting is when the database that's the host for the ODI repositories is a RAC database, or potentially the target database for the DW load is a RAC database. Where do you install the agents (one per RAC node?) and can we, for example, use the RAC VIP (virtual IP address) to reference the two ODI agents instead of using this load balancing group, so that the RAC VIP gives us our load balancing, and potentially high-availability?

Looking at standalone agents first, it's clear that these are designed really to support load-balancing, but not high-availability (of the agents). When you connect agents in a load-balancing group to a repository held on a RAC database, you make the connection using the special RAC connection syntax, and the repository (and the RAC database) is treated as a single logical object, regardless of the fact that there are two physical database nodes in the background. But the standalone agents themselves don't handle failover when a scheduled job fails to execute or fails half-way through, and one can't substitute for the other if an execute request comes through to a non-running agent. To handle this situation, Oracle introduced the Java EE (JEE) agent with ODI 11g, which runs in the WebLogic application server rather than standalone on your server.
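For reference, the RAC connection syntax mentioned above is usually an Oracle Net connect descriptor embedded in a thin-driver JDBC URL, listing every node and addressing the database by service name. The sketch below just assembles such a string; the helper name `rac_jdbc_url`, the host names, the port and the service name are made up for illustration, and the exact options you need (for example `LOAD_BALANCE` and `FAILOVER`) depend on your own RAC setup.

```python
from typing import List, Tuple


def rac_jdbc_url(nodes: List[Tuple[str, int]], service_name: str) -> str:
    """Build a thin-driver JDBC URL that lists every RAC node."""
    # One ADDRESS entry per RAC node listener.
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"
        for host, port in nodes
    )
    return (
        "jdbc:oracle:thin:@(DESCRIPTION="
        "(LOAD_BALANCE=ON)(FAILOVER=ON)"
        f"{addresses}"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


# Hypothetical two-node cluster and repository service name.
url = rac_jdbc_url([("racnode1", 1521), ("racnode2", 1521)], "odirepo")
print(url)
```

Because the URL names the service rather than a single instance, the repository is addressed as one logical database whichever node ends up serving the connection.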

This blog post by Sachin Thatte, the development manager for the ODI agent, describes how ODI agents can be deployed within WebLogic to provide high-availability, and includes the diagram below showing the ODI HA topology.

[Image: ODI HA topology diagram]

I described back in 2010 how ODI 11g now made use of WebLogic Server to provide JEE agents, and supported web-based applications such as Enterprise Manager and ODI Console. At the time it appeared that ODI was just doing the sensible thing and putting agents in a proper JEE application server, but the agents that you deploy in WebLogic have a number of improvements over standard, standalone agents so that:

  • They have fault-handling options to handle failed ad-hoc and scheduled jobs
  • They make use of WebLogic's clustered managed server approach, and can migrate jobs from agent to agent if a managed server goes down

In addition to the JEE agents running on clustered WebLogic Servers, you would deploy Oracle HTTP Server (OHS) in front of the managed servers, with a load-balancer in front of those to hand off incoming execution requests to the appropriate managed server. Your repositories would sit on a RAC database (typically with Data Guard in the background), so that you've got some protection from database nodes going down.

For more information on ODI agent high-availability using WebLogic Server, there are a couple of useful resources you can take a look at:

We also cover agents, WebLogic and load-balancing/high-availability on our ODI 11g course, if you're interested - just drop us an email for details.