OBIEE High Availability – The BI Server

December 21st, 2008 by Borkur Steingrimsson

At the beginning of the month, the majority of the Rittman Mead Consulting team was at the UKOUG conference in Birmingham. Jon Mead and I presented a paper called “OBIEE High Availability”. We discussed the different components of the Oracle BI EE stack, how they can be clustered, and some of the main features of such a setup.

The OBIEE architecture, in its simplest form, consists of the Oracle BI Server, Oracle BI Presentation Server, Oracle BI Java Host and the J2EE container that runs the web application (let’s just call that bit OC4J, though it doesn’t have to be hosted in Oracle Containers for J2EE). Behind the Oracle BI Server we might expect to find several different data sources, but clustering those is well outside the scope of this write-up.

OBIEE_High_Single_Point_of_failure.png

A regular installation always has a single point of failure. If any one server goes down or loses network connectivity, we lose all service. By making each node redundant, we reduce the risk of a single failure bringing down the whole stack.

OBIEE_High_High_available.png

Setting the BI Server to a clustered mode

The clustering of the BI Server and the cluster controller is configured in the NQSConfig.INI and NQClusterConfig.INI files, found in the ORACLEBI/server/Config/ directory. Once you have installed the BI Server on all the dedicated servers, you need to configure every node to join the cluster we are creating. One server must be chosen as the primary cluster controller. This server is responsible for receiving connection requests to the cluster and forwarding each request to one of the servers participating in the cluster. We can also configure a secondary cluster controller to act as a backup if the primary goes down. If you don’t set up a secondary cluster controller, you have introduced a single point of failure.

Here are some of the main configuration parameters that need to be set in the NQSConfig.INI file:

  • CLUSTER_PARTICIPANT = YES;
  • REPOSITORY_PUBLISHING_DIRECTORY = "/media/share/Repository";
  • REQUIRE_PUBLISHING_DIRECTORY = YES;

The first one tells the BI server to look in the NQClusterConfig.INI file for further information on how to connect to a cluster. The second parameter points to a shared directory on the network, where all BI Servers in the cluster must be able to find the common repository and write back any modifications. The third parameter tells the BI server not to start if the publishing directory cannot be found.
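
As a quick sanity check before starting a node, the three settings can be read back out of the file with a few lines of Python. This is only a sketch: the helper names (read_settings, check_cluster_settings) are made up for this example, the sample text is a tiny stand-in for a real NQSConfig.INI, and the parser only understands simple NAME = value; lines.

```python
import re

# Stand-in for the relevant lines of NQSConfig.INI; a real file holds
# many more parameters and sections.
SAMPLE = '''
CLUSTER_PARTICIPANT = YES;
REPOSITORY_PUBLISHING_DIRECTORY = "/media/share/Repository";
REQUIRE_PUBLISHING_DIRECTORY = YES;
'''

# Matches "NAME = value;" at the start of a line and captures name and value.
# Lines that start with "#" do not match, so commented-out settings are skipped.
SETTING = re.compile(r'^\s*([A-Z_]+)\s*=\s*(.*?)\s*;', re.MULTILINE)


def read_settings(text):
    """Return a dict of NAME -> value with surrounding double quotes removed."""
    return {name: value.strip('"') for name, value in SETTING.findall(text)}


def check_cluster_settings(settings):
    """List where the settings differ from the values used in this post."""
    problems = []
    if settings.get("CLUSTER_PARTICIPANT") != "YES":
        problems.append("CLUSTER_PARTICIPANT is not YES")
    if not settings.get("REPOSITORY_PUBLISHING_DIRECTORY"):
        problems.append("REPOSITORY_PUBLISHING_DIRECTORY is not set")
    if settings.get("REQUIRE_PUBLISHING_DIRECTORY") != "YES":
        problems.append("REQUIRE_PUBLISHING_DIRECTORY is not YES")
    return problems


if __name__ == "__main__":
    found = check_cluster_settings(read_settings(SAMPLE))
    print(found or "cluster settings match the values used in this post")
```

To run it against a real file, pass the file's contents to read_settings instead of SAMPLE.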

The Cluster Controller and the BI Server clustering are configured in the NQClusterConfig.INI file. The cluster controller must be enabled, and then we must list the primary and, optionally, the secondary cluster controller. Each node must also know about all the BI servers in the cluster, and which server holds the master copy of the repository.

  • ENABLE_CONTROLLER = YES;
  • PRIMARY_CONTROLLER = "aravis2.rmcvm.com";
  • SECONDARY_CONTROLLER = "aravis3.rmcvm.com";
  • SERVERS = "aravis2.rmcvm.com","aravis3.rmcvm.com";
  • MASTER_SERVER = "aravis2.rmcvm.com";
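
The relationships between these five settings can be checked mechanically. The following is a minimal sketch, not an Oracle tool: the function names are invented for this example, the parser only handles simple NAME = value; lines, and the rules it applies (a secondary controller should exist and differ from the primary, the master server should appear in SERVERS) are simply my reading of the setup described here.

```python
import re

# Stand-in for the relevant lines of NQClusterConfig.INI.
SAMPLE = '''
ENABLE_CONTROLLER = YES;
PRIMARY_CONTROLLER = "aravis2.rmcvm.com";
SECONDARY_CONTROLLER = "aravis3.rmcvm.com";
SERVERS = "aravis2.rmcvm.com","aravis3.rmcvm.com";
MASTER_SERVER = "aravis2.rmcvm.com";
'''

SETTING = re.compile(r'^\s*([A-Z_]+)\s*=\s*(.*?)\s*;', re.MULTILINE)


def read_cluster_config(text):
    """Return NAME -> list of values, splitting comma-separated lists."""
    config = {}
    for name, raw in SETTING.findall(text):
        items = [item.strip().strip('"') for item in raw.split(",")]
        config[name] = [item for item in items if item]
    return config


def check_cluster_config(config):
    """Return a list of problems; an empty list means nothing obvious is wrong."""
    problems = []
    primary = config.get("PRIMARY_CONTROLLER", [])
    secondary = config.get("SECONDARY_CONTROLLER", [])
    servers = config.get("SERVERS", [])
    master = config.get("MASTER_SERVER", [])
    if config.get("ENABLE_CONTROLLER") != ["YES"]:
        problems.append("cluster controller is not enabled on this node")
    if not primary:
        problems.append("no primary cluster controller listed")
    if not secondary:
        problems.append("no secondary cluster controller (single point of failure)")
    elif secondary == primary:
        problems.append("primary and secondary cluster controller are the same host")
    if not master or master[0] not in servers:
        problems.append("MASTER_SERVER is not one of SERVERS")
    return problems


if __name__ == "__main__":
    print(check_cluster_config(read_cluster_config(SAMPLE)) or "no problems found")
```

Running the same check on every node is a cheap way to spot a host name that was mistyped on just one of them.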

The next step is to copy the .RPD file into the /media/share/Repository mount point. (If you are setting this up on Windows machines, you must share a drive on the network and refer to the share by its UNC path, something like this: REPOSITORY_PUBLISHING_DIRECTORY="\\aravis0\share\Repository")

In our setup, we have one BI server and the primary cluster controller running on aravis2, and a second BI server and the secondary cluster controller running on aravis3. If we now start the cluster controller on aravis2, we see the following:

[oracle@aravis2 setup]$ ./run-ccs.sh start
Oracle BI Cluster Controller startup initiated.
Execute the following command to check the Oracle BI Cluster Controller logfile and see if it started.
tail -f /u01/app/oracle/product/obiee/OracleBI/server/Log/NQCluster.log
[oracle@aravis2 setup]$ tail -f ../server/Log/NQCluster.log
[71030] A connection with Cluster Controller aravis3.rmcvm.com:9706 was established.
2008-12-21 15:43:59
[71020] A connection with Oracle BI Server aravis3.rmcvm.com:9703 was established.
2008-12-21 15:43:59
[71010] Oracle BI Server aravis3.rmcvm.com:9703 has transitioned to ONLINE state.
2008-12-21 15:43:59
[71027] Cluster Controller aravis3.rmcvm.com:9706 has transitioned to ONLINE state.

Now make sure that the cluster controllers and BI servers are up and running on both instances.
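
Rather than reading the log by eye, the state transitions can be pulled out with a small script. This is a sketch that assumes the message layout shown in the excerpt above; ONLINE is the only state that appears there, so the pattern accepts any single word as a state name rather than guessing what the others are called. The function name latest_states is made up for this example.

```python
import re

# The excerpt from NQCluster.log shown above.
SAMPLE_LOG = """\
[71030] A connection with Cluster Controller aravis3.rmcvm.com:9706 was established.
2008-12-21 15:43:59
[71020] A connection with Oracle BI Server aravis3.rmcvm.com:9703 was established.
2008-12-21 15:43:59
[71010] Oracle BI Server aravis3.rmcvm.com:9703 has transitioned to ONLINE state.
2008-12-21 15:43:59
[71027] Cluster Controller aravis3.rmcvm.com:9706 has transitioned to ONLINE state.
"""

# Captures the component type, its host:port and the state it moved to.
TRANSITION = re.compile(
    r"\[\d+\] (Oracle BI Server|Cluster Controller) (\S+) "
    r"has transitioned to (\w+) state\."
)


def latest_states(log_text):
    """Return {(component, host:port): state}, keeping the last state seen."""
    states = {}
    for component, address, state in TRANSITION.findall(log_text):
        states[(component, address)] = state
    return states


if __name__ == "__main__":
    for (component, address), state in latest_states(SAMPLE_LOG).items():
        print(f"{component} {address}: {state}")
```

Because later lines overwrite earlier ones, feeding it the whole log file gives the most recent state of every component the controller has reported on.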

Connecting to the BI Cluster

The Oracle BI ODBC driver is configured, by default, to connect to a regular non-clustered instance. Now that the primary cluster controller is responsible for all connections, we need to configure a new DSN for the Administration tool.

Admin_config.png

If we now start the Administration tool, connect through the newly created ClusterController connection and open the Cluster Manager tool, we should see something like the following. Here we can see the state of each BI server and cluster controller.

cluster_controller.png

Once we have established that both BI servers are members of the cluster and that the primary and secondary cluster controllers are running, we can move on and change the ODBC connection for the Presentation Server. For the time being we accept that the Presentation Server is a potential single point of failure. My Presentation Server is called aravis1.rmcvm.com. I must now edit the odbc.ini file, found in the ORACLEBI/setup/ directory. This configuration file defines all ODBC connections to the BI server, and we want to create a connection similar to the one we created above for the Administration tool:

[AnalyticsWeb_Cluster]
Driver=/u01/app/oracle/product/obiee/OracleBI/server/Bin/libnqsodbc.so
Description=Oracle BI Server
ServerMachine=local
Repository=
FinalTimeOutForContactingCCS=60
InitialTimeOutForContactingPrimaryCCS=5
IsClusteredDSN=Yes
Catalog=
UID=Administrator
PWD=
Port=9703
PrimaryCCS=aravis2.rmcvm.com
PrimaryCCSPort=9706
SecondaryCCS=aravis3.rmcvm.com
SecondaryCCSPort=9706
Regional=No
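
Since odbc.ini is a plain INI file, Python's standard configparser module can read it, which makes it easy to confirm that a DSN really is clustered and which controllers it points at. A minimal sketch follows; the function name clustered_dsn_controllers is made up, and the sample holds only the keys from the entry above that the check needs.

```python
import configparser

# Subset of the AnalyticsWeb_Cluster entry shown above.
SAMPLE = """\
[AnalyticsWeb_Cluster]
Description=Oracle BI Server
Repository=
IsClusteredDSN=Yes
PrimaryCCS=aravis2.rmcvm.com
PrimaryCCSPort=9706
SecondaryCCS=aravis3.rmcvm.com
SecondaryCCSPort=9706
"""


def clustered_dsn_controllers(ini_text, dsn_name):
    """Return [(role, host, port), ...] for the controllers a clustered DSN uses."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive, as written in the file
    parser.read_string(ini_text)
    dsn = parser[dsn_name]
    if dsn.get("IsClusteredDSN", fallback="No").lower() != "yes":
        raise ValueError(f"{dsn_name} is not marked as a clustered DSN")
    controllers = []
    for role in ("Primary", "Secondary"):
        host = dsn.get(role + "CCS", fallback="")
        if host:
            controllers.append((role, host, int(dsn[role + "CCSPort"])))
    return controllers


if __name__ == "__main__":
    for role, host, port in clustered_dsn_controllers(SAMPLE, "AnalyticsWeb_Cluster"):
        print(f"{role} cluster controller: {host}:{port}")
```

If only a Primary entry comes back, the DSN has no fallback controller and the client side has its own single point of failure.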

The Presentation Server then needs to be configured to use this ODBC connection, called AnalyticsWeb_Cluster, instead of the default AnalyticsWeb. Edit the ORACLEBIDATA/web/config/instanceconfig.xml file and change the ODBC connection name in the DSN tags:

<WebConfig>
<ServerInstance>
<DSN>AnalyticsWeb_Cluster</DSN>
<CatalogPath>/media/share/WebCat/catalog/samplesales</CatalogPath>
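
If you have several Presentation Servers to update, the DSN change can be scripted. The sketch below uses Python's standard xml.etree.ElementTree module on a trimmed, well-formed stand-in for instanceconfig.xml; a real file contains more elements than shown here, and the function name set_dsn is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for instanceconfig.xml, closed off so that it parses.
SAMPLE = """<WebConfig>
  <ServerInstance>
    <DSN>AnalyticsWeb</DSN>
    <CatalogPath>/media/share/WebCat/catalog/samplesales</CatalogPath>
  </ServerInstance>
</WebConfig>"""


def set_dsn(xml_text, new_dsn):
    """Return the XML text with the ServerInstance DSN replaced by new_dsn."""
    root = ET.fromstring(xml_text)
    dsn = root.find("./ServerInstance/DSN")
    if dsn is None:
        raise ValueError("no <DSN> element found under <ServerInstance>")
    dsn.text = new_dsn
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_dsn(SAMPLE, "AnalyticsWeb_Cluster"))
```

Note that ElementTree drops XML comments and the XML declaration when it writes the document back out, so keep a backup of the original file or make the one-line change by hand.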

Restart the Presentation Server and you should be ready to go. Log in to the Analytics web application and use the Administration tool to monitor how sessions are now created on each BI server in a round-robin manner.
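
To make the round-robin behaviour concrete, here is a toy model of a dispatcher that hands out BI servers in turn and skips any that are offline. It illustrates the idea only and says nothing about how the cluster controller is actually implemented; the class and method names are invented for this example.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Toy dispatcher: hands out servers in turn, skipping offline ones."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._online = set(self._servers)
        self._order = cycle(self._servers)

    def mark_offline(self, server):
        """Stop routing new sessions to this server."""
        self._online.discard(server)

    def mark_online(self, server):
        """Resume routing to a known server."""
        if server in self._servers:
            self._online.add(server)

    def next_server(self):
        """Return the next online server, trying each server at most once."""
        for _ in range(len(self._servers)):
            candidate = next(self._order)
            if candidate in self._online:
                return candidate
        raise RuntimeError("no BI server is online")


if __name__ == "__main__":
    dispatcher = RoundRobinDispatcher(["aravis2.rmcvm.com", "aravis3.rmcvm.com"])
    print([dispatcher.next_server() for _ in range(4)])
    dispatcher.mark_offline("aravis2.rmcvm.com")
    print([dispatcher.next_server() for _ in range(2)])
```

With both servers online the sessions alternate between them; once one is marked offline, every new session lands on the survivor.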

OBIEE_High_Connection_manager.png

Next we look at how to set up multiple presentation services to make that layer more fault tolerant, as well as putting the Java Host and Scheduler into the cluster.

Comments

  1. RNM Says:

    Excellent article, clearly explained.

    In the final paragraph you mention a subsequent article detailing multiple presentation services – will this be published soon?

    Thanks.

  2. Borkur Steingrimsson Says:

    Hi RNM,

    the sequel has been posted.

    cheers
    Borkur

  3. Sid Says:

    Borkur,

    Nice article. I read the sequel also. This looks like an Active/Active combination of the BI servers, not an Active/Passive combination. However, the active/active clusters are generally useful for performance enhancement but for high availability generally we rely on active/passive configurations. My questions are two-fold:
    1. If one of these services goes down, would the other node take over immediately, without any need to restart the cluster services on either the primary or the secondary node? Would query execution on node 2 be affected if node 1 goes down?
    2. If the secondary node takes over all connections when the primary goes down, would that apply to presentation services too? What happens if the presentation service on node 1 fails but the BI service does not? Would the presentation service on node 2 return the result of the query that the PS on node 1 requested before it went down?
    3. Would performance be severely affected if one of the nodes goes down and the other node is able to take over?

  4. Borkur Steingrimsson Says:

    Hi Sid.
    You are right in observing that the stack is mainly acting in an active/active mode. The cluster controllers (CCS) are active/passive though. As to your two-fold question (in 3 parts :)

    1. If one of the BI server services goes down, the CCS will notice and will not relay any queries to that node until it goes online again. As the sessions in the BI Server are not persistent, any query that comes down the pipe can eventually be sent to either/any node. Your session will experience a slight hiccup when the BI server goes down, but next refresh or Answers request will automagically be sent to one of the remaining BI servers.

    2. There is no state-replication between the two nodes, so if a presentation server goes down your session will die and you will have to log in again. Upon login, the OC4J application will not be able to give you a connection to the dead PS server and will try the next one in the list until a connection is made. Any query that was running on the BI server will be lost.

    3. If the remaining services aren’t too overloaded and are able to handle the added workload, then the answer is no. But if your entire stack was operating at peak capacity already, at the time of service demise, then users would notice performance degradation.

    Hope this helps

    Cheers
    Borkur

  5. Sid Says:

    Borkur,

    Thanks for the response. In my last assignment, I had to deal with an active/passive Win 2003 cluster with OBIEE installed on it. I tried OBIEE clustering first but then decided to drop the idea because this active/active configuration would create anomalies in an active/passive cluster. I just needed to understand whether I was correct or not, and I guess I was.
    Nice article. Can we get a primer about the UDML language that is embedded in the repository, and about using web services?

  6. Tan Says:

    Hi Borkur,
    Would this configuration be “active-active”? Is it possible to create an “active-passive” cluster using this method? Or using MSCS/HACMP clustering?
    Regards.

  7. Deepak Says:

    Hi Borkur,
    I have an issue while configuring a DSN for the Administration tool.
    I have successfully clustered two Linux servers, and now when I try to configure a DSN for the Administration tool on a Windows machine I get the following error:

    [nQSError:69009] Test connection to at least one Cluster Controllers succeeded. However, the connection test to the following Oracle BI Server node failed:.

    Any idea what could be causing this error?

    Thanks,
    Deepak

  8. Borkur Steingrimsson Says:

    @Tan I don’t see how you could achieve active-passive by using the cluster controller. But if you do have some hardware configuration then I suppose it might be possible. I certainly haven’t tried it though.

    @Deepak My guess is that even though the cluster control services are running they are not able to talk to the BI servers. Take a look at both the BI server logs and the cluster controller logs to make sure that all the components are in fact communicating. Also, make sure that you can connect directly to a BI server (i.e. create a DSN that points directly to one of the BI servers – this should also work fine)

  9. Jan Says:

    Hi Borkur,

    I am busy with a clustered setup.

    You say that I should copy the repository file to a shared location, however, the NQSconfig.ini file forces the Repository to reside in the OracleBI\server\Repository directory. Do I need to create a link from this directory to the shared directory, or how will the BIEE setup on any of the nodes be able to pick up the Repository file in the changed directory? If I move the Repository files to the directory specified in the REPOSITORY_PUBLISHING_DIRECTORY parameter, then I get errors in the logs when trying to start up the servers due to it not finding the Repos in the default directory.

    Please Assist.

    Regards,

    Jan

  10. Borkur Steingrimsson Says:

    Hi Jan.

    Well, you still have to leave the RPD file in the original place. The shared directory is actually used to communicate changes made to the repository (of the master BI server) to the other BI servers in the cluster. They don’t share the RPD file but each one uses its own file, stored locally.

    hope this helps

    Borkur

  11. Jan Says:

    Hi Borkur,

    It helps a lot…

    There really isn’t much comprehensive documentation on this readily available…

    Regards,

    Jan

  12. ale sabelli Says:

    I have an issue after configuring a BIEE cluster. I ran some tests, shutting down just the BI server component on one server, and the system then worked only intermittently. The cluster controller wasn’t able to detect that the BI server was down.
    Any ideas?
    Regards
    Ale

  13. Pieter Says:

    Thanks for the good note – very useful.
    Where do I find the Cluster Manager Tool that you’re referring to?
    Thanks
