Delving in to ODI’s Web Services Support
February 20th, 2008 by Mark Rittman
Following on from my postings last week about future OBIEE architecture, I’ve spent a couple of evenings this week taking a closer look at the web services support in Oracle Data Integrator. Specifically, I was interested to see how ODI mappings and packages could be called by a BPEL process, and how data retrieved from a web service could be included in a mapping or process. I’m also indebted to Douglas Stevenson who’s written about this functionality in a forthcoming Oracle Scene article and who batted a few ideas about with me as I came up with my findings.
ODI in the current 10.1.3.4 release has three major areas of web service functionality;
- The Oracle Data Integrator Public Web Service, that lets you execute a scenario (a published package) from a web service call
- Data Services, which provide a web service over an ODI data store (i.e. a table, view or other data source registered in ODI)
- The ODIInvokeWebService tool that you can add to a package to request a response from a web service
The examples in the ODI documentation suggest installing the Oracle Data Integrator Public Web Services, and any Data Services you create, using the Apache Axis2 web services engine that you can then install into a Java application server such as Oracle OC4J. I took Douglas’ recommendation and downloaded Apache Axis2 1.2 from the Apache website, and installed it into the default OC4J home in Oracle Application Server 10.1.3.3.1 (the one that comes with Oracle SOA Suite 10g). Having a play around with the three bits of functionality, they seem to work in this way:
The Oracle Data Integrator Public Web Services is fairly straightforward and allows you to call an ODI scenario (in OWB terms, a process flow) via a web service call. A typical way you’d use ODI in this way is to call a bit of bulk data transformation as part of a business process, such as in the BPEL diagram below:

If you’ve read the recent OTN article I put together on BPEL and OBIEE, setting up ODI to work in this way is pretty similar. You add a reference to the ODI Public Web Service as a partner link in your BPEL process, then make calls to it using Invoke activities. When you add an Invoke activity that refers to the ODI Public Web Service you get a choice of three activities:
- Invoke Scenario
- Get Web Service Version, and
- Invoke Session
with the first activity being the one you’d need to choose.

Other than the Invoke activity, you’d also need to add a corresponding Assign activity before it to set the details of the scenario being called, the respository being connected to and so on. Once this is all in place, you can call a bit of bulk data transformation directly from your BPEL process and use this as a substitute for moving and transforming data on a message-by-message basis.
The second bit of ODI web service functionality is Data Services. Data Services allow you to take a data store in your ODI repository (usually a table or view, but it can also refer to anything that ODI recognizes as providing data) and give it a web service interface, so that outside applications can select from it, update it, delete from it and so on. ODI comes pre-configured to work with Apache Axis2, a web services engine that you can download free from the Apache website and installs pretty easily into the OC4J container in Oracle Application Server. Once you’ve defined an Axis2 data server in your Topology Manager application, like this:

you can then take any model in the Designer application and select datastores (tables, views etc) within it to publish as web services.

The technology behind this is contained in another type of knowledge module, a “Service Knowledge Module”, that takes data in for instance an Oracle database and provides the interface over it. Once you’ve selected your tables and imported the required knowledge module, you then publish the service to, for example, Axis2 wherapon you can then check out the endpoint.

The tricky bit, I found, with data services is getting the JDBC data source set up correctly. The web service that ODI generates doesn’t use your ODI connections but instead uses a JDBC data source on the application server, which in OC4J’s case needs to be set up using Enterprise Manager and then in the web.xml file for the Axis2 application. You then need to refer to it correctly in the Services tab for the data store in ODI, get any of these wrong and the service will appear to register correctly but won’t return any data, with any error messages being just cryptic java stack traces.
Anyway, the third bit of web services functionality in ODI is the ODIInvokeWebService tool, a bit of functionality you can invoke from the Package editor, analogous to process activities in OWB process flows.

This is the bit of functionality that allows you to bring in data requested via a web service into your ETL process. It works in a bit of a funny way though; first of all, its a package step, not an operation or activity you include in mapping, so you can’t “join” to a web service in an interface (analogous to a mapping in OWB), you have to request the data separately, put it into a relational table and then joint to that, if you want to refer to data from a web service in an interface.

Moreover, the way the ODIInvokeWebService tool works is that it makes a web service request, dumps the response into an XML file, and you then process the XML file as a document like any other XML document you might want to load.

Looking at this, it’s obviously not something you’d want to make lots of references to as part of a bulk loading ETL project, as each time you call the web service you’ll have to get the response, copy to an XML file, load the XML file and so on. There are ways around this if your web service can return data on multiple requests at once – the data services feature I mentioned earlier has a “getCustomerList” operaton that returns data on multiple customers in one go, which means you could get details on all the customers, for example, that you wanted data on and request all their data at the start of the ETL operation – but clearly in this implementation, you can’t treat data coming in from a web service as being functionally equivalent to data coming in from file or relational sources.
Then again though, giving the latency of web service calls, would you really want to do this though anyway, “joining” to web service like you would join to a table or file? It’s more likely you’d want to gather all your web service data up anyway and stage it into a high-performance table rather than slow down a table join by lots of HTTP requests out to a remote web server, and perhaps the example I’ve seen in Douglas’ upcoming article, where a web service is called when a package fails and we therefore need to get some human workflow intervention, is the more likely scenario.
Anyway, that’s the low-down on how web services functionality works in ODI 10.1.3.4. The ability to reference an ODI scenario from BPEL, for example, is a fairly obviousl clear win and of course this is a nice way to have BPEL control your ETL process orchestration. Data Services I’m not sure I’d have a use for, but if you ever need to web service enable a set of data mart tables, for example, then this’d be handy. The ODIInvokeWebservice tool made me think a bit though, in the past I’d thought of data coming in from web services as something you’d drag on to, for example, an OWB mapping and refer to just like any other data source in a mapping, but the way ODI deals with them, together with the inherent latency in such a technology, probably means you’ll gather data using them upfront in your mapping at the preparation stage, or you’ll use them as a way of invoking some workflow as part of your ETL control process.
Given all this though, I wonder how OWB will provide support for web services in the next release? SOA functionality is on the development plan for the next major release of OWB, due I guess in late 2008, I’d be curious to see whether it takes the same route as ODI (OWB will support ODI knowledge modules in the next release) or whether they make a stab at making web service data as usable in mappings as relational data? We’ll have to wait and see.
Anyway, thanks again to Douglas for some of the ideas I bounced around here, and for helping me get Axis2 set up. If you’re a UKOUG member, keep an eye out in the next Oracle Scene magazine for his in-depth look at ODI’s web service functionality.


February 21st, 2008 at 11:18 am
Hi Mark,
Excellent post, contains a lot of useful ideas about using ODI within a service oriented architecture.
I totally agree with you that there is enormous scope for invoking standard ODI scenarios as a web service via ODI Public web services. Effectively, this lets us co-ordinate the ETL process from BPEL or any other process orchestration tool.
This also ties in nicely with a point raised by Stewart Bryson, after your BIEE web service post. The ETL process is highly complex and demands a very high level of performance. So, quite rightly Stewart states that we want to avoid abstraction in the form of web services within the ETL process itself. ODI accomodates this quite nicely by allowing us to invoke a standard scenario as a web service – providing ‘bulk processing’ services. The key point is that there is no abstraction in the scenario itself, it can simply be invoked as part of a larger process, allowing the functionality of ODI to become part of composite business applications.
Data services? Mark, you make a good point about not being sure where they fit in. I personally think that there is scope for them within a SOA, as they accomodate a range of simple transactions. Granted that some people may think we would be better creating these in alternative ways that dont involve ODI. The only response I could give is that if we use these datastores already, we will have the metadata to hand within ODI – why not put it to further good use by letting ODI create these services for you? So, in some cases I think they would work.
I found the part about how ODI actaully handles web service data very interesting. First we have to invoke the service, then get the response file, then create an XML datasource based on it, then use it in a scenario where the data will be staged in a table to join to other data. A little bit much to be honest. But, as we’ve discussed and Stewart mentioned, I’m not sure we would want to use web services in this way (to actually perform bulk processing steps). For the time being, I think it’s probably more realistic for us to invoke services from an ODI package as part of the processing logic, rather than for returning data to use within the flow.
Overall, I think you’ve highlighted that ODI is well equipped to adopt a dynamic role within a service driven architecture. Look forward to seeing how it evolves, and if they can clear up some of the slight niggles that you’ve identified.
Cheers,
Douglas
February 27th, 2008 at 11:42 pm
Mark,
excellent article. Keep up the good work.
I would like to mention one point about ODI though. Because most of the processing happens within Database engine, unless all commercial scale databases treats web services as federated data source, ODI as a tool can not join them with another set of tables. Other ETL tools can join them because unlike ODI they bring in each source row in the server memory, but I seriously doubt the performance in such cases.
Just a random thought.
February 28th, 2008 at 6:56 am
Sid,
I think you’re right there, the fact that the database doesn’t treat web services the same as any other data source (at least not yet) means you can’t just “join” to a web service, but as you say, would you want to? If you compare data coming in from a table with data coming in from a web service, they’re just completely different in terms of latency, speed and so on, so it’s kind of inevitable that you’d need to go and retrieve the web service data first, stage it in a relational table and then work with it.
The thing is though, you have to go through the process of trying to incorporate web service data into an ETL flow before this sort of thing dawns on you, for me certainly I kind of assumed that you could treat web service data just like a relational data source and clearly, that’s not the case. I still think the ODI implementation is kind of clumsy (read from the web service, load into an XML doc, define a data source over the XML doc, read from that into a table) but the basic principle is correct, you have to stage the XML data before you can incorporate it into an SQL transformation.
regards, Mark
February 28th, 2008 at 9:39 pm
Mark,
“I still think the ODI implementation is kind of clumsy” – I guess that is the point of view. I agree that all this work makes it more time consuming to implement it. On the other hand, this produces control at a deeper level. Lemme elaborate.
Call web services, build a script that will parse the first node of the XML to get the columns and create the table on that basis, call the script through ODIOSCOMMAND step, load that table and may be you can publish this table as a datastore or fit this into some kind of common view. Or just get the XML data and ftp it to some other application. Or parse the XML and redistribute the data to multiple target web applications. Or transform the XML to create a new type of report or another XML/CSV that is needed by a legacy application. Each of these actions can be put together into a common package that will create a scenario. With web services, possibilities are endless. The beauty of the tool is that it can be used as an ultimate router of various kinds of data.
I guess we are yet not ready to exploit this powerful thing and on the flip side data integrator has got some usablity/reliablity issues that got to be corrected. For example, if you try to join two tables from the same source database but different schema, it decides to join them in staging are which is almost same as target db. Recently I tried to solve the problem and got frustrated with the tool But I like it more than datastage and informatica.
Nice blog. Your articles are a great source for me.
Regards,
Sid
April 17th, 2008 at 2:33 pm
Hi
Thanks for the blog,it was of great help.I could successfully generate and deploy the data model.But could not open the page from the url provided in the service page of Apache Axis2 deployed on OC4J.It is saying page not found.
Please help