In yesterday’s post I took a look at one of the new features in the 11.1.1.7.1 release of the BI Applications: integration between BI Apps 11g and Oracle Endeca Information Discovery. Whilst we’re on the topic, I thought it’d be worth taking a look at another new feature introduced with BI Apps 11.1.1.7.1 - data lineage and impact analysis.
So what exactly are data lineage and impact analysis? Data lineage is the path that data takes through your system from source to the final target reports and dashboards, and describes the lifecycle from raw data through to processed, validated and transformed information presented to your users. Impact analysis is what you do to determine which downstream data items will be affected by a change to a source table, column or data mapping. It has been a feature in many Oracle data integration tools in the past, including Oracle Warehouse Builder and ODI11g, as shown in the screenshots below.
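To make the idea of impact analysis concrete, here is a minimal sketch in Python that treats it as a breadth-first walk over a "feeds" graph. The object names and the `FEEDS` dictionary are invented for illustration; they are not real BI Apps, OWB or ODI objects, and this is not how those tools implement the feature.

```python
from collections import deque

# Hypothetical dependency edges: each key feeds the items in its list.
# All names are made up for this example, not taken from BI Apps.
FEEDS = {
    "src.ORDERS.AMOUNT": ["stg.ORDERS_STG.AMOUNT"],
    "stg.ORDERS_STG.AMOUNT": ["dw.ORDERS_F.AMOUNT"],
    "dw.ORDERS_F.AMOUNT": ["report.Revenue by Month", "report.Top Customers"],
    "src.ORDERS.STATUS": ["stg.ORDERS_STG.STATUS"],
}


def impacted_by(item, feeds=FEEDS):
    """Return every downstream item reachable from `item`, breadth-first."""
    seen = set()
    order = []
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for target in feeds.get(current, []):
            if target not in seen:  # guard against revisiting shared targets
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order
```

Calling `impacted_by("src.ORDERS.AMOUNT")` lists the staging column, the warehouse column and both reports that would be touched by a change to that source column. Data lineage is the same walk run in the opposite direction, from a report back towards its sources.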
You could also trace data lineage through the earlier 7.9.x releases of Oracle BI Applications by starting at the DAC Console, which recorded the source table and columns for a particular target warehouse table, and the DAC tasks (usually corresponding to Informatica workflows of the same name) used to load those tables. The DAC stopped at the warehouse layer though and didn’t contain any details of the dashboards and reports that used the warehouse data, so if you wanted to trace data lineage back from a particular report or presentation layer column you had to step through the process manually, in a way that I illustrated in this blog post from a few years ago. What these new data lineage and impact analysis features in BI Apps 11.1.1.7.1 do for us is bring the two sets of metadata together, along with configuration data from the BI Applications Configuration Manager application, to create an end-to-end data lineage view of the BI Apps dataset.
Data within the BI Apps 11.1.1.7.1 system can be thought of as going through seven different layers, as shown in the diagram below. Starting at the top-level dashboards and reports, these map onto OBIEE presentation tables and columns, which in turn are selected from business model columns that then map back to the physical tables and columns in the Oracle Business Intelligence Applications data warehouse. These data warehouse tables are loaded in two stages: first using source-specific data mappings from, for example, Oracle E-Business Suite 12.1.3, and then using a set of source-independent mappings that take standardised staging datasets from these sources and map them into the target data warehouse tables. Our data lineage and impact analysis routines have to be aware of these seven stages and show us how data moves and is transformed between each stage.
BI Apps 11.1.1.7.1 data lineage works by using ODI11g to extract metadata from the tables underlying the BI Apps Configuration Manager and from the ODI11g repository, and combining that with RPD and catalog metadata that you have to extract manually and copy into files for ODI to upload as well. An ODI load plan supplied by Oracle then combines these datasets into a final set of data lineage tables, also stored in the target data warehouse schema. You can then create your own data lineage and impact analysis reports, or start with the ones Oracle provides with this new feature.
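The combining step can be pictured as a join across metadata sets that each describe one hop of the journey. The sketch below uses Python's built-in `sqlite3` module with three made-up mapping tables (`catalog_map`, `rpd_map`, `odi_map`); these table and column names are assumptions for illustration only and do not reflect the actual lineage tables or the ODI load plan that Oracle ships.

```python
import sqlite3


def build_lineage(catalog_rows, rpd_rows, odi_rows):
    """Join three hypothetical metadata extracts into end-to-end lineage rows."""
    conn = sqlite3.connect(":memory:")
    # Illustrative schema: one table per metadata source, one hop each.
    conn.executescript("""
        CREATE TABLE catalog_map (report TEXT, presentation_col TEXT);
        CREATE TABLE rpd_map (presentation_col TEXT, warehouse_col TEXT);
        CREATE TABLE odi_map (warehouse_col TEXT, source_col TEXT);
    """)
    conn.executemany("INSERT INTO catalog_map VALUES (?, ?)", catalog_rows)
    conn.executemany("INSERT INTO rpd_map VALUES (?, ?)", rpd_rows)
    conn.executemany("INSERT INTO odi_map VALUES (?, ?)", odi_rows)
    # Chain the hops: report -> presentation -> warehouse -> source.
    rows = conn.execute("""
        SELECT c.report, c.presentation_col, r.warehouse_col, o.source_col
        FROM catalog_map c
        JOIN rpd_map r ON r.presentation_col = c.presentation_col
        JOIN odi_map o ON o.warehouse_col = r.warehouse_col
        ORDER BY c.report, o.source_col
    """).fetchall()
    conn.close()
    return rows
```

Each returned row is one complete path from a report down to a source column. Because the sketch uses inner joins, a report whose presentation column has no matching RPD or ODI entry drops out of the result entirely; a real lineage load would need to decide how to surface such broken chains.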
To load these data lineage tables, you run a predefined load plan from ODI Studio or ODI console after checking that all connections to the various sources are set up correctly. The load plan in turn runs a number of interfaces that load lineage information from the RPD and catalog extracts, the BI Configuration Manager tables and the ODI repository tables. This load plan has to run outside of the main BI Apps managed data loads, which makes sense as you have to manually re-extract the RPD and catalog metadata anyway, and you’ll probably want to run the data lineage reload after every development release of the BI Apps system rather than every day, for example.
Once you’ve loaded the data lineage tables, the subject area you can then select from to create lineage and impact reports covers all the stages in the data load, and also extends to OTBI (Oracle Transactional Business Intelligence, more on that in a future post) if you use that in combination with the BI Apps (or OTBI EE, as it’s called for cloud-based Fusion installations).
You also get a set of starter dashboards and analyses for displaying the lineage for dashboard objects, presentation tables and columns, down to tables and columns in the BI Apps data warehouse, and impact for source models, columns, variables and so on.
It’s definitely a good start and a useful resource. Going back to the days of OWB, it’d be nice if this were built directly into ODI Studio, and the steps to identify and then export the RPD and catalog metadata are pretty manual, but it’s better than having to step through the metadata layers yourself as you had to do with the previous 7.9.x versions of the BI Apps. More details on data lineage and impact analysis in BI Apps 11.1.1.7.1 can be found in the online docs, including the configuration steps you’ll need to carry out before doing the first data lineage load.