12.2, the latest Oracle database release provides a whole set of new features enhancing various aspects of the product including JSON support, Auto-List Partitioning and APEX news among others.
One of the biggest news in the Data Warehousing / Analytics area was the introduction of the Analytic Views, that as per Oracle's definition are
Metadata objects that enable the user to quickly and easily create complex hierarchical and dimensional queries on data in database tables and views
If you are on rush, here is an abstract of what you'll find in this looooong blog post:
Metadata modeling can now be done directly in the database using Analytic Views, providing to end users a way of querying database objects without needing a knowledge of joining conditions, aggregation functions or order by clauses.
This post will guide you through the creation of an analytic view that replicates a part of a OBIEE's Sampleapp business model. The latest part of the post is dedicated to understanding the usage of analytic views and the benefits for end users especially in cases when self-service BI tools are used.
If you are still around and interested in the topic please take a drink and sit comfortably, it will be a good journey.
What are then the Analytics Views in detail? How are they going to improve end user's ability in querying data?
To answer above question I would take a step back. Many readers of this blog are familiar with OBIEE and its core: the Repository. The repository contains the metadata model from the physical sources till the presentation areas and includes the definition of:
- Joins between tables
- Hierarchies for dimensions
- Aggregation rules
- Security settings
- Data Filters
- Data Sources
This allows end users to just pick columns from a Subject Area and display them in the appropriate way without needing to worry about writing SQL or knowing how the data is stored. Moreover definitions are held centrally providing the famous unique source of truth across the entire enterprise.
The wave of self-service BI tools like Tableau or Oracle's Data Visualization Desktop provided products capable of querying almost any kind of data sources in a visual and intuitive way directly in the end user hands. An easy and direct access to data is a good thing for end user but, as stated above, requires knowledge of the data model, joins and aggregation methods.
The self-service tools can slightly simplify the process by providing some hints based on column names, types or values but the cruel reality is that the end-user has to build the necessary knowledge of the data source before providing correct results. This is why we've seen several times self-service BI tools being "attached" to OBIEE: get corporate official data from the unique source of truth and mash them up with information coming from external sources like personal Excel files or output of Big Data processes.
Analytic Views (AV) take OBIEE's metadata modeling concept and move it at database level providing a way of organizing data in a dimensional model so it can be queried with simpler SQL statements.
The Analytical Views are standard views with the following extra options:
- Enable the definition of facts, dimensions and hierarchies that are included in system-generated columns
- Automatically aggregate the data based on pre-defined calculations
- Include presentation metadata
Analytics views are created with a
CREATE ANALYTIC VIEW statement, some privileges need to be granted to the creating user, you can find the full list in Oracle's documentation.
Every analytical view is composed by the following metadata objects:
- Attribute dimensions: organising table/view columns into attributes and levels.
- Hierarchies: defining hierarchical relationships on top of an attribute dimension object.
- Analytic view objects: defining fact data referencing both fact tables and hierarchies.
With all the above high level concepts in mind it's now time to try how Analytical Views could be used in a reporting environment.
- Install Docker
- Download database installer from Oracle's website
- Place the installer in the proper location mentioned in the documentation
- Build Oracle Database 22.214.171.124 Enterprise Edition Docker image by executing
./buildDockerImage.sh -v 126.96.36.199 -e
- Running the image by executing
docker run --name db12c -p 1521:1521 -p 5500:5500 -e ORACLE_SID=orcl -e ORACLE_PDB=pdborcl -e ORACLE_CHARACTERSET=AL32UTF8 oracle/database:188.8.131.52-ee
The detailed parameters definition can be found in the GitHub repository. You can then connect via sqlplus to your local instance by executing the standard
sqlplus sys/pwd@//localhost:1521/pdborcl as sysdba
The password is generated automatically during the first run of the image and can be found in the logs, look for the following string
ORACLE AUTO GENERATED PASSWORD FOR SYS, SYSTEM AND PDBAMIN: XXXXxxxxXXX
Once the database is created it's time to set the goal: I'll try to recreate a piece of the Oracle's Sampleapp RPD model in the database using Analytic Views.
In this blog post I'll look in the
01 - Sample App business model and specifically I'll try to replicate the logic behind Time, Product and the
F0 Sales Base Measures using Analytic Views.
D1 - Products (Level Based Hierarchy) is based on two logical table sources:
SAMP_PRODUCTS_D providing product name, description, LOB and Brand and the
SAMP_PROD_IMG_D containing product images. For the purpose of this test we'll keep our focus on
The physical mapping of Logical columns is shown in the image below.
The first piece we're going to build is the attribute dimension, where we'll be defining attributes and levels. The mappings in above image can "easily" be translated into an attributes with the following SQL.
CREATE OR REPLACE ATTRIBUTE DIMENSION D1_DIM_PRODUCT USING SAMP_PRODUCTS_D ATTRIBUTES (PROD_KEY as P0_Product_Number CLASSIFICATION caption VALUE 'P0 Product Number', PROD_DSC as P1_Product CLASSIFICATION caption VALUE 'P1 Product', TYPE as P2_Product_Type CLASSIFICATION caption VALUE 'P2 Product Type', TYPE_KEY as P2k_Product_Type CLASSIFICATION caption VALUE 'P2k Product Type', LOB as P3_LOB CLASSIFICATION caption VALUE 'P3 LOB', LOB_KEY as P3k_LOB CLASSIFICATION caption VALUE 'P3k LOB', BRAND as P4_Brand CLASSIFICATION caption VALUE 'P4 Brand', BRAND_KEY as P4k_Brand CLASSIFICATION caption VALUE 'P4k Brand', ATTRIBUTE_1 as P5_Attribute_1 CLASSIFICATION caption VALUE 'P5 Attribute 1', ATTRIBUTE_2 as P6_Attribute_2 CLASSIFICATION caption VALUE 'P6 Attribute 2', SEQUENCE as P7_Product_Sequence CLASSIFICATION caption VALUE 'P7 Product Sequence', TOTAL_VALUE as P99_Total_Value CLASSIFICATION caption VALUE 'P99 Total Value')
Few pieces to note:
CREATE OR REPLACE ATTRIBUTE DIMENSION: we are currently defining a dimension, the attributes and levels.
USING SAMP_PRODUCTS_D: defines the datasource, in our case the table
SAMP_PRODUCTS_D. Only one datasource is allowed per dimension.
PROD_KEY as P0_Product_Number: using the standard notification
aswe can easily recaption columns names
CLASSIFICATION CAPTION ...several options can be added for each attribute like caption or description
The dimension definition is not complete with only attribute declaration, we also need to define the levels. Those can be taken from OBIEE's hierarchy
For each level we can define:
- The level name, caption and description
- The Key
- the Member Name and Caption
- the Order by Clause
Translating above OBIEE's hierarchy levels into Oracle SQL
LEVEL BRAND CLASSIFICATION caption VALUE 'BRAND' CLASSIFICATION description VALUE 'Brand' KEY P4k_Brand MEMBER NAME P4_Brand MEMBER CAPTION P4_Brand ORDER BY P4_Brand LEVEL Product_LOB CLASSIFICATION caption VALUE 'LOB' CLASSIFICATION description VALUE 'Lob' KEY P3k_LOB MEMBER NAME P3_LOB MEMBER CAPTION P3_LOB ORDER BY P3_LOB DETERMINES(P4k_Brand) LEVEL Product_Type CLASSIFICATION caption VALUE 'Type' CLASSIFICATION description VALUE 'Type' KEY P2k_Product_Type MEMBER NAME P2_Product_Type MEMBER CAPTION P2_Product_Type ORDER BY P2_Product_Type DETERMINES(P3k_LOB,P4k_Brand) LEVEL Product_Details CLASSIFICATION caption VALUE 'Detail' CLASSIFICATION description VALUE 'Detail' KEY P0_Product_Number MEMBER NAME P1_Product MEMBER CAPTION P1_Product ORDER BY P1_Product DETERMINES(P2k_Product_Type,P3k_LOB,P4k_Brand) ALL MEMBER NAME 'ALL PRODUCTS';
There is an additional
DETERMINES line in above sql for each level apart from
Brand, this is how we can specify the relationship between level keys. If we take the
Product_LOB example, the
DETERMINES(P4k_Brand) defines that any LOB in our table automatically determines a Brand (in OBIEE terms that LOB is a child of Brand).
Next step is defining a hierarchy on top of the attribute dimension
D1_PRODUCTS defined above. We can create it just by specifying:
- the attribute dimension to use
- the list of levels and the relation between them
which in our case becomes
CREATE OR REPLACE HIERARCHY PRODUCT_HIER CLASSIFICATION caption VALUE 'Products Hierarchy' USING D1_DIM_PRODUCT (Product_Details CHILD OF Product_Type CHILD OF Product_LOB CHILD OF BRAND);
When looking into the hierarchy
Product_hier we can see that it's creating an OLAP-style dimension with a row for each member at each level of the hierarchy and extra fields like
The columns contained in
- One for each Attribute defined in attribute dimension
- The member name, caption and description and unique name
- The level name in the hierarchy and related depth
- The relative order of the member in the hierarchy
- A field
IS_LEAFflagging hierarchy endpoints
- References to the parent level
Member Unique Names
A particularity to notice is that the
Cell Phones is
[PRODUCT_TYPE].& which is the concatenation of the
LEVEL and the
One could expect the member unique name being represented as the concatenation of all the preceding hierarchy members, Brand and LOB, and the member key itself in a string like
This is the default behaviour, however in our case is not happening since we set the
DETERMINES(P3k_LOB,P4k_Brand) in the attribute dimension definition. We Specified that Brand (
) and LOB (
) can automatically be inferred by the Product Type so there is no need to store those values in the member key. We can find the same setting in OBIEE's Product Type logical level
D0 Dim Date can be built starting from the table
SAMP_TIME_DAY_D following the same process as above. Like in OBIEE, some additional settings are required when creating a time dimension:
DIMENSION TYPE TIME: the time dimension type need to be specified
LEVEL TYPE <LEVEL_NAME>: each level in the time hierarchy needs to belong to a precise level type chosen from:
Taking into consideration the additional settings, the
Dim Date column mappings in above image can be translated in the following attribute dimension SQL definition.
CREATE OR REPLACE ATTRIBUTE DIMENSION D0_DIM_DATE DIMENSION TYPE TIME USING SAMP_TIME_DAY_D ATTRIBUTES (CALENDAR_DATE AS TOO_CALENDAR_DATE, PER_NAME_MONTH AS T02_PER_NAME_MONTH, PER_NAME_QTR AS T03_PER_NAME_QTR, PER_NAME_YEAR AS T04_PER_NAME_YEAR, DAY_KEY AS T06_ROW_WID, BEG_OF_MTH_WID AS T22_BEG_OF_MTH_WID, BEG_OF_QTR_WID AS T23_BEG_OF_QTR_WID ) LEVEL CAL_DAY LEVEL TYPE DAYS KEY TOO_CALENDAR_DATE ORDER BY TOO_CALENDAR_DATE DETERMINES(T22_BEG_OF_MTH_WID, T23_BEG_OF_QTR_WID,T04_PER_NAME_YEAR) LEVEL CAL_MONTH LEVEL TYPE MONTHS KEY T22_BEG_OF_MTH_WID MEMBER NAME T02_PER_NAME_MONTH ORDER BY T22_BEG_OF_MTH_WID DETERMINES(T23_BEG_OF_QTR_WID,T04_PER_NAME_YEAR) LEVEL CAL_QUARTER LEVEL TYPE QUARTERS KEY T23_BEG_OF_QTR_WID MEMBER NAME T03_PER_NAME_QTR ORDER BY T23_BEG_OF_QTR_WID DETERMINES(T04_PER_NAME_YEAR) LEVEL CAL_YEAR LEVEL TYPE YEARS KEY T04_PER_NAME_YEAR MEMBER NAME T04_PER_NAME_YEAR ORDER BY T04_PER_NAME_YEAR ALL MEMBER NAME 'ALL TIMES';
You may have noticed a different mapping of keys, member names and order by attributes. Let's take the
CAL_MONTH as example. It's defined by two columns
BEG_OF_MTH_WID: used for joins and ordering
PER_NAME_MONTH: used as "display label"
PER_NAME_MONTH in the
YYYY / MM format could be also used for ordering, but most of the times end user requests months in the
MM / YYYY format. Being able to set a ordering column different from the member name allows us to properly manage the hierarchy.
Time hierarchy follows the same rules as the product one, no additional settings are required.
CREATE OR REPLACE HIERARCHY TIME_HIER USING D0_DIM_DATE (CAL_DAY CHILD OF CAL_MONTH CHILD OF CAL_QUARTER CHILD OF CAL_YEAR);
The last step in the journey is the definition of the analytic view of the fact table that as per Oracle's documentation
An analytic view specifies the source of its fact data and defines measures that describe calculations or other analytic operations to perform on the data. An analytic view also specifies the attribute dimensions and hierarchies that define the rows of the analytic view.
The analytic view definition contains the following specifications:
- The data source: the table or view that will be used for the calculation
- The columns: which columns from the source objects to use in the calculations
- The attribute dimensions and hierarchies: defining both the list of attributes and the levels of the analysis
- The measures: a set of aggregations based on the predefined columns from the data source.
Within analytical views definition a materialized view can be defined in order to store aggregated values. This is a similar to OBIEE's Logical Table Source setting for aggregates.
Analytic View Definition
For the purpose of the post I'll use
SAMP_REVENUE_F which is one of the sources of
F0 Sales Base Measures in Sampleapp. The following image shows the logical column mapping.
The above mappings can be translated in the following SQL
CREATE OR REPLACE ANALYTIC VIEW F0_SALES_BASE_MEASURES USING SAMP_REVENUE_F DIMENSION BY (D0_DIM_DATE KEY BILL_DAY_DT REFERENCES TOO_CALENDAR_DATE HIERARCHIES ( TIME_HIER DEFAULT), D1_DIM_PRODUCT KEY PROD_KEY REFERENCES P0_Product_Number HIERARCHIES ( PRODUCT_HIER DEFAULT) ) MEASURES (F1_REVENUE FACT REVENUE AGGREGATE BY SUM, F10_VARIABLE_COST FACT COST_VARIABLE AGGREGATE BY SUM, F11_FIXED_COST FACT COST_FIXED AGGREGATE BY SUM, F2_BILLED_QTY FACT UNITS, F3_DISCOUNT_AMOUNT FACT DISCNT_VALUE AGGREGATE BY SUM, F4_AVG_REVENUE FACT REVENUE AGGREGATE BY AVG, F21_REVENUE_AGO AS (LAG(F1_REVENUE) OVER (HIERARCHY TIME_HIER OFFSET 1)) ) DEFAULT MEASURE F1_REVENUE;
Some important parts need to be highlighted:
USING SAMP_REVENUE_F: defines the analytic view source, in our case the table
DIMENSION BY: this section provides the list of dimensions and related hierarchies to take into account
KEY BILL_DAY_DT REFERENCES TOO_CALENDAR_DATE: defines the join between the fact table and attribute dimension
HIERARCHIES (TIME_HIER DEFAULT): multiple hierarchies can be defined on top of an attribute dimension and used in an analytical view, however like in OBIEE only one will be used by default
F1_REVENUE FACT REVENUE AGGREGATE BY SUM: defines the measure with alias, source column and aggregation method
F2_BILLED_QTY FACT UNITS: if aggregation method is not defined it replies on default
F21_REVENUE_AGO: new metrics can be calculated based on previously defined columns replicating OBIEE functions like time-series. The formula
(LAG(F1_REVENUE) OVER (HIERARCHY TIME_HIER OFFSET 1))calculates the equivalent of the OBIEE's
AGOfunction for each level of the hierarchy.
DEFAULT MEASURE F1_REVENUE: defines the default measure of the analytic view
Using Analytic Views
After the analytic view definition, it's time to analyse what benefits end users have when using them. We are going to take a simple example: a query to return the Revenue and Billed Qty per Month and Brand.
Using only the original tables we would have the following SQL
SELECT D.CAL_MONTH, D.BEG_OF_MTH_WID, P.BRAND, SUM(F.REVENUE) AS F01_REVENUE, SUM(F.UNITS) AS F02_BILLED_QTY FROM SAMP_REVENUE_F F JOIN SAMP_PRODUCTS_D P ON (F.PROD_KEY = P.PROD_KEY) JOIN SAMP_TIME_DAY_D D ON (F.BILL_DAY_DT = D.CALENDAR_DATE) GROUP BY D.CAL_MONTH, D.BEG_OF_MTH_WID, P.BRAND ORDER BY D.BEG_OF_MTH_WID, P.BRAND;
The above SQL requires the knowledge of:
- Aggregation methods
- Group by
Even if this is an oversimplification of the analytic view usage you can already spot that some knowledge of the base data structure and SQL language is needed.
Using the analytic views defined above, the query can be written as
SELECT TIME_HIER.MEMBER_NAME AS TIME_SLICE, PRODUCT_HIER.MEMBER_NAME AS PRODUCT_SLICE, F1_REVENUE, F2_BILLED_QTY FROM F0_SALES_BASE_MEASURES WHERE TIME_HIER.LEVEL_NAME IN ('CAL_MONTH') AND PRODUCT_HIER.LEVEL_NAME IN ('BRAND') ORDER BY TIME_HIER.HIER_ORDER, PRODUCT_HIER.HIER_ORDER;
As you can see, there is a simplification of the SQL statement: no more aggregation, joining conditions and group by predicates are needed. All the end-user has to know is the analytical view name, and the related hierarchies that can be used.
The additional benefit is that if we want to change the level of granularity of the above query we just need to change the
WHERE condition. E.g. to have the rollup per Year and LOB we just have to substitute
WHERE TIME_HIER.LEVEL_NAME IN ('CAL_MONTH') AND PRODUCT_HIER.LEVEL_NAME IN ('BRAND')
WHERE TIME_HIER.LEVEL_NAME IN ('CAL_YEAR') AND PRODUCT_HIER.LEVEL_NAME IN ('LOB')
without touching granularity, group by and order by statements.
Using Analytic Views in DVD
At the beginning of my blog post I wrote that Analytic Views could be useful when used in conjunction with self-service BI tools. Let's have a look at how the end user journey is simplified in the case of Oracle's Data Visualization Desktop.
Without AV the end-user had two options to source the data:
- Write the complex SQL statement with joining condition, group and order by clause in the SQL editor to retrieve data at the correct level with the related dimension
- Import the fact table and dimensions as separate datasources and join them together in DVD's project.
Both options require a SQL and joining conditions knowledge in order to being able to present correct data. Using Analytic Views the process is simplified. We just need to create a new source pointing to the database where the analytic views are sitting.
Next step is retrieve the necessary columns from the analytic view. Unfortunately analytic views are not visible from DVD object explorer (only standard table and views are shown)
We can however specify with a simple SQL statement all the informations we need like Time and Member Slice, the related levels and the order in hierarchy.
SELECT TIME_HIER.MEMBER_NAME AS TIME_SLICE, PRODUCT_HIER.MEMBER_NAME AS PRODUCT_SLICE, TIME_HIER.LEVEL_NAME AS TIME_LEVEL, PRODUCT_HIER.LEVEL_NAME AS PRODUCT_LEVEL, TIME_HIER.HIER_ORDER AS TIME_HIER_ORDER, PRODUCT_HIER.HIER_ORDER AS PRODUCT_HIER_ORDER, F1_REVENUE, F2_BILLED_QTY FROM F0_SALES_BASE_MEASURES ORDER BY TIME_HIER.HIER_ORDER, PRODUCT_HIER.HIER_ORDER;
You may have noted that I'm not specifying any
WHERE clause for level filtering: as end user I want to be able to retrieve all the necessary levels by just changing a filter in my DVD project. After including the above SQL in the datasource definition and amending the measure/attribute definition I can start playing with the analytic view data.
I can simply include the dimension's
MEMBER_NAME in the graphs together with the measures and add the
LEVEL_NAME in the filters. In this way I can change the graph granularity by simply selecting the appropriate
LEVEL in the filter selector for all the dimensions available.
One particular to notice however is that all the data coming from various columns like date, month and year are "condensed" into a single
VARCHAR column. In case of different datatypes (like date in the time dimension) this will prevent a correct usage of some DVD's capabilities like time series or trending functions. However if a particular type of graph is needed for a specific level, either an ad-hoc query or a casting operation can be used.
In this blog post we analysed the Analytic Views, a new component in Oracle Database 12.2 and how those can be used to "move" the metadata modeling at DB level to provide an easier query syntax to end-users.
Usually metadata modeling is done in reporting tools like OBIEE that offers additional set of features on top of the one included in analytic views. However centralized reporting tools like OBIEE are not present everywhere and, with the wave of self-service BI tools, analytic views represent a perfect method of enabling users not familiar with SQL to simply query their enterprise data.
If you are interested in understanding more about analytic views or metadata modeling, don't hesitate to contact us!
If you want to improve the SQL skills of your company workforce, check out our recently launched SQL for beginners training!