OBIEE 12c Time Dimension: Logical Sequence Number

The key component of any successful OBIEE implementation is the metadata model known as the repository (RPD). Doing it right is sometimes considered "black magic", and small mistakes in the RPD can impact all the exposed Subject Areas, resulting in poor performances or, even worse, wrong results.

Working an RPD requires dedicated knowledge of the tool and we are sharing it in our RPD modelling training both for OBIEE 11g and OBIEE 12c.

If you ever worked on RPD modelling, one of the settings you surely encountered is the Time dimension. This blog post written back in 2007 explains the process of setting up a time dimension for OBIEE 10g. The process didn't have any major modifications until recently when, in 12.2.1, Logical Sequence Numbers were introduced. As per Oracle's documentation this new feature "optimizes time series functions and in some cases improves query time", and in this post we'll see how to configure it and its impact on the time-series calculations. The examples shown below are based on Oracle Sampleapp v607, a really good source of modelling and front-end examples.

Usual Time-series Query Behaviour

Time-series functions like Ago,ToDate, and more recently PeriodRolling, allow end users to compare results coming from different moments just by specifying the level in the time dimension hierarchy and the number of periods to look backwards or forwards. As example if you needed to compare current month sales revenue with the previous month figure you'll end up writing a formula like

 AGO("F0 Sales Base Measures"."1- Revenue","H0 Time"."Month", 1)

Where:

  • AGO: is the Time-series function being called
  • "F0 Sales Base Measures"."1- Revenue": is the metric
  • "H0 Time"."Month": is the time hierarchy level
  • 1: is the amount of periods (months in our case) to look back in history

Once the time-series metric has been created, it can be used in an analysis like the following to compare Revenue of 2015-04 with the one of the previous month.

Front End Analysis

The analysis generates the following Logical SQL which basically lists the columns retrieved and the filters applied.

SELECT  
   0 s_0,
   "A - Sample Sales"."Time"."T02 Per Name Month" s_1,
   "A - Sample Sales"."Base Facts"."1- Revenue" s_2,
   "A - Sample Sales"."Time Series"."127  Mth Ago Rev  (Fix Time Lvl)" s_3
FROM "A - Sample Sales"  
WHERE  
("Time"."T02 Per Name Month" = '2015 / 04')
ORDER BY 2 ASC NULLS LAST  
FETCH FIRST 5000001 ROWS ONLY  

The translation to source SQL hugely depends on the data model created in the RPD and on the data source type. In our example an Oracle SQL gets generated containing the following steps:

  • Sequence Generation: a RANK function is used to created a dense sequence based on Per_Name_Month, the chronological key defined in the time-hierarchy for the month level. Mth_Key is also part of the query since it's used in the join between dimension and fact table.
WITH  
OBICOMMON0 AS (select DENSE_RANK() OVER ( ORDER BY T653.Per_Name_Month) as c1,  
     T653.Mth_Key as c2,
     T653.Per_Name_Month as c3
from  
     BISAMPLE.SAMP_TIME_MTH_D T653 /* D02 Time Month Grain */ ),
  • Period Shifting: The sequence generated above is now shifted by the number of periods defined in the column formula (D1.c1 + 1), in our example 1.
SAWITH0 AS (select D1.c1 + 1 as c1,  
     D1.c2 as c2,
     D1.c3 as c3
from  
     OBICOMMON0 D1),
  • Ago Period Query: Using the period shifted query the historical record (or set of records) is retrieved.
SAWITH1 AS (select distinct D1.c1 as c1,  
     D1.c3 as c2
from  
     OBICOMMON0 D1),
SAWITH2 AS (select sum(T418.Revenue) as c1,  
     D3.c2 as c2
from  
     BISAMPLE.SAMP_REVENUE_F T418 /* F10 Billed Rev */ ,
     SAWITH0 D4,
     SAWITH1 D3
where  ( T418.Bill_Mth_Key = D4.c2 and D3.c1 = D4.c1 and D3.c2 = '2015 / 04' )  
group by D3.c2, D4.c3),  

The period shifted query usage is explained visually by the image below

Sequence shift in usage

  • Selected Period Query: the query for the selected period, in our case 2015-04, is executed using standard time dimension
SAWITH3 AS (select sum(T418.Revenue) as c1,  
     T653.Per_Name_Month as c2
from  
     BISAMPLE.SAMP_TIME_MTH_D T653 /* D02 Time Month Grain */ ,
     BISAMPLE.SAMP_REVENUE_F T418 /* F10 Billed Rev */ 
where  ( T418.Bill_Mth_Key = T653.Mth_Key and T653.Per_Name_Month = '2015 / 04' )  
group by T653.Per_Name_Month)
  • Resultsets joining: Results coming from Ago Period and Selected Period queries are then joined with an outer join.
select D1.c1 as c1, D1.c2 as c2, D1.c3 as c3, D1.c4 as c4 from ( select D1.c1 as c1,  
     D1.c2 as c2,
     D1.c3 as c3,
     D1.c4 as c4
from  
     (select 0 as c1,
               coalesce( D1.c2, D2.c2) as c2,
               D2.c1 as c3,
               D1.c1 as c4,
               ROW_NUMBER() OVER (PARTITION BY coalesce( D1.c2, D2.c2) ORDER BY coalesce( D1.c2, D2.c2) ASC) as c5
          from 
               SAWITH2 D1 full outer join SAWITH3 D2 On D1.c2 = D2.c2
     ) D1
where  ( D1.c5 = 1 )  
order by c2 ) D1 where rownum <= 5000001  

As you can see, it's a rather complex set of instructions that involves several steps including analytical functions like the DENSE_RANK() in the sequence generation. In our case we have been lucky that the source system provided the DENSE_RANK() function; with other sources the sequence generation needs to be calculated directly by the BI Server (OBIS) with a possible degradation of performances.

What is Logical Sequence Number?

The Sequence Generation step mentioned above can sometimes be the bottleneck of the whole query especially when the time dimension is complex or huge in volume since DENSE_RANK() is a costly analytical function.

All that OBIEE's Time-series need to work is a pure sequence, or in Oracle's words:

Sequence numbers are enumerations of time dimensional members at a certain level. The enumeration must be dense (no gaps) and must correspond to a real time order. For example, months in a year can be enumerated from 1 to 12.

Then what if we can find a way of pre-calculating them and storing in the table or calculate them on the fly but using functions less expensive than a DENSE_RANK()?

This is the idea behind the Logical Sequence Number (LSN): a way of avoiding the BI Server (OBIS) needing to execute the DENSE_RANK(), by passing either a pre-calculated sequence column in the source table or a cheaper-cost function to calculate it on the fly based on existing columns.

The formula behind a Logical Sequence Number must resolve in a number (no varchar, or dates are allowed) and can either be:

  • Absolute: when the sequence is consistent and doesn't need any external reference, e.g. calendar years are a self-defined sequence
  • Relative: when the sequence is relative to a parent level, e.g. creating a sequence for months between 1 and 12 would need the calendar year as parent level to provide correct cross-years references

As the word already says "Sequence Numbers" must be sequential, no gaps can be included. Year and month number are good examples of this.

But what if we wanted to provide an absolute sequence number at month level?
One could think about using a month key in the YYYYMM format but this is not a sequence: 201701 <> 201612+1. The best way of generating the sequence number would be to add it as a column in the database table.

Will take as example the SAMP_TIME_MTH_D table that can be found in Sampleapp v607 containing MONTH_KEY column in the YYYYMM format.
Month Key

I'll add to SAMP_TIME_MTH_D a column MONTH_ABS_SEQ containing the absolute sequence number. The formula behind the column is exactly what Oracle was doing under the covers using a DENSE_RANK.

SELECT MTH_KEY, DENSE_RANK() OVER (ORDER BY MTH_KEY) MONTH_ABS_SEQ FROM SAMP_TIME_MTH_D  

And the end result as expected being

Absolute Sequence Number Month

How are Logical Sequence Number Used?

Previously I described the two types of Logical Sequence Numbers: Absolute and Relative, each one has it use case:

  • PERIODROLLING: This function uses absolute LSN to calculate the starting Sequence Number based on the selected one: e.g. looking at the previous image a 6 month rolling starting from 2009-02 will include data from sequences in the range [9-14].
  • TO_DATE: uses relative LSN, e.g A YearToDate Measure shown by month will use the month relative LSN to calculate the previous months in the same year.
  • AGO: Ago function uses both absolute and relative: Absolute numbers are used if the grain of the query and the one of the AGO are at the same level e.g. Yearly analysis of Actual vs Previous Year. However when the grain of the shift in the ago is higher than the grain of the query Relative LSN are used, e.g. Monthly analysis of Actual vs Previous Year.

How to Configure Logical Sequence Numbers

Logical Sequence Number is a new feature and as such it requires additional settings in the time dimension hierarchy to be working. In our case we'll add two sequences, an absolute at calendar year level and a relative at calendar month level.
We'll add the sequences directly as formulas in the repository however those formulas should be pushed down as columns in the database table if optimal performances are sought.

In order to set the Logical Sequence Numbers we need to open the RPD (the SampleApp one in this test), and select the Time dimension we want to change.
Time Hierarchy

After selecting any level apart from the Grand Total (top of the hierarchy) a new tab called "Sequence Numbers" should be visible. However if, like in our case, none of the columns at that level are integer or doubles, the sequence number selector is disabled.

SN Disabled

In order to enable the selector we first need to create a sequence column in our dimension and bring it at the correct level in the hierarchy. For the Year Level there is already an integer column named "T35 Cal Year" which can be used as sequence. We need simply to drag the column at Year level in the Time hierarchy and set it as Absolute Sequence.

SN Year Setting

I can do the same with the Month level in the hierarchy and the "T32 Cal Month" column. Note that the column contains only the months enumeration from 1 till 12 so we need to set the sequence number as Relative to the level Year.

SN Month Setting

Please note that both absolute and relative LSN can be (and should be) entered since as discussed above each have a different use cases. In addition relative LSN should be set for all logical parents level in the hierarchy since they will be used only if the grain of the time shift matches the one of the parent level. For example a Monthly LSN based on Year logical level will only be used in AGO functions having a year shift and not in case of Quarterly shift.

For an optimal usage every level of the time hierarchy should have one absolute and a relative LSN for each of the parents level in the hierarchy.

Impact on SQL

It's time now to review the SQL generated by our analysis and check the differences with the old-school time-series query.

When creating an analysis at year level like the following
Year Analysis

As Expected the BI Server (OBIS) uses the CAL_YEAR column as sequence instead of the DENSE_RANK() function over the PER_NAME_YEAR column.

WITH  
OBICOMMON0 AS (select T795.CAL_YEAR as c1,  
     T795.QTR_KEY as c2,
     T795.PER_NAME_YEAR as c3
from  
     BISAMPLE.SAMP_TIME_QTR_D T795 /* D03 Time Quarter Grain */ )

While when using the TO_DATE both the relative sequence is used, like in the following example where the measure "166 Revenue Year To Date" is defined by the formula:

TODATE("01 - Sample App"."F0 Sales Base Measures"."1- Revenue",  "01 - Sample App"."H0 Time"."Year" )  

To Date

The following query gets generated, note the usage of Cal_Year and Cal_Month in the sequence generation query instead of the DENSE_RANK() function as per RPD settings mentioned above.

WITH  
OBICOMMON0 AS (select T653.Cal_Year as c1,  
     T653.Cal_Month as c2,
     T653.Per_Name_Month as c3,
     T653.Per_Name_Year as c4
from  
     BISAMPLE.SAMP_TIME_MTH_D T653 /* D02 Time Month Grain */ )
...

Are Logical Sequence Numbers Useful?

Most of the times the bottleneck when using Time Series is not in the sequence generation, since the time dimension cardinality is rarely big enough to produce a noticeable delay in the query time, but rather in the poorly managed calculations made on query time on top of massive fact tables.

Don't expect LSN to solve all your performance problems with Time Series. However, the usage of Logical Sequence Numbers provides to OBI a way of pre-cooking part of the calculation and so in theory should help performance. The small effort required to set them up centrally in the time hierarchy is covered by the benefits during query time, without having to touch any pre-defined time-series calculation.

If you do have performance problems with your OBIEE system, or would like to ensure that a system you’re building will be performant from the outset, please get in touch to find out more about our Performance Analytics service!
We also provide expert OBIEE training, implementations, QA and health checks - to find out more about how we can help you, please contact us!

Subscribe to Rittman Mead

Get the latest posts delivered right to your inbox.

or subscribe via RSS with Feedly!

Subscribe to Our
Monthly Newsletter!

* indicates required