Oracle BI EE 10.1.3.4.1 – Sub-Totals & Pivot Calculations
November 17th, 2009 by Venkatakrishnan J
One intriguing or not-so obvious feature of BI EE is the way it does its Sub-Totals & Pivot Table calculations. As with any reporting tool, we expect almost every part of a report to be pushed back to the database in the form of native SQL. But there are certain cases wherein we cannot push an entire report in the form of SQL (for example Sub-Totals & Pivot table aggregations). In such cases, BI Server does these calculations in its own memory. Lets look at what this “in-memory calculation” means and how BI Server treats totals & Pivot table calculations. Since this is not documented anywhere, this behavior can be different across different releases (though i think it will work the same in all 10.1.3.* releases)
Sub-Totals:
To demonstrate how Sub-Totals work, lets start with a very simple report shown below

As you see, this table view report has a Grand Total. The question is how does BI Server create this Total. Does it bring the entire data into BI Server memory and then does the sub-total or does it push that back in SQL or does it write to any temporary file.
To understand this, lets start with looking at the logical SQL in the advanced tab of this report.
SELECT Products.PROD_CATEGORY saw_0, Channels.CHANNEL_DESC saw_1, Sales.AMOUNT_SOLD saw_2 FROM "SH - Training Days - Relational" ORDER BY saw_0, saw_1
As you see this logical SQL does not actually represent the true logical SQL since the aggregation or the total part of the query is not anywhere in this SQL. So, to get the exact SQL we have to go to the Manage->Sessions & extract the logical SQL

SELECT Products.PROD_CATEGORY saw_0, Channels.CHANNEL_DESC saw_1, Sales.AMOUNT_SOLD saw_2, REPORT_SUM(saw_2 BY ) FROM "SH - Training Days - Relational" ORDER BY saw_0, saw_1
If you notice, there will be a new function called REPORT_SUM which will be passed to the logical SQL. REPORT_SUM is an aggregation function that BI Server uses to calculate the total (since we had chosen Report Based Total whenever applicable in the table view). One good thing is we can take the exact SQL and fire it in the logical SQL window

As you see BI Server calculates the total as a column based total. There are quite a few functions like REPORT_SUM. I will list some of the common ones here
1. SUM
2. AGGREGATE
3. MIN
4. MAX
5. COUNT
6. COUNTDISTINCT
After calculating the column based total, BI Server will get the first record value in the Report Sum column and display it as Sub-Total. Now that we know what BI Server does for calculating the sub-totals in a table view, lets look at how the BI Server pushes the REPORT_SUM function. If you look at the physical query for the above logical SQL, you will notice that the report level total is not pushed back to the database. So, the question is how BI EE does this total. To determine this lets look at the {OracleBIData}/tmp folder. You will notice that for every query that has a sub-total (with database cache and presentation services cache disabled) BI EE will create temporary files in this tmp directory. The writing of temp files to this directory is governed by 4 settings in the NQSConfig.ini
WORK_DIRECTORY_PATHS = "C:\Oracle\OracleBIData\tmp"; SORT_MEMORY_SIZE = 4 MB ; SORT_BUFFER_INCREMENT_SIZE = 256 KB ; VIRTUAL_TABLE_PAGE_SIZE = 128 KB ;
So, when we created the above report, BI Server had created a set of temp files in the above directory. These tmp files are actually similar to the cache files but are stored seperately. But how they are handled is unfortunately not documented (though we can arrive at certain conclusions by trying various scenarios). In some complex reports, when the sub-total calculation involves a lot of intermediate rows, you will notice a sizeable increase in the size of the tmp files

Pivot Tables:
Pivot table is probably the most widely used Answers component. Though widely used, many a times it can eventually lead to performance bottlenecks due to a lot of in-memory calculations. Most Pivot calculations are done in the BI Server memory and are not pushed back to the database(though we can control where it is executed to an extent). As a generic rule, in any implementation, if the repository is correctly designed, almost 90 to 95% of the reports can be achieved using normal table views(unless all the reports require a specific feature like Sub-Total at the top etc). The most common reason why pivot table is quite popular is the fact that it offers flexibility to end users in slicing/dicing the data. It is treated as an alternative to a multi-dimensional reporting solution.Pivot table can sometimes hog the temp space so much that even cache writing process might fail due to memory allocation errors. For example, lets take the above report and convert it into a pivot table as shown below

If you notice, all i have done is i have moved a couple of columns to the Excluded section of the Pivot table. If you look at the Physical query, the SQL will still contain Product Id and Channel Desc even though they are excluded.
select T24112.PROD_ID as c2,
T24112.PROD_CATEGORY as c4,
T26412.CHANNEL_ID as c6,
sum(T24170.AMOUNT_SOLD) as c8,
sum(T24170.QUANTITY_SOLD) as c9
from
CHANNELS T26412,
PRODUCTS T24112,
SALES T24170
where ( T24112.PROD_ID = T24170.PROD_ID and T24170.CHANNEL_ID = T26412.CHANNEL_ID )
group by T24112.PROD_CATEGORY, T24112.PROD_ID, T24170.CHANNEL_ID, T26412.CHANNEL_ID
order by c6
So what this means is BI Server will do all the aggregations in its own memory or by writing to tmp files. If we look at the tmp file directory

As you see, a sudden 59KB file has come into the tmp file directory due to the amount of custom calculations that the BI Server has to do in its memory. The size of the files directly are related to the amount of calculations & the number of cells retrieved. So, whenever you face an issue like a Write to Tmp directory error etc, then the first place to look at is how much of calculations are being done by the BI Server. Also, as much as possible try to reduce the amount of calculations to be done by the pivot table. As more users start accessing the system, more will the number of tmp files and hence more will be the I/O. This is something to take notice of while creating Pivot Table reports.

November 18th, 2009 at 8:42 am
Hi Venkat,
The temporary files can be really big (gigabytes) when working with complex large pivot tables. Another thing to note that it seems that when running Answers these temporary files are generated on the server where Presentation Services are located and when running Scheduler files are generated on the server where Scheduler is running. One small thing to consider when planning the architecture and disk sizes of each server.
Then there is also a bug in Linux/Unix version of 10.1.3.4, which limits the temporary files to 2GB. More info can be found here for example: http://obiee-blog.info/bugs-and-issues/nqserror-46073-analytics-temp-files-cannot-be-larger-than-2gb/
BR,
Kimmo
November 18th, 2009 at 8:45 am
Forgot to say that OBIEE doesn’t clear out the temporary files if the maximum file size was reached or there was some other error (e.g. user cancelled the query) when generating the temporary files. Files are only cleared when restarting the services or with your own custom scripts.
BR,
Kimmo
November 18th, 2009 at 8:49 am
@Kimmo – Thats correct. I have seen temp files going in the range of 3 to 5 GB with a single user working on a set of pivot tables. Unfortunately there is no compression that BI EE does while writing to the disk. Apart from compression, this I/O activity can take significant amount of time while working on a set of big pivot tables. This is in addition to the SQL time. Thats the reason why i always recommend restricting the use of Pivot tables. The functionality in most cases can be achieved by using normal table view.
-Venkat
November 25th, 2009 at 9:21 am
I’ve also noticed that OBIEE generates a 2nd physical SQL statement just to calculate the Pivot table totals. This causes problems in cases such as showing the true Sales Value / 1000000 for instance. Due to rounding the pivot table rows will never reconcile with the Total. You can obviously debate that they are both correct but from a user perspective it is totally confusing. Any ideas why OBIEE sometimes generates a second query v’s the approach you describe above? Oracle Support can’t help! I had a pivot table which did not generate 2 queries, I copied it to a new Request and made one or 2 small mods and the new one started generating 2 queries!
November 25th, 2009 at 9:35 am
@Steve – It is very much possible that Pivot tables can generate another physical SQL for calculating the Sub-Totals. This is to an extent controlled by the “Report Based Sub-totals wherever possible” setting. Unfortunately we cannot exactly say when BI Server will switch between temp file based sub-totals to another physical SQL when the above setting is turned on. But in most cases, we can control whats happening through a combination of repository design and report level aggregations.
November 26th, 2009 at 8:09 am
“As a generic rule, in any implementation, if the repository is correctly designed, almost 90 to 95% of the reports can be achieved using normal table views”
I have seen the opposite, whereby 95% of reports were pivottables (this was not using OBIEE). The customer required a view of all offices, subtotalled by region BY time i.e weeks or months. So a Pivottable was required for each measure(report) with Region and Office down the left and Time across the top. We are considering using OBI but it seems we will hit performance issues unless this type pf report can be created as a table or we use OLAP?
November 26th, 2009 at 10:26 am
@Harry – Well i have seen the same case even with BI EE implementations as well. The problem is end users are comfortable with pivot tables since they are similar to Excel. The example that you have pointed out where we require sub-totaled measures by Time (wks or months) etc are all possible using custom Sum/Max/Min/etc By calculations even in a normal table view. The approach i always use is to start with table view. I think of all possible options to create the reports using Table View itself. If the report layout is so complex that a table view cannot be used, only then will i go in for a Pivot View.