Resolving Fan Traps and Circular Joins using OBIEE

If you're looking to migrate data from a Discoverer system to Oracle Business Intelligence Enterprise Edition, one of the first things you'll probably end up thinking about is how your (not necessarily star schema-based) End User Layer will translated to a (star schema mandatory) Common Enterprise Information Model. In particular, Discoverer allows tables to have more than one join route between them, and it also allows circular joins so that, for example, an employee can belong to a department, which belongs to a region, which has an employee as the regional manager.

Circularjoin1-2

If you try and bring this arrangement in to the Physical Layer of your OBIEE semantic model, the administration tool will initially allow this and your repository verifies as being correct.

Circularjoins3

However if you then drag and drop the tables into the logical business model layer, all three lookup tables are shown as dimension tables, as you would expect,

Circularjoins4

and the BI Administrator tool shows an error message when you try and verify the repository, saying that circular joins aren't supported.

Circularjoins5

In reality, you don't want three dimension tables (departments, regions, employees) joining to the fact table (appointments), you just want one, so you're going to have to use the mapping ability in the business model and mapping layer to join these three lookup tables together under one logical table source. For the employee table though, rather than join it through this circular join, you need to create an alias for it in the physical layer (in my case I call it CIRC_JOIN_REGEMP) like this:

Circularjoin6

and join the region table to this rather than the employees table.

Circularjoin7

Then you can create a single logical dimension table within your logical business model, and map the logical table source to the join between the four lookup tables, including your new alias for the employee table.

Circularjoins8

When you add the new columns into the logical table and then run a report, the figures come up as expected, with the regional manager details displayed correctly, the alias having resolved the circular join.

Circjoin11

Another issue that often comes up with Discoverer is "Fan Traps". Fan traps are an oddity of SQL where, if you have say a dimension table full of account details, and two facts that join to it, say with sales figures in one and budget figures in the other, logically if you query the two together you should get a set of results like this, taken from the Oracle document "Oracle Discoverer's Fan Trap Resolution - Correct Results Every Time"

Fantrap1

Fantrapexpectedres

The problem is though, that if you issue the obvious query in SQL, the results come back wrong, because SQL does a cartesian join between the two fact tables and over-counts the figures.
SQL> select a.acc_name,
2  sum(sales),
3  sum(budget)
4  from fan_trap_accounts a, fan_trap_sales s, fan_trap_budget b
5  where a.acc_id = s.acc_id
6  and   a.acc_id = b.acc_id
7  group by a.acc_name
8  order by a.acc_name
9  /

ACC_NAME SUM(SALES) SUM(BUDGET)


ACCOUNT1 900 1050
ACCOUNT2 130 200
ACCOUNT3 600 750
ACCOUNT4 600 600

Now Discoverer handles this quite well, and automatically rewrites the query internally to use two in-line views, one to join the sales and accounts tables, and one to join the budget and accounts tables, and then join the results together with the correct aggregation applied. So how does OBIEE handle this? Does it ignore the potential error and give you the wrong figures, or does it even disallow this sort of schema design in the first place? Let's take a look.

The BI Administrator tool doesn't have a problem with this sort of arrangement in the physical layer:

Fantrap2

The logical business model is fine as well, with the sales and budget tables being displayed as fact tables, and the accounts table as a dimension, as you would expect.

Fantrap3

In fact, when you run a query, using the same selection as our SQL statement, the figures do in fact come out correct.

Fantrap4-1

So what's happening? Well, if you take a look at the query log and see how the logical, and physical, SQL queries are put together, you can see that OBIEE is doing what Discoverer does, but within the BI Server rather than writing a single SQL statement that uses inline views. The BI Server effectively creates two inline views that in turn lead to two physical SQL statements being sent to Oracle, the BI Server then joins them together and returns the data back as a single result set.
-------------------- Execution plan:

RqBreakFilter <<2750>>[1] [for database 0:0,0]
RqList <<2641>> [for database 0:0,0]
case when D903.c1 is not null then D903.c1 when D903.c2 is not null then D903.c2 end as c1 GB [for database 3023:5710,44],
D903.c3 as c2 GB [for database 3023:5710,44],
D903.c4 as c3 GB [for database 3023:5710,44]
Child Nodes (RqJoinSpec): <<2752>> [for database 0:0,0]
(
RqList <<2795>> [for database 0:0,0]
D902.c1 as c1 [for database 3023:5710,44],
D901.c1 as c2 [for database 3023:5710,44],
D901.c2 as c3 GB [for database 3023:5710,44],
D902.c2 as c4 GB [for database 3023:5710,44]
Child Nodes (RqJoinSpec): <<2798>> [for database 0:0,0]

                (
                    RqList &lt;&lt;2653&gt;&gt; [for database 3023:5710:ora11g,44]
                        FAN_TRAP_ACCOUNTS.ACC_NAME as c1 GB [for database 3023:5710,44],
                        sum(FAN_TRAP_SALES.SALES by [ FAN_TRAP_ACCOUNTS.ACC_NAME] ) as c2 GB [for database 3023:5710,44]
                    Child Nodes (RqJoinSpec): &lt;&lt;2686&gt;&gt; [for database 3023:5710:ora11g,44]
                        FAN_TRAP_ACCOUNTS T6678
                        FAN_TRAP_SALES T6685
                    DetailFilter: FAN_TRAP_ACCOUNTS.ACC_ID = FAN_TRAP_SALES.ACC_ID [for database 0:0]
                    GroupBy: [ FAN_TRAP_ACCOUNTS.ACC_NAME]  [for database 3023:5710,44]
                    OrderBy: c1 asc [for database 3023:5710,44]
                ) as D901 FullOuterStitchJoin &lt;&lt;2744&gt;&gt; On D901.c1 =NullsEqual D902.c1; actual join vectors:  [ 0 ] =  [ 0 ]

                (
                    RqList &lt;&lt;2690&gt;&gt; [for database 3023:5710:ora11g,44]
                        FAN_TRAP_ACCOUNTS.ACC_NAME as c1 GB [for database 3023:5710,44],
                        sum(FAN_TRAP_BUDGET.BUDGET by [ FAN_TRAP_ACCOUNTS.ACC_NAME] ) as c2 GB [for database 3023:5710,44]
                    Child Nodes (RqJoinSpec): &lt;&lt;2723&gt;&gt; [for database 3023:5710:ora11g,44]
                        FAN_TRAP_ACCOUNTS T6678
                        FAN_TRAP_BUDGET T6681
                    DetailFilter: FAN_TRAP_ACCOUNTS.ACC_ID = FAN_TRAP_BUDGET.ACC_ID [for database 0:0]
                    GroupBy: [ FAN_TRAP_ACCOUNTS.ACC_NAME]  [for database 3023:5710,44]
                    OrderBy: c1 asc [for database 3023:5710,44]
                ) as D902
    ) as D903
OrderBy: c1 asc [for database 0:0,0]

+++Administrator:2a0000:2a0001:----2008/08/26 17:24:48

-------------------- Sending query to database named ora11g (id: <<2653>>):

select T6678.ACC_NAME as c1,
sum(T6685.SALES) as c2
from
CUST_ORDER_HISTORY.FAN_TRAP_ACCOUNTS T6678,
CUST_ORDER_HISTORY.FAN_TRAP_SALES T6685
where ( T6678.ACC_ID = T6685.ACC_ID )
group by T6678.ACC_NAME
order by c1

+++Administrator:2a0000:2a0001:----2008/08/26 17:24:48

-------------------- Sending query to database named ora11g (id: <<2690>>):

select T6678.ACC_NAME as c1,
sum(T6681.BUDGET) as c2
from
CUST_ORDER_HISTORY.FAN_TRAP_ACCOUNTS T6678,
CUST_ORDER_HISTORY.FAN_TRAP_BUDGET T6681
where ( T6678.ACC_ID = T6681.ACC_ID )
group by T6678.ACC_NAME
order by c1

Now this strictly speaking isn't as efficient as Discoverer's approach, which uses a single SQL statement and does the join of the two inline views within the database, but it's still pretty clever and would work for platforms that don't support inline views. In a way its an "on the fly" creation of two logical table sources without you having to set this up explicitly in the logical business model.

One thing I haven't tried out yet is how OBIEE handles queries that, in Discoverer, would cause a fan-trap detection error, i.e. queries that it can't rewrite to use two or more inline views. I suspect that these sorts of scenarios, that usually involve facts of differing granularity, where circular joins exist or where the keys to the tables don't correspond would get picked up by the general data modeling rules in OBIEE, but if anyone's got any experiences with this then by all means leave a comment.

If you're wondering what prompted this posting, I'm putting the finishing touches to my Oracle Open World presentation on OBIEE data modeling, and looking to cover off a few data modeling "oddities". I'll try and cover support for ragged/unbalanced/value-based hierarchies in a posting tomorrow, and finish up with how the logical business model supports the Kimball concept of Bridge Tables, but no-one seems to use them.