It seems like a different life when I posted the introduction to reporting against transactional schemas. Since that time, we've had some exciting new customers come aboard, which has required a lot of my time (and which I have been glad to provide). We've also had Kscope 12, where I was a member of the BI content committee, and where Rittman Mead was a sponsor, and our own Jordan Meyer won Best Speaker for the BI Stream. And, we've opened our brand new office in Atlanta, and have our first public training event at that location in progress. All together... we've been busy at Rittman Mead America. But a promise is a promise... so here goes Part 2.
The first thing we have to approach is the intelligent use of alias tables in OBIEE. There's a best practice nugget out in the community that espouses "alias everything". You've heard this, right? It's not unlike the same advice I've been hearing in the database world now for going on two decades, that tells us to always use synonyms, or views, or provide some layer of abstraction. Abstraction I completely understand... but we already have two layers of abstraction in OBIEE without using a single alias. I think the argument goes something like this: "You have no idea when you'll need to introduce a new way of accessing a particular table... or when it will need to play multiple roles... so do yourself a favor and alias everything right out of the gate."
My response to this advice: maybe. The longer I've worked in IT in general and BI in particular, the more I treat any "absolute" commandment with suspicion. When reporting against a transactional system... we have a very complex set of tables and joins. Where dimensional models are characteristic of sparse normalization, OLTP schemas are spilling over the brim with it. And if we choose to "alias everything", as the good book says, then OBIEE's ability to import foreign key relationships, and therefore, joins, is completely wasted. We will have to go through the process of re-applying all those joins that we just imported because we want to replace our entire physical layer with aliases. So whatever flexibility you gain by aliasing everything... remember that it comes with a price: the need to redefine every single join once again.
(Do you have a different opinion? Please let me know... I'd love to hear about it. Even you Venkat!)
Regardless of where you stand on the subject of mandatory aliasing... there will always be a need to use some aliasing when dealing with OLTP. To understand why, let's first take a look at our entire model for the Customer Tracking application:
The first thing to notice is the relationship between EBA_CUST_CONTACTS and EBA_CUST_CUSTOMERS. Basically... this relationship is the heart of the application. Being able to add Customers as well as Contacts, and apply relationships between them (and perhaps sever them if necessary) is one of the key capabilities of any good CRM application. But when looking at this model, how would I write a query to display all the Contacts by Customer? I seem to have two options:
SQL>; select count(*) from eba_cust_associations eca 2 join eba_cust_customers ecu on eca.customer_id = ecu.id 3 join eba_cust_contacts eco on eca.contact_id = eco.id; COUNT(*) ---------- 6 1 row selected. Elapsed: 00:00:00.02 SQL>; select count(*) from eba_cust_links ecl 2 join eba_cust_customers ecu on ecl.customer_id = ecu.id 3 join eba_cust_contacts eco on ecl.contact_id = eco.id; COUNT(*) ---------- 0 1 row selected. Elapsed: 00:00:00.03 SQL>;
To the human eye, it's perfectly clear that the initial query is the correct one. It's clear by looking at the table name EBA_CUST_ASSOCIATIONS that this table serves as the cross-reference table between Customers and Contacts. EBA_CUST_LINKS is actually a place for recording important URL's, and they can be stored against either a Contact or a Customer. In this way, the links table is "reused", and that is why it joins to both EBA_CUST_CUSTOMERS and EBA_CUST_CONTACTS.
But the BI Server has no way of knowing this. To the electronic eye, both EBA_CUST_ASSOCIATIONS and EBA_CUST_LINKS are attractive cross reference tables.
So we have to tell the BI Server how this model plugs together from an "analytics" perspective, and to do this, we'll use aliases. Once I have created the aliases for the two different "roles" that the EBA_CUST_LINKS table plays, it is very clear to the BI Server which route to take between tables.
Up next in Part 3, I'll talk about putting together logical dimension tables, and some of the design patterns to keep in mind. Also, when we discuss building logical fact tables (Part 4, perhaps?), we'll discuss aliases again, and again... I'll show you common situations where aliases are overused.