I'm just working through the Data Modelling part of my upcoming seminar, and I'm at the point where I'm thinking about logical and physical modelling within Warehouse Builder 10.2. Warehouse builder seems to mix up the concepts of logical and physical modeling, and therefore I thought it worthwhile to work through what I'm thinking, and perhaps get some input from other OWB users.
Most people's understanding of logical and physical data warehouse data modeling goes along these lines:
- Your logical warehouse model consists of entities, relationships, keys, items (or columns) and so forth, is usually concerned with the normalized view of the warehouse (usually the staging or ODS layer), and is held in either spreadsheets or a tool such as Oracle Designer. You might also annotate your model with useful information ("metadata") and perhaps business rules, although this will usually go in the functional specification.
- You might also have a logical model of the dimensional part of your warehouse, which contains elements such as dimensions, hierarchies, levels, attributes, measures and so on. It is product-neutral and again might be stored in Excel or Designer.
- The physical model, usually automatically derived from the logical model, contains the same items, perhaps denormalized, but with physical attributes - datatype, index details, partitioning details, storage clauses and so on - added on. This will certainly cover both the normalized and dimensional views of the warehouse, and it may in fact be the first point at which dimensional objects are mentioned.
Now with warehouse builder, the physical and logic models of the warehouse appear to be, to my mind, mixed-up together. One way you could look at it would be that the logical model is what's in the OWB repository, whilst the physical model is the result of the logical model's deployment - but that doesn't give you a phyiscal model as such, just a logical model and the results of it's deployment.
In fact the model that's in the OWB repository contains both logical and physical elements. It contains metadata and business rules (a.k.a. "data rules") which could certainly be considered as part of the logical model, and OLAP constructs that are held at a level of abstraction above the physical implementation, to support the ability to deploy a dimension or cube to either relational or multi-dimensional storage. And yet, what the OWB repository looks most to me like is a model of the physical implementation of the warehouse with elements of the logical model thrown in, but to work it through it's probably worth walking through some examples of how objects are represented in the repository, to see if this idea holds water.
A first example is where the physical definition of objects in the OE schema are imported into a data source model. In this case, the objects from the OE schema appear in a data source module, and they're clearly a straight representation of the physical implementation of the objects, i.e. this is a physical database model.
There's no entities, relationships and so on. It's a straight physical model.
Now, say we do some data profiling on the OE schema, carry out some corrections and define some data rules. A data rule is a logical construct in Warehouse Builder that translates into a physical check constraint on the physical table, and some instructions to OWB to wrap up the table, whenever it is mapped to, in a construct that deals with rows that fail the data rule. So, our model now contains logical information - the data rule - together with the physical table design. It also contains functional information - a mapping to implement a correction based on this data rule - and some additional tables that the mapping will use when processing the correction.
Next, we define a dimension that takes it's data from the CUSTOMERS table. Some people would say that a dimension is a physical implementation of the logical object, Customer, whilst others would argue that the dimension object is a logical object in it's own right. When we come to create the customer dimension, it sits under the database node along with tables, cubes, external tables and sequences, which would suggest that it's a physical object that's held alongside the other physical objects in what must be the physical model of the warehouse.
And yet - when you view the dimension in the Data Object Editor, you can change the deployment type for the object from relational to multi-dimensional, which suggest that the object we're dealing with is a logical representation of the dimension, with the physical implementation being hidden away internally so as to present a unified approach to the developer. I can see why this has been done - to keep away from the developer the complexities of an OLAP deployment - but it does mean that OLAP objects in this model are logical representations, not physical (although you can specify details of the physical implementation, such as the AW name and tablespace within this logical model, once you've chosen a storage method).
So my conclusion is, that OWB is a physical database modeler, with certain details abstracted out (data rules, OLAP storage) to make life "easier" for the developer. It's not really a logical modeller, although you could consider the representation of the warehouse in the OWB repository as being the logical model, although I'm sure a user of Designer would disagree.
Does anyone else have any views on this, perhaps a different perspective? Add a comment if you have.