<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Rittman Mead Consulting &#187; Stewart Bryson</title>
	<atom:link href="http://www.rittmanmead.com/author/stewart-bryson/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rittmanmead.com</link>
	<description>Delivering Oracle Business Intelligence</description>
	<lastBuildDate>Mon, 06 Feb 2012 21:18:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Interview with Kevin McGinley, BI Content Lead for Kscope 12</title>
		<link>http://www.rittmanmead.com/2012/01/kevin-mcginley-kscope12/</link>
		<comments>http://www.rittmanmead.com/2012/01/kevin-mcginley-kscope12/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 15:13:53 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[User Groups & Conferences]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=10104</guid>
		<description><![CDATA[Recently, I sat down (virtually) with Kevin McGinley of Accenture to discuss the upcoming ODTUG Kscope 12. I was on the content selection committee, and immediately recognized how lucky ODTUG was to have Kevin coordinating this process. We had tough choices to make around content &#8212; this is always the case, as I&#8217;ve participated in [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright" src="http://odtug.files.wordpress.com/2012/01/kevin_206x211.jpg?w=144&amp;h=148" alt="" width="144" height="147" /></p>
<p>Recently, I sat down (virtually) with Kevin McGinley of Accenture to discuss the upcoming ODTUG Kscope 12. I was on the content selection committee, and immediately recognized how lucky ODTUG was to have Kevin coordinating this process. We had tough choices to make around content, which is always the case, as I know from serving in this capacity before. But Kevin always took us in the right direction, and after the process was over, I knew I wanted to have a discussion with him on the blog so our readers could see what awaits them at Kscope 12.</p>
<p>Kevin recently <a title="Looking forward to BI at Kscope 12!" href="http://odtug.wordpress.com/2012/01/26/looking-forward-to-bi-at-kscope-12/" target="_blank">blogged about Kscope 12 on the ODTUG Blog</a>, so perhaps that is a nice introduction to our interview here. I&#8217;d like to thank Kevin for taking a little time to do this interview, and I&#8217;d also like to thank Accenture for allowing him to appear here.</p>
<p><strong>[Stewart Bryson]</strong> This is only your second Kscope, but you have already won the Editor&#8217;s Choice award for your whitepaper at Kscope 11, and now you are the BI content lead for Kscope 12. What do you think it is about ODTUG and Kscope that you have connected with?</p>
<p><strong>[Kevin McGinley]</strong> I was amazed by three things at Kscope 11.  First, the ODTUG community is a very warm, welcoming community of people who were very easy to engage with, both on a professional and personal level.  Second, I was pleased with the type of content presented at Kscope versus a larger conference like OpenWorld.  The sessions feel very real, the presenters are very approachable, and the level of discussion/interaction is much higher.  Lastly, I was very impressed with the level of organization at Kscope.  The conference flowed very smoothly, there were a lot of interesting activities outside the core sessions, and the entertainment was top-notch.</p>
<p><strong>[Stewart Bryson] </strong>For those folks who have never attended Kscope before, how would you describe the event, perhaps drawing comparisons or differences with other conferences?</p>
<p><strong>[Kevin McGinley]</strong> As I alluded to above, Kscope is much more communal than a larger conference like OpenWorld.  OpenWorld is a mad dash against 40,000+ strangers to get from place to place.  You are exhausted by the end of the week, and the practical knowledge you take away can be limited.  Kscope has a more manageable pace, the practical knowledge you gain from the sessions is much higher, and there is greater emphasis on interaction and discussion.</p>
<p><strong>[Stewart Bryson] </strong>Thinking specifically about the BI Stream, what would you say to Kscope Alumni about the BI Stream this year that might encourage them to give the conference another try?</p>
<p><strong>[Kevin McGinley]</strong> I would say two things to this.  First, BI keeps growing at Kscope – we have about 50% more sessions than we did last year!  This is great because you get to offer more variety in the content and you also get to balance the “intro” audiences against the “technical” audiences – satisfying both.  Second, Kscope has a tremendous EPM presence – quite possibly the biggest EPM conference around – and with BI and EPM converging the way they are, this offers attendees a tremendous opportunity to start looking at how to maximize their Oracle investments in these two areas and expand the value they provide to their businesses.</p>
<p><strong>[Stewart Bryson] </strong>What can you tell us about the content selection process? Did you have a particular focus or goal in mind when selecting and scheduling the presentations?</p>
<p><strong>[Kevin McGinley]</strong> Because OBIEE 11g was introduced before Kscope 11, it had a very strong presence that year due to the sheer magnitude of the release.  It was necessary to ensure that the ODTUG community was well informed about OBIEE 11g.  Now that OBIEE 11g has settled in the marketplace, we can explore/return to other areas like the packaged BI Applications, data integration with ODI and GoldenGate, EPM integration, more BI Publisher, and the recently announced Exalytics.  We tried to make sure we still covered relevant areas of OBIEE, but left room to cover more of the Oracle offerings around OBIEE, since it’s rarely used by itself in a vacuum.</p>
<p><strong>[Stewart Bryson] </strong>Any particular BI sessions that you are looking forward to?</p>
<p><strong>[Kevin McGinley]</strong> I see what you’re trying to do here, Stewart – you’re looking for me to plug your two presentations! In all seriousness, there are a lot of great sessions that I’m excited about.  I also love that we have a great balance between customer speakers, boutique consulting companies, large consulting companies, independents, and Oracle ACEs.  I think that’s important.  To answer your question, though, I’m really excited to hear from customers like JC Penney, Eaton Vance, General Dynamics, and Clark Construction covering topics like OBI/EPM integration, rolling out mobility to executives, and project testing strategies.</p>
<p><strong>[Stewart Bryson] </strong>Being involved with content selection can be very time-consuming. How supportive has Accenture been with your dedication to Kscope?</p>
<p><strong>[Kevin McGinley]</strong> Accenture has been great.  I think no matter where you work, you’re often pretty busy, so it helps to have an employer who is supportive with the extra time required to make sure Kscope is a great experience for everyone.  Accenture really recognizes the value of a smaller, more intimate conference like Kscope – we host a similar conference for our Oracle customers – and encourages its employees to engage in the industry community where possible.</p>
<p><strong>[Stewart Bryson] </strong>Personally, I think Kscope provides a great opportunity to step outside my usual focus on BI and see some sessions in other streams. Last year I attended sessions on Exadata, PL/SQL development, and APEX. Has anything outside of the BI stream caught your eye?</p>
<p><strong>[Kevin McGinley]</strong> The great thing about BI is that it complements other tracks nicely.  You can’t get very far in BI without a data store of some sort, so both the Database track and the Essbase track offer sessions that would be attractive to BI attendees.  Both data stores require optimization for BI to perform, and each track has very practical sessions on how to accomplish this.  I’m excited about that.  Another track I find interesting is the EPM Business Content, a new track this year.  Geared more towards a director or senior manager, this track can really help a BI person understand how EPM can fit into their environment and drive additional value.</p>
<p>As you can see, Business Intelligence is in good hands at Kscope 12. Hopefully, we&#8217;ll see you there!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2012/01/kevin-mcginley-kscope12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: ETL Iteration</title>
		<link>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/</link>
		<comments>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 04:22:31 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[BI 2.0]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9954</guid>
		<description><![CDATA[This is the fourth entry in my series on Agile Data Warehousing with Exadata and OBIEE. To see all the previous posts, check the introductory posting which I have updated with all the entries in the series. In the last post, I described what I call the Model-Driven iteration, where we take thin requirements from the [...]]]></description>
			<content:encoded><![CDATA[<p>This is the fourth entry in my series on Agile Data Warehousing with Exadata and OBIEE. To see all the previous posts, check the <a title="Agile Data Warehousing with Exadata and OBIEE: Introduction" href="http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/">introductory posting</a> which I have updated with all the entries in the series.</p>
<p>In the last post, I described what I call the Model-Driven iteration, where we take thin requirements from the end user in the form of a user story and generate the access and performance layer, or our star schema, logically using the OBIEE semantic model. Our first several iterations will likely be Model-Driven as we work with the end user to fine-tune the content he or she wants to see on the OBIEE dashboards. As user stories are opened, completed and validated throughout the project, end users are prioritizing them for the development team to work on. Eventually, there will come a time when an end user opens a story that is difficult to model in the semantic layer. Processes to correct data quality issues are a good example, and despite having the power of Exadata at our disposal, we may find ourselves in a performance hole that even the Database Machine can&#8217;t dig us out of. In these situations, we reflect on our overall solution and consider the maxim of Agile methodology: &#8220;refactoring&#8221;, or &#8220;rework&#8221;.</p>
<p>For Extreme BI, the main form of refactoring is ETL. The pessimist might say: &#8220;Well, now we have to do ETL development, what a waste of time all that RPD modeling was.&#8221; But is that the case? First off&#8230; think about our users. They have been running dashboards for some time now with at least a portion of the content they need to get their jobs done. As the die-hard Agile proponent will tell you&#8230; some is better than none. But also&#8230; the process of doing the Model-Driven iteration puts our data modelers and our ETL developers in a favorable position. We&#8217;ve eliminated the exhaustive data modeling process, because we already have our logical model in the Business Model and Mapping layer (BMM).</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Full-Logical-Model.png"><img class="alignnone size-large wp-image-9976" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Full-Logical-Model-1024x559.png" alt="" width="614" height="335" /></a></p>
<p>But we have more than that. We also have our source-to-target information documented in the semantic metadata layer. We can see that information using the Admin Tool, as depicted below, or we can also use the &#8220;Repository Documentation&#8221; option to generate some documented source-to-target mappings.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png"><img class="size-full wp-image-9883  alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png" alt="" width="671" height="219" /></a></p>
<p>When embarking on ETL development, it&#8217;s common to do SQL prototyping before starting the actual mappings to make sure we understand the particulars of granularity. However, we already have these SQL prototypes in the nqquery.log file&#8230; all we have to do is look at it. The combination of the source-to-target mapping and the SQL prototypes provides all the artifacts necessary to get started with the ETL.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Query-Log.png"><img class="alignnone size-large wp-image-9982" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Query-Log-1024x598.png" alt="" width="645" height="377" /></a></p>
<p>When using ETL processing to &#8220;instantiate&#8221; our logical model into the physical world, we can&#8217;t abandon our Agile imperatives: we must still deliver the new content, and corresponding rework, within a single iteration. So whether the end user is opening the user story because the data quality is abysmal, or because the performance is just not good enough, we must vow to deliver the ETL Iteration time-boxed, in exactly the same manner that we delivered the Model-Driven Iteration. So, if we imagine that our user opens a story about data quality in our Customer and Product dimensions, and we decide that all we have time for in this iteration are those two dimension tables, does it make sense for us to deliver those items in a vacuum? With the image below depicting the process flow for an entire subject area, can we deliver it piecemeal instead of all at once?</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Piecemeal-Process-Flow.png"><img class="alignnone size-full wp-image-9968" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Piecemeal-Process-Flow.png" alt="" width="636" height="348" /></a></p>
<p>The answer, of course, is that we can. We&#8217;ll develop the model and ETL exactly as we would if our goal was to plug the dimensions into a complete subject area. We use surrogate keys as the primary key for each dimension table, facilitating joining our dimension tables to completed fact tables. But we don&#8217;t have completed fact tables at this point in our project&#8230; instead we have a series of transaction tables that work together to form the basis of a logical fact table. How can we use a dimension table with a surrogate key to join to our transactional &#8220;fact&#8221; table that doesn&#8217;t yet have these surrogate keys?</p>
<p>We fake it. Along with surrogate keys, the long-standing best practice of dimension table delivery has been to include the source system natural key, as well as effective dates, in all our dimension tables. These attributes are usually included to facilitate slowly-changing dimension (SCD) processing, but we&#8217;ll exploit them for our Agile piecemeal approach as well. In our example, we have a properly formed Customer dimension that we want to join to our logical fact table, as depicted below:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Partial-Hybrid-Model-e1327470743307.png"><img class="alignnone size-full wp-image-9995" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Partial-Hybrid-Model-e1327470743307.png" alt="" width="596" height="200" /></a></p>
<p>We start by creating aliases to our transactional &#8220;fact&#8221; tables (called POS_TRANS_HYBRID and POS_TRANS_HEADER_HYBRID in the example above), because we don&#8217;t want to upset the logical table source (LTS) that we are already using for the pure transactional version of the logical fact table. We create a complex join between the customer source system natural key and transaction date in our hybrid alias, and the natural key and effective dates in the dimension table. We use the effective dates as well to make sure we grab the correct version of the customer entity in question in situations where we have enabled Type 2 SCDs (the usual standard) in our dimension table.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline.png"><img class="alignnone size-large wp-image-10007" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-1024x869.png" alt="" width="574" height="486" /></a></p>
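<p>To make the shape of that complex join concrete, here is a minimal sketch using Python and an in-memory SQLite database as a stand-in for Oracle. It is not the SQL the BI Server generates: the table layouts, column names (customer_id, effective_start, effective_end and so on) and sample rows are all assumptions made for illustration, and the effective date range is assumed to be inclusive at the start and exclusive at the end.</p>

```python
import sqlite3

# Hypothetical tables, columns and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (
    customer_key    INTEGER PRIMARY KEY,  -- surrogate key
    customer_id     TEXT,                 -- source system natural key
    customer_name   TEXT,
    effective_start TEXT,                 -- inclusive
    effective_end   TEXT                  -- exclusive
);
CREATE TABLE pos_trans_hybrid (
    trans_id    INTEGER PRIMARY KEY,
    customer_id TEXT,                     -- natural key only, no surrogate key
    trans_date  TEXT,
    amount      REAL
);
-- Two Type 2 versions of the same customer.
INSERT INTO customer_dim VALUES
    (1, 'C100', 'Acme Ltd',  '2011-01-01', '2011-07-01'),
    (2, 'C100', 'Acme Corp', '2011-07-01', '9999-12-31');
INSERT INTO pos_trans_hybrid VALUES
    (10, 'C100', '2011-03-15', 50.0),
    (11, 'C100', '2011-09-02', 75.0);
""")

# Join on the natural key, then use the transaction date to pick the
# dimension version that was in effect when the transaction happened.
HYBRID_JOIN = """
SELECT t.trans_id, d.customer_key, d.customer_name, t.amount
FROM pos_trans_hybrid t
JOIN customer_dim d
  ON d.customer_id = t.customer_id
 AND t.trans_date >= d.effective_start
 AND d.effective_end > t.trans_date
ORDER BY t.trans_id
"""

hybrid_rows = conn.execute(HYBRID_JOIN).fetchall()
for row in hybrid_rows:
    print(row)
```

<p>Each transaction picks up exactly one version of the customer, even though the transactional table never stores a surrogate key.</p>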
<p>This complex logic of using the natural key and effective dates is identical to the logic we would use in what Ralph Kimball calls the &#8220;surrogate pipeline&#8221;: the ETL processing used to replace natural keys with surrogate keys when loading a proper fact table. Using Customer and Sales attributes in an analysis, we can see the actual SQL that&#8217;s generated:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-SQL.png"><img class="alignnone size-large wp-image-10025" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-SQL-1024x510.png" alt="" width="645" height="321" /></a></p>
<p>We can view this hybrid approach as an intermediate step, but there is also nothing wrong with this as a long-term approach if the users are happy and Exadata makes our queries scream. If you think about it&#8230; a surrogate key is an easy way of representing the natural key of the table, which is the source system natural key plus the unique effective dates for the entity. A surrogate key makes this relationship much easier to envision, and certainly easier to code in SQL, but when we are insulated from the ugliness of the join with Extreme Metadata, do we really care? If our end users ever open a story asking for rework of the fact table, we may consider manifesting that table physically as well. Once complete, we would need to create another LTS for the Customer dimension (using an alias to keep it separate from the table that joins to the transactional tables). This alias would be configured to join directly to the new Sales fact table across the surrogate key&#8230; exactly how we would expect a traditional data warehouse to be modeled in the BMM. The physical model will look nearly identical to our logical model, and the generated SQL will be less interesting:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Fact-LTS.png"><img class="alignnone size-full wp-image-10033" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Fact-LTS.png" alt="" width="221" height="226" /></a></p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Star-Schema-SQL.png"><img class="alignnone size-large wp-image-10029" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Star-Schema-SQL-1024x420.png" alt="" width="645" height="265" /></a></p>
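<p>As a rough illustration of what manifesting the fact table physically involves, the sketch below runs a tiny surrogate pipeline in Python against an in-memory SQLite database (a stand-in for Oracle). The natural key and effective dates are resolved once, at load time, so that later queries join on the surrogate key alone. Every table name, column name and data value here is hypothetical, and a real ETL mapping would also need to handle late-arriving and unmatched rows.</p>

```python
import sqlite3

# All table and column names here are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (
    customer_key    INTEGER PRIMARY KEY,  -- surrogate key
    customer_id     TEXT,                 -- source system natural key
    customer_name   TEXT,
    effective_start TEXT,                 -- inclusive
    effective_end   TEXT                  -- exclusive
);
CREATE TABLE pos_trans (
    trans_id    INTEGER PRIMARY KEY,
    customer_id TEXT,
    trans_date  TEXT,
    amount      REAL
);
CREATE TABLE sales_fact (
    trans_id     INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES customer_dim (customer_key),
    amount       REAL
);
INSERT INTO customer_dim VALUES
    (1, 'C100', 'Acme Ltd',  '2011-01-01', '2011-07-01'),
    (2, 'C100', 'Acme Corp', '2011-07-01', '9999-12-31');
INSERT INTO pos_trans VALUES
    (10, 'C100', '2011-03-15', 50.0),
    (11, 'C100', '2011-09-02', 75.0),
    (12, 'C100', '2011-10-10', 25.0);

-- Surrogate pipeline: swap the natural key for the surrogate key of the
-- dimension version in effect on the transaction date.
INSERT INTO sales_fact (trans_id, customer_key, amount)
SELECT t.trans_id, d.customer_key, t.amount
FROM pos_trans t
JOIN customer_dim d
  ON d.customer_id = t.customer_id
 AND t.trans_date >= d.effective_start
 AND d.effective_end > t.trans_date;
""")

# After the load, reporting queries only need a simple equi-join.
STAR_QUERY = """
SELECT d.customer_name, SUM(f.amount)
FROM sales_fact f
JOIN customer_dim d ON d.customer_key = f.customer_key
GROUP BY d.customer_name
ORDER BY d.customer_name
"""

star_rows = conn.execute(STAR_QUERY).fetchall()
print(star_rows)
```

<p>The date-range logic has not gone away; it has moved from query time into the load, which is the trade-off this iteration makes.</p>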
<p>Now that I&#8217;ve described the Model-Driven and ETL Iterations, it&#8217;s time to discuss what I call the Combined Iteration, which is likely what most of the iterations will look like when the project has achieved some maturity. In Combined Iterations, we work on adding new or refactored RPD content alongside new or refactored ETL content in the same iteration. Now the project really makes sense to the end user. We allow the user community&#8211;those who are actually consuming the content&#8211;to dictate to the developers with user stories what they want the developers to work on in the next iteration. The users will constantly open new stories, some asking for new content, and others requesting modifications to existing content. All Agile methodologies put the burden of prioritizing user stories squarely on the shoulders of the user community. Why should IT dictate to the user community where priorities lie? If we have delivered fabulous content sourced with the Model-Driven paradigm, and Exadata provides the performance necessary to make this &#8220;real&#8221; content, then there is no reason for the implementors to dictate to the users the need to manifest that model physically with ETL when they haven&#8217;t asked for it. If whole portions of our data warehouse are never implemented physically with ETL&#8230; do we care? The users are happy with what they have, and they think performance is fine&#8230; do we still force a &#8220;best practice&#8221; of a physical star schema on users who clearly don&#8217;t want it?</p>
<p>So that&#8217;s it for the Extreme BI methodology. At the outset of this series&#8230; I thought it would require five blog posts to make the case, but I was able to do it in four instead. So even when delivering blog posts, I can&#8217;t help but rework as I go along. Long live Agile!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration</title>
		<link>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-model-driven/</link>
		<comments>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-model-driven/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 05:32:10 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[BI 2.0]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9825</guid>
		<description><![CDATA[After laying the groundwork with an introduction, and following up with a high-level description of the required puzzle pieces, it&#8217;s time to get down to business and describe how Extreme BI works. At Rittman Mead, we have several projects delivering with this methodology right now, and more in the pipeline. I&#8217;ll gradually introduce the different types of [...]]]></description>
			<content:encoded><![CDATA[<p>After laying the groundwork with an <a title="Agile Data Warehousing with Exadata and OBIEE: Introduction" href="http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/" target="_blank">introduction</a>, and following up with a high-level description of the required <a title="Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces" href="http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/" target="_blank">puzzle pieces</a>, it&#8217;s time to get down to business and describe how Extreme BI works. At Rittman Mead, we have several projects delivering with this methodology right now, and more in the pipeline.</p>
<p>I&#8217;ll gradually introduce the different types of generic iterations that we engage in, focusing on what I call the &#8220;model-driven&#8221; iteration for this post. Our first few iterations are always model-driven. We begin when a user opens a user story requesting new content. For any request for new content, we require that all the following elements are included in the story:</p>
<ol>
<li>A narrative about the data they are looking for, and how they want to see it. We are not looking for requirements documents here, but we are looking for the user to give a complete picture of what it is that they need.</li>
<li>An indication of how they report on this content today. In a new data warehouse environment, this would include some sort of report that they are currently running against the source system, and in a perfect world, this would involve the SQL that is used to pull that report.</li>
<li>An indication of data sets that are &#8220;nice to haves&#8221;. This might include data that isn&#8217;t available to them in the current paradigm of the report, or was simply too complicated to pull in that paradigm. After an initial inspection of these nice-to-haves and the complexity involved with including them in this story, the project manager may decide to pull these elements out and put them in a separate user story. This, of course, depends on the Agile methodology used, and the individual implementation of that methodology.</li>
</ol>
<p>First we assign the story to an RPD developer, who uses the modeling capabilities in the OBIEE Admin Tool to &#8220;discover&#8221; the logical dimensional model tucked inside the user story, and develop that logical model inside the Business Model and Mapping (BMM) layer. Unlike a &#8220;pure&#8221; dimensional modeling exercise where we focus only on user requirements and pay very little attention to source systems, in model-driven development, we constantly shift between the source of the data, and how best the user story can be solved dimensionally. Instead of working directly against the source system though, we are working against the foundation layer in the Oracle Next-Generation Reference Data Warehouse Architecture. We work from a top-down approach, first creating empty facts and dimensions in the BMM, and mapping them to the foundation layer tables in the physical layer.</p>
<p>To take a simple example, we can see how a series of foundation layer tables developed in 3NF could be mapped to a logical dimension table as our Customer dimension:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Dimension-Join.png"><img class="size-full wp-image-9893 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Dimension-Join.png" alt="Model-Driven Development of Dimension Table" width="425" height="208" /></a></p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png"><img class="size-full wp-image-9883 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png" alt="" width="671" height="219" /></a></p>
<p>I rearranged the layout from the Admin Tool to provide an &#8220;ETL-friendly&#8221; view of the mapping. All the way to the right, we can see the logical, dimensional version of our Customer table, and how it maps back to the source tables. This mapping could be quite complicated, with perhaps dozens of tables. The important thing to keep in mind is that this complexity is hidden not only from the consumer of the reports, but also from the developers. We can generate a similar example of what our Sales fact table would look like:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Fact-Join.png"><img class="size-full wp-image-9896 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Fact-Join.png" alt="" width="426" height="209" /></a></p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Fact.png"><img class="size-full wp-image-9889 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Fact.png" alt="" width="664" height="276" /></a></p>
<p>Another way of making the same point is to look at the complex, transaction model:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Physical-Model-Annotated.png"><img class="size-full wp-image-9904 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Physical-Model-Annotated.png" alt="" width="441" height="311" /></a></p>
<p>We can then compare this to the simplified, dimensional model:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Logical-Model-Annotated.png"><img class="size-full wp-image-9905 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Logical-Model-Annotated.png" alt="" width="409" height="260" /></a></p>
<p>And finally, when we view the subject area during development of an analysis, all we see are facts and dimensions. The front-end developer can be blissfully ignorant that he or she is developing against a complex transactional schema, because all that is visible is the abstracted logical model:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Astracted-View-for-Developer.png"><img class="alignnone size-full wp-image-9915" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Astracted-View-for-Developer.png" alt="" width="741" height="395" /></a></p>
<p>When mapping the BMM to complex 3NF schemas, the BI Server is very, very smart, and understands how to do more with less. Using the metadata capabilities of OBIEE is superior to other metadata products, or to a &#8220;roll-your-own metadata&#8221; approach using database views, because of the following:</p>
<ol>
<li>The generated SQL usually won&#8217;t involve self-joins, even when tables exist in both the logical fact table and the logical dimension table.</li>
<li>The BI Server will only include tables that are required to facilitate the intelligent request, either because a table has columns mapped to the attributes being requested, or because it is a required reference table to bring disparate tables together. Any tables not required to facilitate the request will be excluded.</li>
</ol>
<p>Since the entire user story needs to be closed in a single iteration, the user who opened the story needs to be able to see the actual content. This means that the development of the analyses (or report) and the dashboard are also required to complete the story. It&#8217;s important to get something in front of the end user immediately, but it doesn&#8217;t have to be perfect. We should focus on a clear, concise analyses in the first iteration, so it&#8217;s easy for the end user to verify that the data is correct. In future iterations, we can deliver high-impact, eye-catching dashboards. Equally important to closing the story is being able to prove that it&#8217;s complete. In Agile methodologies, this is usually referred to as the &#8220;Validation Step&#8221; or &#8220;Showcase&#8221;. Since we have already produced the content, then it&#8217;s easy to prove to the user that the story is complete. But suppose that we believed we couldn&#8217;t deliver new content in a single iteration. That would imply that we would have an iteration during our project that didn&#8217;t include actual end-user content. How would you go about validating or showcasing that content? How would we go about showcasing a completed ETL mapping, for instance, if we haven&#8217;t delivered any content to consume it?</p>
<p>What we have at the end of the iteration is a completely abstracted view of our model: a complex, transactional, 3NF schema presented as a star schema. We are able to deliver portions of a subject area, which is important for time-boxed iterations. The Extreme Metadata of OBIEE 11g allows us to remove this complexity in a single iteration, but it&#8217;s the performance of the Exadata Database Machine that allows us to build real analyses and dashboards and present it to the general user community.</p>
<p>In the next post, we&#8217;ll examine the ETL Iteration, and explore how we can gradually manifest our logical business model into a physical model over time. As you will see, the ETL iteration is an optional one&#8230; it will be absolutely necessary in some environments, and completely superfluous in others.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-model-driven/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces</title>
		<link>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/</link>
		<comments>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/#comments</comments>
		<pubDate>Wed, 28 Dec 2011 19:39:56 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9637</guid>
		<description><![CDATA[In the previous post, I laid the groundwork for describing Extreme BI: a combination of Exadata and OBIEE delivered with an Agile spirit. I discussed that the usual approach to Agile data warehousing is not Agile at all due to the violation of its main principle: working software delivered iteratively. If you haven&#8217;t already deduced [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a title="Agile Data Warehousing with Exadata and OBIEE: Introduction" href="http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/" target="_blank">previous post</a>, I laid the groundwork for describing Extreme BI: a combination of Exadata and OBIEE delivered with an Agile spirit. I discussed that the usual approach to Agile data warehousing is not Agile at all due to the violation of its main principle: working software delivered iteratively.</p>
<p>If you haven&#8217;t already deduced it from my first post (or seen me speak on this topic), what I am recommending is bypassing, either temporarily or permanently, the inhibitors specific to data warehousing projects which limit our ability to deliver working software quickly. Specifically, I&#8217;m recommending that we wait to build and populate physical star schemas until a later phase, if at all. Remember the two reasons that we build dimensional models: model simplicity and performance. With our Extreme BI solution, we have tools to counter both of those reasons. We have OBIEE 11g, with a rich metadata layer that presents our underlying data model, even if it is transactional, as a star schema to the end user. This removes our dependency on a simple physical model to provide a simple logical model to end users. We also have Exadata, which delivers world-class performance against any type of model, and can make up for the performance advantage we would otherwise get from star schemas. With these tools at our disposal, we can postpone the long process of building dimensional models, at least for the first few iterations. This is the only way to get working software in front of the end user in a single iteration, and, as I will argue, this is the best way to collaborate with an end user and deliver the content they are expecting.</p>
<p>Of the puzzle pieces we need to deliver this model, the first is the <a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/058925.pdf" target="_blank">Oracle Next-Generation Reference DW Architecture</a> (we need an acronym for that), which Mark has already written about in-depth <a title="Drilling Down in the Oracle Next-Generation Reference DW Architecture" href="http://www.rittmanmead.com/2009/07/drilling-down-in-the-oracle-next-generation-reference-dw-architecture/" target="_blank">here</a>. As you browse through this post, pay special attention to his formulation of the foundation layer, which is the most important layer for delivering Extreme BI.</p>
<div id="attachment_9672" class="wp-caption aligncenter" style="width: 673px"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/next-gen.png"><img class="size-large wp-image-9672    " src="http://www.rittmanmead.com/wp-content/uploads/2011/12/next-gen-1024x627.png" alt="" width="663" height="407" /></a><p class="wp-caption-text">Oracle Next-Generation Reference DW Architecture</p></div>
<h2>Foundation Layer</h2>
<p>This is our &#8220;process-neutral&#8221; layer, which means simply that it isn&#8217;t imbued with requirements about what users want and how they want it. Instead, the foundation layer has one job and one job only: tracking what happened in our source systems. Typically, the foundation layer logical model looks identical to the source systems, except that we have a few additional metadata columns on each record such as commit timestamps and Oracle Database system change numbers (SCNs). There are other, more complex solutions for modeling the foundation layer when the 3NF from the source system or systems is not sufficient, such as <a title="Data Vault Modeling" href="http://en.wikipedia.org/wiki/Data_Vault_Modeling" target="_blank">data vault</a>. Our foundation layer is generally &#8220;insert-only&#8221;, meaning we track all history so that we are insulated from changing user requirements in the near and distant futures.</p>
<p><strong>UPDATE: </strong> Kent Graziano, a major data vault evangelist, has started <a title="Oracle Data Warrior" href="http://kentgraziano.com/" target="_blank">blogging</a>. Perhaps with some pressure from the public, we could &#8220;encourage&#8221; him to blog on what data vault would look like in a standard foundation layer.</p>
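<p>To make the insert-only idea concrete, here is a minimal sketch of what one foundation layer table might look like. Everything in it is an assumption for illustration: the ORDERS_FND table, its source columns, and the SRC_OPERATION column are hypothetical, and only the commit timestamp and SCN columns come from the description above.</p>
<pre>-- Hypothetical foundation table: source columns plus metadata columns
CREATE TABLE orders_fnd
( order_id        NUMBER        NOT NULL  -- source columns, unchanged
, customer_id     NUMBER
, order_status    VARCHAR2(20)
, order_total     NUMBER(12,2)
, src_operation   VARCHAR2(1)   NOT NULL  -- assumed: I, U or D in the source
, src_commit_ts   TIMESTAMP     NOT NULL  -- source commit timestamp
, src_commit_scn  NUMBER        NOT NULL  -- source system change number
);

-- Insert-only: an update in the source arrives as a new row, not an UPDATE
INSERT INTO orders_fnd VALUES
  (1001, 42, 'OPEN', 250.00, 'I', TIMESTAMP '2011-12-01 09:15:00', 8675309);
INSERT INTO orders_fnd VALUES
  (1001, 42, 'SHIPPED', 250.00, 'U', TIMESTAMP '2011-12-02 14:30:00', 8679411);

-- One way to see the current state: latest version of each source key
SELECT order_id, customer_id, order_status, order_total
  FROM ( SELECT o.*,
                ROW_NUMBER() OVER ( PARTITION BY order_id
                                    ORDER BY src_commit_scn DESC ) rn
           FROM orders_fnd o )
 WHERE rn = 1
   AND src_operation != 'D';</pre>
<p>Because every version is kept, the same table can answer &#8220;what does this order look like now&#8221; and &#8220;what did it look like last week&#8221; without any change to the load process.</p>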
<h2>Capturing Change</h2>
<p>Also required for delivering Extreme BI is a process for capturing change from the source systems and rapidly applying it to the foundation layer, which I described briefly in one of my posts on <a title="Real-time BI: An Introduction" href="http://www.rittmanmead.com/2011/05/real-time-bi-an-introduction/" target="_blank">real-time data warehousing</a>. We have a bit of a tug-of-war at this point between Oracle Streams and Oracle GoldenGate. GoldenGate is the stated platform of the future because it’s a simple, flexible, powerful and resilient replication technology. However, it does not yet have powerful change data capture functionality specific to data warehouses, such as easy subscriptions to raw changed data, or support for multiple subscription groups. You can, in general, work around these limitations using the INSERTALLRECORDS parameter and some custom code (perhaps fodder for a future blog post). Regardless of the technology, Extreme BI requires a process for capturing and applying source system changes quickly and efficiently to the foundation layer on the Exadata Database Machine.</p>
<h2>Extreme Performance</h2>
<p>Although I&#8217;ll drill into more detail in the next post, the reason we need Extreme Performance is to compensate for the performance gains we usually get from star schemas, since we won&#8217;t be building those, at least not in the initial iterations. Although Rittman Mead has deployed a variant of this methodology sans Exadata using a powerful Oracle Database RAC instead, there is no substitute for Exadata. The hardware on the Database Machine is superb, but it&#8217;s really the software that is a game-changer. The most extraordinary features include <a title="Smart Scans Meet Storage Indexes" href="http://www.oracle.com/technetwork/issue-archive/2011/11-may/o31exadata-354069.html" target="_blank">smart scan and storage indexes</a>, as well as hybrid columnar compression, which Mark talks about <a title="Hybrid Columnar Compression in Oracle Exadata v2" href="http://www.rittmanmead.com/2010/01/hybrid-columnar-compression-in-oracle-exadata-v2/" target="_blank">here</a>, referencing an article by Arup Nanda found <a title="Compressing Columns" href="http://www.oracle.com/technetwork/issue-archive/2010/10-jan/o10compression-082302.html" target="_blank">here</a>. For years now, with standard Oracle data warehouses, we&#8217;ve pushed the architecture to its limits trying to reduce IO contention at the cost of CPU utilization, using database features such as partitioning, parallel query and basic block compression. But Exadata Storage can eliminate the IO boogeyman using combinations of these standard features plus the Exadata-only features to bring query performance against 3NF schemas on par with traditional star schemas and beyond.</p>
<p style="text-align: center"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/Terabytes-to-Gigabytes.png"><img class="aligncenter size-full wp-image-9739" src="http://www.rittmanmead.com/wp-content/uploads/2011/12/Terabytes-to-Gigabytes.png" alt="" width="617" height="352" /></a></p>
<h2>Extreme Metadata</h2>
<p>Extreme performance is only half the battle&#8230; we also need Extreme Metadata to provide us the proper level of abstraction so that report and dashboard developers still have a simple model to report against. This is what OBIEE 11g brings to the table. We have also delivered a variant of this methodology without OBIEE, using Cognos instead, which has a metadata layer called <a title="Framework Manager" href="http://www.ironsidegroup.com/2010/07/08/best-practices-in-cognos-8-framework-manager-model-design/" target="_blank">Framework Manager</a>. As with Exadata, the BI Server has no equal in the metadata department, so my advice&#8230; don&#8217;t substitute ingredients.</p>
<p>Consider, for a moment, the evolution of dimensional modeling in deploying a data warehouse. Not too long ago, we had to solve most data warehousing issues with the logical model because BI tools were simplistic. Generally&#8230; there was no abstraction of the physical into the logical, unless you categorize the renaming of columns as abstraction. As these tools evolved, we often found ourselves with a choice: solve some user need in the logical model, or solve it with the feature set of the BI tool. The use of aggregation in data warehousing is a perfect example of this evolution. Designing aggregate tables used to be just another part of the logical modeling exercise, and the aggregates were generally represented in the published data model for the EDW. But now, building aggregates is more of a technical implementation than a logical one, as either the BI Server or the Oracle Database can handle the transparent navigation to aggregate tables.</p>
<p>The metadata that OBIEE provides adds two necessary features for Agile delivery. First, we are able to report against complex transactional schemas, but still expose those schemas as simplified dimensional models. This allows us to bypass the complex ETL process, at least initially, so that we can get new subject areas into the users&#8217; hands in a single iteration. Second, OBIEE&#8217;s capability to map multiple Logical Table Sources (LTSs) for the same logical table makes it easy to modify, or &#8220;remap&#8221;, the source of our logical tables over time. So, in later iterations, if we decide that it&#8217;s necessary to embark upon complex ETL processes to complete user stories, we can do this in the metadata layer without affecting our reports and dashboards, or changing the logical model that report developers are used to seeing.</p>
<div id="attachment_9754" class="wp-caption aligncenter" style="width: 612px"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/semantic-model.031.png"><img class="size-full wp-image-9754 " src="http://www.rittmanmead.com/wp-content/uploads/2011/12/semantic-model.031.png" alt="" width="602" height="378" /></a><p class="wp-caption-text">Flow of Data Through the Three-Layer Semantic Model</p></div>
<h2>More to Come&#8230;</h2>
<p>In the next post, I&#8217;ll describe what I call the Model-Driven Iteration, where we use OBIEE against the foundation layer to expose new subject areas in a single iteration. After that, I&#8217;ll describe ETL Iterations, where we transform a portion of our model iteratively using ETL tools such as ODI, OWB or Informatica. Finally, I&#8217;ll describe what I call Combined Iterations, where both Model-Driven activity and ETL activity are going on at the same time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: Introduction</title>
		<link>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/</link>
		<comments>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/#comments</comments>
		<pubDate>Wed, 21 Dec 2011 15:48:55 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9597</guid>
		<description><![CDATA[Over the last year, I&#8217;ve been speaking at conferences on one subject more than any others: Agile Data Warehousing with Exadata and OBIEE. Although I&#8217;ve been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last year, I&#8217;ve been speaking at conferences on one subject more than any others: Agile Data Warehousing with Exadata and OBIEE. Although I&#8217;ve been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take up. So I&#8217;ll use the next few blog posts to make my case for what I like to call Extreme BI: an Agile approach to data warehousing using the combination of Extreme Performance and Extreme Metadata.</p>
<p>In a standard data warehouse implementation, whether we are walking in the Inmon or Kimball camps, some portion of our data model will be dimensional in nature; a star schema with facts and dimensions. So let me pose a question, which I think will lend itself well to diving into the Extreme BI discussion: Why do we build dimensional models? The first reason is simplicity. We want to model our reporting structures in a way that makes sense to the business user. The standard OLTP data model that takes two of the four walls in the conference room to display is just never going to make sense to your average business user. At the end of a logical modeling exercise, I expect the end-user to have a look at a completed dimensional model and say: &#8220;Yep&#8230; that&#8217;s our business alright&#8221;. The second reason we build dimensional models is for performance. Denormalizing highly complex transactional models into simplified star schemas generally produces tremendous performance gains.</p>
<p>So my follow-up question: can the combination of Exadata and OBIEE, or Extreme BI, <em>actually change the way we deliver projects? </em>We&#8217;ve all seen the Exadata performance numbers that Oracle publishes, and I can tell you first hand the performance is impressive. Can this Extreme Performance combined with the Extreme Metadata that OBIEE provides give us a more compelling case for delivering data warehouses using Agile methodologies?</p>
<p>To start with, I&#8217;d like to paint a picture of what the typical waterfall data warehousing project looks like. The tasks we usually have to complete, in order, are the following:</p>
<ol>
<li>User interviews</li>
<li>Construct requirement documents</li>
<li>Create logical data model</li>
<li>SQL prototyping of source transactional models</li>
<li>Document source-to-target mappings</li>
<li>ETL development</li>
<li>Front-end development (analyses and dashboards)</li>
<li>Performance tuning</li>
</ol>
<p>Raise your hand if this looks familiar. We would have to go through all these steps, which could take months, before end users can see the fruits of our labor. To mitigate this scenario, organizations will attempt to deliver data warehouses using &#8220;Agile&#8221; methodologies. What this usually means, from my experience, is a simple repackaging of the same waterfall project plan into &#8220;iterations&#8221; or &#8220;sprints&#8221;, so that the project can be delivered iteratively. So the process might look like the following:</p>
<ol>
<li>Iteration 1: Interviews and user requirements</li>
<li>Iteration 2: Logical modeling</li>
<li>Iteration 3: ETL Development</li>
<li>Iteration 4: Front-end development</li>
</ol>
<p>But this, ladies and gentlemen, is not Agile. To get an understanding of what lies at the heart of Agile development, we need to look no further than the <a title="The Agile Manifesto" href="http://agilemanifesto.org/" target="_blank">Agile Manifesto</a>, or the history of the <a title="The Agile Movement" href="http://en.wikipedia.org/wiki/Agile_software_development" target="_blank">Agile Movement</a>. When examining the different methodologies, there is one major theme that permeates all of them: working software delivered iteratively. It&#8217;s not enough to simply deliver the same old waterfall methodology in &#8220;sprints&#8221; or &#8220;iterations&#8221;, because, at the end of those iterations, we don&#8217;t have any working software&#8230; software that end users can actually use to improve their job or help them make better decisions. In the example above, we still require four iterations before we get any usable content. It doesn&#8217;t matter if we&#8217;ve written some complex ETL to load a fact table if the end user doesn&#8217;t have a working dashboard to go along with it.</p>
<p>To apply the Agile Manifesto to data warehouse delivery, it&#8217;s the following key elements that are required for us to deliver with a true Agile spirit:</p>
<ol>
<li>User stories instead of requirements documents: a user asks for particular content through a narrative process, and includes in that story whatever process they currently use to generate that content.</li>
<li>Time-boxed iterations: iterations always have a standard length, and we choose one or more user stories to complete in that iteration.</li>
<li>Rework is part of the game: there aren&#8217;t any missed requirements&#8230; only those that haven&#8217;t been addressed yet.</li>
</ol>
<p>I&#8217;ve been conscious not to prescribe any distinct Agile methodology, though I can&#8217;t help using more Scrum-like concepts in this formulation. However, I think this list is generic enough to apply to most methodologies. Over the next few posts, I&#8217;ll discuss the necessary puzzle pieces to engage in Extreme BI, as well as how we might implement new subject area content in a single iteration. Additionally, I&#8217;ll discuss how these implementations might be reworked, or &#8220;refactored&#8221;, over several iterations to produce data warehouses that respond to user stories: what users want and when they want it.</p>
<p><strong>Follow-up Posts</strong></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces" href="http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/" target="_blank">Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces</a></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration" href="http://www.rittmanmead.com/2012/01/agile-exadata-obiee-model-driven/">Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration</a></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: ETL Iteration" href="http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/" target="_blank">Agile Data Warehousing with Exadata and OBIEE: ETL Iteration</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Oracle OLAP 11gR2 and Single-line Indexed Attributes</title>
		<link>http://www.rittmanmead.com/2011/07/oracle-olap-11gr2-and-single-line-indexed-attributes/</link>
		<comments>http://www.rittmanmead.com/2011/07/oracle-olap-11gr2-and-single-line-indexed-attributes/#comments</comments>
		<pubDate>Wed, 20 Jul 2011 20:48:47 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Oracle OLAP]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=8661</guid>
		<description><![CDATA[Just a quick post today to demonstrate an issue I ran into with an Oracle OLAP 11gR2 dimension today. I was maintaining the dimension when I encountered the following error: I found Oracle support document 1258925.1, which has a handle on the problem. It describes a deficiency OLAP has with indexing attributes that contain newline characters. [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick post today to demonstrate an issue I ran into with an Oracle OLAP 11gR2 dimension today. I was maintaining the dimension when I encountered the following error:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2011/07/single-level-dimensions.png"><img class="alignnone size-full wp-image-8662" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/single-level-dimensions.png" alt="" width="432" height="198" /></a></p>
<p>I found Oracle support document <a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;doctype=PROBLEM&amp;id=1258925.1">1258925.1</a>, which has a handle on the problem. It describes a deficiency OLAP has with indexing attributes that contain newline characters. The note references version 11.2.0.1 of Analytic Workspace Manager (AWM), where indexing of all attributes is a binary choice. In 11.2.0.2 of AWM, however, we can choose to index on an attribute-by-attribute basis, as seen in the attribute details pane:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2011/07/indexed-attribute.png"><img class="alignnone size-full wp-image-8663" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/indexed-attribute.png" alt="" width="402" height="321" /></a></p>
<p>So here&#8217;s what I know: I have an indexed attribute with newline characters. ETL processing and data quality notwithstanding (that seems like a strange thing to survive an ETL process), I need to find out which attribute to un-index. Problem is&#8230; there is nothing in the maintenance logs that tells me which attribute is the problem. (By the way: if this information exists and you know where it is, then please comment and I&#8217;ll update the blog post.)</p>
<p>So I wrote this little piece of PL/SQL that did the trick for me, and I wanted to share it. CHR(10) is the construct we use in PL/SQL to denote a newline character, so I construct a query against each column to determine whether it contains any:</p>
<pre>SQL&gt; DECLARE
  2     l_table      VARCHAR2(30)    := 'DIM_FUND';
  3     l_results    NUMBER;
  4  BEGIN
  5     FOR x IN ( select 'select count(*) from '
  6                       ||l_table
  7                       ||' where regexp_like('
  8                       ||column_name
  9                       ||', chr(10))' stmt,
 10                      column_name
 11                  from dba_tab_columns
 12                 where table_name=l_table
 13              )
 14     LOOP
 15  --      dbms_output.put_line( x.stmt );
 16
 17        EXECUTE IMMEDIATE x.stmt
 18        INTO l_results;
 19
 20        IF l_results &gt; 0
 21        THEN
 22           dbms_output.put_line
 23           ( x.column_name
 24             ||': '
 25             || l_results
 26           );
 27
 28        END IF;
 29
 30     END LOOP;
 31
 32  END;
 33  /
FUND_FIN_MANAGER_TITLE: 3

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.82
SQL&gt;</pre>
<p>This did the trick! I un-indexed the attribute in AWM and the dimension maintenance procedure was successful.</p>
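<p>One caveat with the block above: it queries DBA_TAB_COLUMNS without an owner filter and checks every column regardless of data type. A slightly narrower variant, sketched here under the assumption that the table lives in a schema called DW (a hypothetical name) and that only character columns can carry newlines, might look like this. It uses INSTR rather than REGEXP_LIKE, which is my substitution and not what I originally ran.</p>
<pre>DECLARE
   l_owner    VARCHAR2(30) := 'DW';        -- hypothetical schema owner
   l_table    VARCHAR2(30) := 'DIM_FUND';
   l_results  NUMBER;
BEGIN
   -- Only character columns in one schema are candidates
   FOR x IN ( SELECT column_name
                FROM all_tab_columns
               WHERE owner      = l_owner
                 AND table_name = l_table
                 AND data_type IN ('CHAR','VARCHAR2','NCHAR','NVARCHAR2')
               ORDER BY column_id
            )
   LOOP
      -- Count the rows where this column contains a newline
      EXECUTE IMMEDIATE
         'select count(*) from "' || l_owner || '"."' || l_table
         || '" where instr("' || x.column_name || '", chr(10)) &gt; 0'
         INTO l_results;

      IF l_results &gt; 0
      THEN
         dbms_output.put_line( x.column_name || ': ' || l_results );
      END IF;
   END LOOP;
END;
/</pre>
<p>Remember to SET SERVEROUTPUT ON in SQL*Plus first, or the offending column names won&#8217;t be displayed.</p>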
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/07/oracle-olap-11gr2-and-single-line-indexed-attributes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real-time BI: EDW with a Real-time Component</title>
		<link>http://www.rittmanmead.com/2011/07/real-time-bi-edw-with-a-real-time-component/</link>
		<comments>http://www.rittmanmead.com/2011/07/real-time-bi-edw-with-a-real-time-component/#comments</comments>
		<pubDate>Wed, 06 Jul 2011 20:46:55 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>
		<category><![CDATA[Oracle Database]]></category>
		<category><![CDATA[Oracle Warehouse Builder]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=8630</guid>
		<description><![CDATA[I apologize for the long delay in getting this last portion of the Real-time discussion in place. Since I wrote the first two installments, we&#8217;ve had the BI Forum (US and UK versions), plus a flurry of activity around Rittman Mead in the US, followed up by KScope11. But a promise is a promise, and [...]]]></description>
			<content:encoded><![CDATA[<p>I apologize for the long delay in getting this last portion of the Real-time discussion in place. Since I wrote the first two installments, we&#8217;ve had the BI Forum (US and UK versions), plus a flurry of activity around Rittman Mead in the US, followed up by KScope11. But a promise is a promise, and here goes with the conclusion.</p>
<p>I laid out the general vocabulary and considerations for Real-time BI in <a href="http://www.rittmanmead.com/2011/05/real-time-bi-an-introduction/">my first post</a> in this series, and then followed up with how to implement Real-time BI using <a href="http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/">a federated approach</a> that relies on the metadata capabilities of OBIEE to blend two different environments into one. Now I&#8217;d like to discuss how we might implement a Real-time solution by relying on ETL instead of BI Tool metadata. I call this EDW with a Real-Time Component.</p>
<p>Whereas the Federated OLTP/EDW Reporting option gives us a way to layer real-time data into an otherwise classic batch-loaded EDW, delivering the EDW with a Real-Time Component requires designing an EDW from the ground up that supports real-time reporting. Specifically, we have to design our fact tables to support what Ralph Kimball calls the “real-time partition” in his book <em>The Kimball Group Reader</em>: “To achieve real-time reporting, we build a special partition that is physically and administratively separated from the conventional static data warehouse tables. Actually, the name partition is a little misleading. The real-time partition may be a separate table, subject to special rules for update and query.” We construct a separate section for each of our fact tables to facilitate the following four requirements, as defined by Kimball:</p>
<ol>
<li>Contain all activity since the last time the load was run</li>
<li>Link seamlessly to the grain of the static data warehouse tables</li>
<li>Be indexed so lightly that incoming data can “dribble in”</li>
<li>Support highly responsive queries</li>
</ol>
<p><img style="margin-left: auto;margin-right: auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/real-time-partition.png" border="0" alt="Real time partition" width="600" height="375" /></p>
<p>So we modify our model to support the interaction of real-time and static data, but we also modify our ETL to support this. In fact, to construct an EDW with a Real-Time Component, we have to build some very intricate interaction between the database, the data model and ETL processes. The static fact table is partitioned on a date data-type using standard Oracle partitioning strategies. The real-time partition is structured in such a way as to be loadable throughout the day. In other words, there are no indexes or constraints enabled on the table. ETL against the real-time partition uses a process comparable to traditional load scenarios, but using micro-batch instead, running as often as 100 times a day or more. Alternative methods include transactional style, record-by-record loading, possibly using web services or message-based systems such as JMS queues.</p>
<p>We effectively want to build a single logical fact table out of the combination of the static EDW fact table and the real-time fact partition. There are several ways to do this. We could use OBIEE fragmentation for this, as we saw in the <a href="http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/">last post.</a> This would work, but it&#8217;s not what I recommend. The reason we used fragmentation in the last post is because we were joining two completely different data sets across conformed dimensions into a unified model. However, with the real-time partition, we have two tables that have exactly the same structure (both using the same surrogate keys to the same dimension tables), just separated across different segments for performance reasons. In this case, I choose to UNION the two datasets with either a database view, or an opaque view in OBIEE.</p>
<p><img style="margin-left: auto;margin-right: auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/opaque-union-view.png" border="0" alt="Opaque union view" width="542" height="553" /></p>
<p>This works because we no longer have to control which source the rows will come from in particular situations: we simply pull all the rows, and use standard WHERE filters to limit the rows where applicable. The pruning that the BI Server did for us in the last post is handled by the Oracle Database in this case. We can, however, still present the static fact tables in situations that merit it: I&#8217;m thinking of financial reports here. Accountants don&#8217;t usually like their reports giving different results every time they run them.</p>
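<p>As a rough sketch of the database view option, assuming the static table is SALES_FACT and the real-time partition is SALES_FACT_RT: the view name and every column except SALES_DATE_KEY are hypothetical. I use UNION ALL here rather than a plain UNION, on the assumption that the two tables never hold the same row, which avoids an unnecessary sort to remove duplicates.</p>
<pre>-- Hypothetical view presenting both segments as one logical fact table
CREATE OR REPLACE VIEW sales_fact_v
AS
SELECT sales_date_key, customer_key, product_key, sales_amount
  FROM sales_fact         -- static, partitioned fact table
 UNION ALL
SELECT sales_date_key, customer_key, product_key, sales_amount
  FROM sales_fact_rt;     -- real-time partition, loaded intra-day

-- A standard WHERE filter, applied to both branches of the view
-- (assumes SALES_DATE_KEY is a DATE column)
SELECT sales_date_key, SUM(sales_amount) AS total_sales
  FROM sales_fact_v
 WHERE sales_date_key = DATE '2011-07-06'
 GROUP BY sales_date_key;</pre>
<p>Reports that must not change during the day, such as the financial ones, can keep pointing at SALES_FACT directly instead of the view.</p>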
<p>We have one issue with the load of the real-time partition: we are assuming that we receive all of our dimension data right along with our fact data in clean CDC subscription groups. That would likely be the case if we were pulling all the data for our data warehouse from a single source-system, but with enterprise data warehouses, that is rarely the case. Receiving dimension data early causes no problems with our load scenario; it doesn’t matter if we do the surrogate key lookup for the fact table load hours or days later than the dimensions. Receiving the fact table data early does present us with ETL logic issues: the correct dimension record may or may not be there when it’s time to load the facts.</p>
<p>There is a simple strategy to handle early-arriving facts. In our ETL, we implement a process to ensure that our facts are at least reportable intra-day:</p>
<ol>
<li>If a dimension record exists for the current business or natural key we are interested in, then grab the latest record. This is the best we can do at this point, and will usually be the correct value.</li>
<li>If no dimension record exists yet for the current natural key, then use a default record type equating to “Not Known Yet.” Though it’s not sexy for intra-day reporting, it at least makes the data available across the dimensions we do know about.</li>
<li>As we approach the end of the day and prepare to “close the books” for the current day, we should have run all dimension loads—even late arriving dimensions—so that our dimension tables are all up to date. At this point we run a corrective mapping to update all the fact records in the real-time partition with the right surrogate keys. This would likely be a MERGE statement, or a TRUNCATE/INSERT style mapping. From a performance perspective, my bet is on the latter.</li>
</ol>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2011/07/outer-join-mapping1.png"><img class="size-large wp-image-8631 alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/outer-join-mapping1-1024x354.png" alt="" width="737" height="255" /></a></p>
<p>&nbsp;</p>
<p>The above mapping loads the real-time partition in a micro-batch style doing an outer join to the CUSTOMER_DIM table and writing the &#8220;Not Known Yet&#8221; row in case a customer is not found. Also, I am employing a Splitter Operator in OWB, but I tricked it out to force it to load all rows to BOTH tables: SALES_FACT_RT and SALES_STG_RT. The reason for this is that we don&#8217;t write dimension natural keys into our fact tables, though I&#8217;ve seen that technique employed in some real-time implementations. So when it&#8217;s time to run our corrective mapping to correct our fact table data, we just join the SALES_STG_RT table to the now-correct dimension tables and produce the right surrogate keys for each fact record, and load the results into SALES_FACT_RT.</p>
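<p>The lookup logic in that mapping can be sketched in plain SQL. This is only an illustration under several assumptions: I am using the POS_TRANS and POS_TRANS_HEADER tables from the previous post as a stand-in for the CDC feed, CUSTOMER_ID and CURRENT_IND are hypothetical names for the natural key and the &#8220;latest record&#8221; flag on CUSTOMER_DIM, and -1 is assumed to be the surrogate key of the &#8220;Not Known Yet&#8221; row.</p>
<pre>SELECT NVL(d.customer_key, -1) AS customer_key,  -- assumed key of the "Not Known Yet" row
       h.cust_id               AS customer_id,   -- natural key, kept only for SALES_STG_RT
       t.trans_id,
       h.trans_date            AS sales_date,
       t.sal_amt               AS amount
  FROM gcbc_pos.pos_trans_header h
  JOIN gcbc_pos.pos_trans t
    ON t.trans_id = h.trans_id
  LEFT OUTER JOIN gcbc_edw.customer_dim d
    ON d.customer_id = h.cust_id                 -- hypothetical natural key column
   AND d.current_ind = 'Y'                       -- hypothetical "latest record" flag
 WHERE h.trans_date &gt;= TRUNC(SYSDATE)</pre>
<p>The outer join keeps every fact row even when the customer has not arrived yet, and the NVL supplies the default key. The other dimensions would follow the same pattern.</p>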
<p>When “closing the books” on the day, we build indexes and constraints on the real-time partition that match those on the partitioned fact table. Once this step is complete, we can use a partition-exchange operation to swap the real-time partition into the static fact table. In Oracle, this is a fast data dictionary update and occurs almost instantaneously.<br />
Obviously, our partitioning choice for the fact table will determine exactly how this partition-exchange will occur. If we’ll agree to partition the fact table by DAY, then we can use Oracle Interval partitioning, available in Oracle 11gR1 and beyond. We have to make this concession because Interval partitioning tables cannot have partitions in the same table that contain different range-based boundaries. For instance, we can’t have some MONTH-based partitions, while also having some DAY-based partitions, as we can with regular range-based partitioning. Using Interval partitioning is the easiest method, however, because it requires the least amount of partition maintenance as part of the load. For instance, consider the SALES_FACT table listed below, using Interval partitioning on the SALES_DATE_KEY, which we partition on at the DAY grain:</p>
<pre>CREATE TABLE sales_fact
       (
         customer_key           NUMBER           NOT NULL,
         product_key            NUMBER           NOT NULL,
         staff_key              NUMBER           NOT NULL,
         store_key              NUMBER           NOT NULL,
         sales_date_key         DATE             NOT NULL,
         trans_id               NUMBER,
         trans_line_id          NUMBER,
         sales_date             DATE,
         unit_price             NUMBER,
         quantity               NUMBER,
         amount                 NUMBER
       )
       partition BY range (sales_date_key)
       interval (numtodsinterval(1,'DAY'))
       (
         partition sales_fact_2006 VALUES less than (to_date('2007-01-01','YYYY-MM-DD'))
       )
       COMPRESS
/</pre>
<p>Each time we load a record into SALES_FACT for which no partition currently exists, Oracle will spawn one for the table. But based on our real-time requirements, we will use a partition-exchange operation every day to close the books on the current day processing, so each day, we will need to spawn a clean, new partition to facilitate that partition-exchange. All we need to do to make this happen is issue an insert statement with a DATE value for the partitioning key that equates to TRUNC(SYSDATE). For instance, the following statement would generate a new partition that we can use for the exchange:</p>
<pre>SQL&gt; INSERT INTO gcbc_edw.sales_fact
  2         (
  3           customer_key,
  4           product_key,
  5           staff_key,
  6           store_key,
  7           sales_date_key,
  8           trans_id,
  9           trans_line_id,
 10           sales_date,
 11           unit_price,
 12           quantity,
 13           amount)
 14         VALUES
 15         (
 16           -1,
 17           -1,
 18           -1,
 19           -1,
 20           trunc(SYSDATE),
 21           -1,
 22           -1,
 23           SYSDATE,
 24           0,
 25           0,
 26           0
 27         )
 28  /

1 row created.

Elapsed: 00:00:00.01
SQL&gt;</pre>
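<p>To confirm that the insert really did spawn a partition, one option is to query the data dictionary. A minimal sketch, assuming the connected user can see the GCBC_EDW schema through ALL_TAB_PARTITIONS:</p>
<pre>-- INTERVAL = 'YES' marks partitions that Oracle created automatically;
-- the system-generated partition names will differ from system to system.
SELECT partition_name,
       high_value,
       interval
  FROM all_tab_partitions
 WHERE table_owner = 'GCBC_EDW'
   AND table_name  = 'SALES_FACT'
 ORDER BY partition_position</pre>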
<p>Once the insert has created our new SYSDATE-based partition, we can exchange the real-time partition in for this new partition. We can use the new PARTITION FOR clause, which lets us reference a partition by a partition key value rather than by name, with a slight caveat. Though we can’t use SYSDATE explicitly in the DDL statement, we can reference it implicitly by building the statement dynamically in PL/SQL and supplying the current date as a literal:</p>
<pre>SQL&gt; DECLARE
  2     l_date DATE := SYSDATE;
  3     l_sql  LONG;
  4  BEGIN
  5     l_sql :=   q'|alter table gcbc_edw.sales_fact exchange partition|'
  6             || chr(10)
  7             || q'|for ('|'
  8             || l_date
  9             || q'|') with table gcbc_edw.sales_fact_rt|';
 10
 11     dbms_output.put_line( l_sql );
 12     EXECUTE IMMEDIATE( l_sql );
 13  END;
 14  /

alter table gcbc_edw.sales_fact exchange partition
for ('03/01/2011 09:38:33 PM') with table gcbc_edw.sales_fact_rt

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
SQL&gt;</pre>
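<p>The block above relies on the session NLS_DATE_FORMAT to turn the date into a string and back again. A variation that spells the format out explicitly might look like the following. Treat it as a sketch rather than the procedure used in the process flow:</p>
<pre>DECLARE
   l_sql VARCHAR2(4000);
BEGIN
   -- Build the key value as an explicit TO_DATE expression so the statement
   -- does not depend on the session date format
   l_sql :=    'alter table gcbc_edw.sales_fact exchange partition for (to_date('''
            || TO_CHAR( TRUNC(SYSDATE), 'YYYY-MM-DD' )
            || ''', ''YYYY-MM-DD'')) with table gcbc_edw.sales_fact_rt';

   dbms_output.put_line( l_sql );
   EXECUTE IMMEDIATE l_sql;
END;
/</pre>
<p>Because the value is truncated to midnight, it always falls inside the DAY partition that the earlier insert created.</p>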
<p>Using the preferred Interval partitioning option, the final “close the books” process flow is shown below. The first step is to run any late-arriving dimension mappings: in this example, the MAP_CUSTOMER_DIM mapping. Once all the dimensions are up-to-date, we can run the process that corrects all the dimension keys in the real-time partition. Remember, the real-time partition contains small data sets, so updating these records should not be resource intensive. In this scenario, the mapping MAP_CORRECT_SALES_FACT_RT issues an Oracle MERGE statement, but it is quite likely that a TRUNCATE/INSERT statement would work just as well. Once all the data in the real-time partition is correct and ready to go, we run the MAP_CREATE_PARTITION mapping, which uses an insert statement to spawn a new partition. The EXCHANGE_PARTITION PL/SQL procedure then builds indexes and constraints, and completes the process by issuing the partition-exchange statement.</p>
<p><img style="margin-left: auto;margin-right: auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/07/corrective-process-flow1.png" border="0" alt="Corrective process flow" width="545" height="275" /></p>
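<p>For comparison, here is a sketch of what the TRUNCATE/INSERT flavor of the corrective step might look like. The assumptions are mine: SALES_STG_RT is taken to live in GCBC_EDW and to carry the same columns as SALES_FACT_RT plus a CUSTOMER_ID natural key, CURRENT_IND is a hypothetical flag marking the latest CUSTOMER_DIM row, and only the customer key needs correcting.</p>
<pre>TRUNCATE TABLE gcbc_edw.sales_fact_rt;

-- Rebuild the real-time partition from staging, this time against
-- dimension tables that are known to be up to date
INSERT INTO gcbc_edw.sales_fact_rt
       ( customer_key, product_key, staff_key, store_key, sales_date_key,
         trans_id, trans_line_id, sales_date, unit_price, quantity, amount )
SELECT NVL(d.customer_key, -1),          -- -1 only if the customer is still unknown
       s.product_key, s.staff_key, s.store_key, s.sales_date_key,
       s.trans_id, s.trans_line_id, s.sales_date, s.unit_price, s.quantity, s.amount
  FROM gcbc_edw.sales_stg_rt s
  LEFT OUTER JOIN gcbc_edw.customer_dim d
    ON d.customer_id = s.customer_id     -- hypothetical natural key column
   AND d.current_ind = 'Y';              -- hypothetical "latest record" flag

COMMIT;</pre>
<p>For a slowly-changing dimension that tracks history, the join would more likely match the transaction date against an effective date range than rely on a current-row flag.</p>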
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/07/real-time-bi-edw-with-a-real-time-component/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>OBIEE 11g Bootcamp Scheduled for August 1st &#8211; 5th in Atlanta</title>
		<link>http://www.rittmanmead.com/2011/07/obiee-11g-bootcamp-scheduled-for-august-1st-5th-in-atlanta/</link>
		<comments>http://www.rittmanmead.com/2011/07/obiee-11g-bootcamp-scheduled-for-august-1st-5th-in-atlanta/#comments</comments>
		<pubDate>Wed, 06 Jul 2011 04:51:56 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Courses]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>
		<category><![CDATA[Professional]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=8616</guid>
		<description><![CDATA[This is just a quick note to announce that we have scheduled another OBIEE 11g Bootcamp course in Atlanta in early August. We&#8217;ve been working on the OBIEE 11g material since we received the first beta release, and we&#8217;ve run successive versions of this material through two rounds of international Training Days events last year [...]]]></description>
			<content:encoded><![CDATA[<p>This is just a quick note to announce that we have scheduled another OBIEE 11g Bootcamp course in Atlanta in early August. We&#8217;ve been working on the OBIEE 11g material since we received the first beta release, and we&#8217;ve run successive versions of this material through two rounds of international Training Days events last year and this year which took us to London, Atlanta, Bangalore and Belgium. Having run the full-blown Bootcamp course successfully several times in the last few months, we are pleased to offer attendees another chance to attend this course publicly.</p>
<p>For more information, see our <a href="http://www.rittmanmead.com/training/">training page</a> under the section Scheduled Public Training Courses. If you have additional questions, feel free to reach out to us at training@rittmanmead.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/07/obiee-11g-bootcamp-scheduled-for-august-1st-5th-in-atlanta/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rittman Mead at ODTUG Kscope 2011</title>
		<link>http://www.rittmanmead.com/2011/06/rittman-mead-at-odtug-kscope-2011/</link>
		<comments>http://www.rittmanmead.com/2011/06/rittman-mead-at-odtug-kscope-2011/#comments</comments>
		<pubDate>Sat, 25 Jun 2011 15:02:12 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>
		<category><![CDATA[Oracle EPM]]></category>
		<category><![CDATA[Rittman Mead]]></category>
		<category><![CDATA[User Groups & Conferences]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=8537</guid>
		<description><![CDATA[I&#8217;m still not used to saying &#8220;Kscope&#8221;&#8230; it sounds like a medical screening that I know I should get when I turn 40. Regardless, I&#8217;m looking forward to the event for all the usual reasons: seeing good friends, seeing all the great speakers in the Oracle community, and generally celebrating what it is we all [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m still not used to saying &#8220;Kscope&#8221;&#8230; it sounds like a medical screening that I know I should get when I turn 40. Regardless, I&#8217;m looking forward to the event for all the usual reasons: seeing good friends, seeing all the great speakers in the Oracle community, and generally celebrating what it is we all do for a living. As I&#8217;ve documented over the years (<a href="http://www.rittmanmead.com/2010/07/kaleidoscope-is-a-wrap/">here</a> for instance), Kaleidoscope is one of my favorite conferences because it seems to be the one the community has the most control over. This year, we&#8217;ll be in Long Beach, and you can find out all the relevant facts at the <a href="http://kscope11.com/">Kaleidoscope web site</a>.</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/06/kscope2011.png" alt="Kscope2011" border="0" width="600" height="104" /></p>
<p>Of course, Mark Rittman will be there in his capacity as Rittman Mead Technical Director and Evangelist, Oracle ACE Director, and also, as an ODTUG officer. Mark&#8217;s sessions and events are listed here:</p>
<p><strong>Sunday: 8.30am &#8211; 4.30pm, Room 202B: BI Symposium (Organizer)<br />
Monday: 1.15pm &#8211; 2.15pm, Room 101A: OBIEE 11g Answers, Dashboards, Scorecards &amp; Reporting New Features (Presentation)<br />
Tuesday: 12.15pm &#8211; 1.45pm, Room Promenade B&amp;C: BI &amp; EPM Lunch &amp; Learn with the ACE Directors (Panelist)<br />
Wednesday: 9.45am &#8211; 10.45am, 101A: OBIEE 11g Architecture &amp; Internals (Presentation)</strong></p>
<p>From Rittman Mead America, we have Charles Elliott, our Senior OBIEE specialist delivering an OBIEE 11g Hands-On Training (HoT) session where attendees will learn to use New Report Prompts, Action Links, Custom Groups, and Hierarchical Columns. If you haven&#8217;t yet experimented with the new front-end features in OBIEE 11g, then this is the HoT for you:</p>
<p><strong>Wednesday: 1:45pm &#8211; 5:15pm, Hyatt Seaview: HoT Session F, Oracle BI 11g Answers and Dashboards (Hands-On Training)</strong></p>
<p>I&#8217;ll be attending as well. I&#8217;ll be in the room with Charles for the HoT Session F, but I&#8217;m also doing a presentation as well as participating on the EPM/BI Experts Panel moderated by Natalie Delemar. This was interesting to me in that it apparently takes four Hyperion/Essbase panelists to adequately represent that side of the house, but only one OBIEE expert is required: me. All joking aside, I know this panel will be heavily attended by the Hyperion-minded, and I may be a straw man for frustration they&#8217;ve had with OBIEE over the years, but I&#8217;m ready! There&#8217;s a much better story to tell now with OBIEE 11g and OLAP, and hopefully they&#8217;ll at least give me a chance to speak before they start throwing the fruit:</p>
<p><strong>Tuesday: 11:15am &#8211; 12:15pm, Room 102AB: EPM/BI Expert Panel (Panelist)<br />
Wednesday: 1:45pm &#8211; 5:15pm, Hyatt Seaview: HoT Session F, Oracle BI 11g Answers and Dashboards (Hands-On Training)<br />
Thursday: 8:30am &#8211; 9:30am (Seriously?), 101A: Agile Data Warehousing with Exadata and OBIEE11g (Presentation)</strong></p>
<p>As usual, we are always happy to speak to attendees, customers or other community members, so if you see Mark, Charles or myself around, feel free to stop us and say hello. In case you&#8217;re unsure&#8230; I&#8217;m the short one.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/06/rittman-mead-at-odtug-kscope-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real-time BI: Federated OLTP/EDW Reporting</title>
		<link>http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/</link>
		<comments>http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/#comments</comments>
		<pubDate>Mon, 16 May 2011 16:42:41 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>
		<category><![CDATA[Oracle Database]]></category>
		<category><![CDATA[Oracle Warehouse Builder]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=8243</guid>
		<description><![CDATA[The typical approach in Federated OLTP/EDW reporting environments is to use a BI tool such as OBIEE to do horizontal federation. This means combining data from multiple sources at the same grain in a single logical table. One note of clarification: my use of the word &#8220;federated&#8221; might be a misnomer, and I apologize in [...]]]></description>
			<content:encoded><![CDATA[<p>The typical approach in Federated OLTP/EDW reporting environments is to use a BI tool such as OBIEE to do horizontal federation. This means combining data from multiple sources at the same grain in a single logical table. One note of clarification: my use of the word &#8220;federated&#8221; might be a misnomer, and I apologize in advance. As I argued in the <a href="http://www.rittmanmead.com/2011/05/real-time-bi-an-introduction/">last post</a>, the best practice for performance reasons is to actually stream, or &#8220;GoldenGate&#8221; the source system data to a foundation layer on the data warehouse instance. But old habits die hard, so I&#8217;ll continue to refer to this as &#8220;federation&#8221; even though it may not be technically accurate. Thanks for the latitude.</p>
<p>One of the sources for federation is a classic, batch-loaded EDW, with ETL processes that load conformed dimension tables, followed by fact tables that store the measures and calculations for the enterprise. Oracle Warehouse Builder (OWB), the ETL tool built inside the Oracle Database, is a standard choice for data warehouses built on the Oracle Database, and below, I show a sample process flow of what that batch load might look like:</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/batch-DW.png" alt="Batch DW" border="0" width="600" height="326" /></p>
<p>Logical table sources (LTS’s) are a key feature within the OBIEE semantic model but are often misunderstood. Each LTS represents a single location for data to exist for either a logical fact table, or logical dimension table. A logical table in the BMM can have multiple LTS’s for any of the following reasons:</p>
<p>1. Including different table sources into a single logical table at different levels of granularity. Tables containing data pre-aggregated at a different level in a hierarchy are a common example of this scenario, which is known as &#8220;vertical fragmentation&#8221;.</p>
<p>2. Including different table sources into a single logical table at the same level of granularity. Having data exist in two different locations, but wanting them to be combined in particular situations, is a common example of this scenario, and is known as &#8220;horizontal fragmentation&#8221;.</p>
<p>Using horizontal fragmentation in OBIEE, we can map a single logical fact table to multiple LTS’s. For example, suppose we had a physical fact table in our EDW called SALES_FACT. To represent that fact table in the semantic model, we would create a logical fact table in the BMM — called “Sales Fact Realtime” in this example — and create an LTS that maps to the SALES_FACT table. We would also map another LTS which presents this data in the source system as well. As the source system is transactional and likely exists in third-normal form (3NF), the LTS that maps to the transactional schema would likely not be a simple one-to-one relationship. In 3NF, we would likely have to join multiple tables in our source system to represent the logical fact table Sales Fact Realtime:</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/source-to-target-fact.png" alt="Source to target fact" border="0" width="600" height="270" /></p>
<p>We would have to do something comparable with the Customer Dimension:</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/source-to-target-dimension.png" alt="Source to target dimension" border="0" width="600" height="297" /></p>
<p>With the two LTS&#8217;s, we still need to configure the horizontal fragmentation. For this implementation, I have configured a repository variable called RV_REALTIME_THRESHOLD_DT, with an initialization block that keeps the value consistently at TRUNC(SYSDATE). I use this variable as the threshold between reporting against the EDW schema and the source system schema.</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/init-block1.png" alt="Init block" border="0" width="530" height="439" /></p>
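<p>The SQL behind an initialization block like this can be as small as a single-row query. A minimal sketch, assuming the block runs against the Oracle Database on a refresh schedule:</p>
<pre>-- Returns midnight of the current day; the result populates RV_REALTIME_THRESHOLD_DT
SELECT TRUNC(SYSDATE) FROM dual</pre>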
<p>Once I have the variable available, I can configure the fragmentation on the fact table to use the threshold to determine the appropriate source for a particular record. This is less complicated with the EDW LTS&#8230; simple fragmentation configured for all rows with a transaction date less than the threshold date:</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/fragmentation-EDW.png" alt="Fragmentation EDW" border="0" width="432" height="508" /></p>
<p>The source system LTS needs more care. Only the source system contains the newer rows needed for layering in real-time data, but both the EDW and the source system contain historic data, albeit the EDW data is likely transformed to a certain degree. So we have to configure fragmentation using the RV_REALTIME_THRESHOLD_DT variable, and we also have to use that variable as a filter on the source system LTS to make sure historic rows are not returned from both sources and counted twice.</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/fragmentation-OLTP.png" alt="Fragmentation OLTP" border="0" width="436" height="507" /></p>
<p>What’s the result of all this complex mapping among different LTS’s in the BMM? OBIEE understands that each source schema is completely segmented, and the tables in each LTS never join to tables in the other LTS… but they do union. OBIEE will construct a complete query against the transactional schema, in this example, joining between the CUSTOMER_DEMOG_TYPES, CUSTOMERS, POS_TRANS and POS_TRANS_HEADER tables. Additionally, OBIEE will construct another complete query against the EDW schema, in this case, only the tables SALES_FACT and CUSTOMER_DIM. The BI Server then logically unions the results between the two source schemas into a single result set that is returned whenever a user builds a report against the logical tables Customer Dim and Sales Fact Realtime. So I run the following report against my fragmented Sales Fact Realtime:</p>
<p><img style="margin-left:auto;margin-right:auto" src="http://www.rittmanmead.com/wp-content/uploads/2011/05/high-level-report-federated.png" alt="High level report federated" border="0" width="461" height="468" /></p>
<p>The interesting part is how OBIEE does the logical union. When the EDW and the transactional schema exist in separate databases, the BI Server issues two different database queries and combines them into a single result set in its own memory space. However, if the schemas exist within the same database, as the Oracle Next-Generation Reference Architecture recommends, then the BI Server is able to issue a single query, transforming the logical union into an actual physical union in the SQL statement, as demonstrated in the statement below. Notice that the threshold filter has been applied to the source system query, and that the UNION ALL was constructed in a single SQL statement pushed down from the BI Server to the Oracle Database holding the Foundation and Presentation and Access layers in our Oracle architecture:</p>
<pre>
WITH
SAWITH0 AS (select T44105.AMOUNT as c1,
     T44042.CUSTOMER_LAST_NAME as c2,
     T48199.CALENDAR_MONTH_NUMBER as c3,
     T48199.CALENDAR_YEAR as c4,
     T48199.SQL_DATE as c5
from
     GCBC_EDW.DATE_DIM T48199 /* CONFORMED_DATE_DIM */ ,
     GCBC_EDW.CUSTOMER_DIM T44042,
     GCBC_EDW.SALES_FACT T44105
where  ( T44042.CUSTOMER_KEY = T44105.CUSTOMER_KEY and T44105.SALES_DATE_KEY = T48199.DATE_KEY ) ),
SAWITH1 AS (select T43971.SAL_AMT as c1,
     T43901.CUST_LAST_NAME as c2,
     T48199.CALENDAR_MONTH_NUMBER as c3,
     T48199.CALENDAR_YEAR as c4,
     T48199.SQL_DATE as c5
from
     GCBC_EDW.DATE_DIM T48199 /* CONFORMED_DATE_DIM */ ,
     GCBC_CRM.CUSTOMERS T43901,
     GCBC_POS.POS_TRANS T43971,
     GCBC_POS.POS_TRANS_HEADER T43978
where  ( T43901.CUST_ID = T43978.CUST_ID
         and T43971.TRANS_ID = T43978.TRANS_ID
         <strong>and T48199.DATE_KEY =  TRUNC(T43978.TRANS_DATE)
         and T43978.TRANS_DATE &gt;= TO_DATE('2011-05-16 00:00:00' , 'YYYY-MM-DD HH24:MI:SS') </strong>
       )),
SAWITH2 AS ((select concat(D0.c4, D0.c3) as c2,
     D0.c5 as c3,
     D0.c2 as c4,
     D0.c1 as c5
from
     SAWITH0 D0
union all
select concat(D0.c4, D0.c3) as c2,
     D0.c5 as c3,
     D0.c2 as c4,
     D0.c1 as c5
from
     SAWITH1 D0)),
SAWITH3 AS (select sum(D3.c5) as c1,
     D3.c2 as c2,
     D3.c3 as c3,
     D3.c4 as c4
from
     SAWITH2 D3
group by D3.c2, D3.c3, D3.c4)
select distinct 0 as c1,
     D2.c2 as c2,
     D2.c3 as c3,
     D2.c4 as c4,
     D2.c1 as c5
from
     SAWITH3 D2
order by c2, c4, c3
</pre>
<p>But OBIEE is also capable of doing the fragmentation equivalent of &#8220;partition pruning.&#8221; When the BI Server has enough information to know that the entire result set will come from a single source, then the SQL will be issued against only one of the LTS&#8217;s. For instance, if I click on one of the &#8220;SQL Date&#8221; attributes in the above report which will apply a filter on the fragmentation column, the BI Server will know that the result set only comes from the EDW:</p>
<pre>WITH
SAWITH0 AS (select sum(T44105.AMOUNT) as c1,
     concat(T48199.CALENDAR_YEAR, T48199.CALENDAR_MONTH_NUMBER) as c2,
     T48199.DATE_KEY as c3,
     T48199.SQL_DATE as c4,
     T44042.CUSTOMER_LAST_NAME as c5
from
     GCBC_EDW.DATE_DIM T48199 /* CONFORMED_DATE_DIM */ ,
     GCBC_EDW.CUSTOMER_DIM T44042,
                   GCBC_EDW.SALES_FACT T44105
where  ( T44042.CUSTOMER_KEY = T44105.CUSTOMER_KEY
         and T44042.CUSTOMER_LAST_NAME = 'Carr'
         and T44105.SALES_DATE_KEY = T48199.DATE_KEY
         <strong>and T48199.SQL_DATE = TO_DATE('2009-07-03' , 'YYYY-MM-DD')</strong>
         and concat(T48199.CALENDAR_YEAR, T48199.CALENDAR_MONTH_NUMBER) = '200907' )
group by T44042.CUSTOMER_LAST_NAME,
         T48199.DATE_KEY,
         T48199.SQL_DATE,
         concat(T48199.CALENDAR_YEAR, T48199.CALENDAR_MONTH_NUMBER))
select distinct 0 as c1,
     D1.c2 as c2,
     D1.c3 as c3,
     D1.c4 as c4,
     D1.c5 as c5,
     D1.c1 as c6
from
     SAWITH0 D1
order by c2, c5, c4, c3</pre>
<p>Before closing this section of the real-time discussion, I want to take a minute to identify the strengths and weaknesses of this approach. As far as strengths go, we have several items that register with this solution. First off&#8230; this is a low-latency solution. When using the Oracle Next-Generation Reference Architecture, we have the latency of streaming, or &#8220;GoldenGating,&#8221; the content from the source system to the DW database. With clients we&#8217;ve had in the past, this can run anywhere from a few seconds to several minutes, depending on the solution implemented. Additionally, there is no complex logical or physical data modeling and supporting ETL to deliver this solution, as there is with the EDW with a Real-Time Component, which we will explore in the next posting.</p>
<p>As far as weaknesses go, there will be a fair amount of complex RPD semantic-layer modeling. Obviously, the degree of difficulty depends on a number of factors: number of source systems integrated, number of subject areas, complexity of reports delivered, etc. Also, increased complexity of RPD modeling may introduce performance degradation as OLTP schemas have to be transformed &#8220;on the fly&#8221; to star schemas by the BI Server. But keep in mind&#8230; we are typically only doing this for at most a day&#8217;s worth of data, so with proper database tuning, this content can usually perform quite well.</p>
<p>Next up: EDW with a Real-Time Component</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

