<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Rittman Mead Consulting &#187; Methodology</title>
	<atom:link href="http://www.rittmanmead.com/category/methodology/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rittmanmead.com</link>
	<description>Delivering Oracle Business Intelligence</description>
	<lastBuildDate>Mon, 06 Feb 2012 21:18:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: ETL Iteration</title>
		<link>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/</link>
		<comments>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 04:22:31 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[BI 2.0]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9954</guid>
		<description><![CDATA[This is the fourth entry in my series on Agile Data Warehousing with Exadata and OBIEE. To see all the previous posts, check the introductory posting which I have updated with all the entries in the series. In the last post, I describe what I call the Model-Driven iteration, where we take thin requirements from the [...]]]></description>
			<content:encoded><![CDATA[<p>This is the fourth entry in my series on Agile Data Warehousing with Exadata and OBIEE. To see all the previous posts, check the <a title="Agile Data Warehousing with Exadata and OBIEE: Introduction" href="http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/">introductory posting</a> which I have updated with all the entries in the series.</p>
<p>In the last post, I described what I call the Model-Driven iteration, where we take thin requirements from the end-user in the form of a user story and generate the access and performance layer, or our star schema, logically using the OBIEE semantic model. Our first several iterations will likely be Model-Driven as we work with the end user to fine-tune the content he or she wants to see on the OBIEE dashboards. As user stories are opened, completed and validated throughout the project, end users are prioritizing them for the development team to work on. Eventually, there will come a time when an end user opens a story that is difficult to model in the semantic layer. Processes to correct data quality issues are a good example, and despite having the power of Exadata at our disposal, we may find ourselves in a performance hole that even the Database Machine can&#8217;t dig us out of. In these situations, we reflect on our overall solution and consider the maxim of Agile methodology: &#8220;refactoring&#8221;, or &#8220;rework&#8221;.</p>
<p>For Extreme BI, the main form of refactoring is ETL. The pessimist might say: &#8220;Well, now we have to do ETL development, what a waste of time all that RPD modeling was.&#8221; But is that the case? First off&#8230; think about our users. They have been running dashboards for some time now with at least a portion of the content they need to get their jobs done. As the die-hard Agile proponent will tell you&#8230; some is better than none. But also&#8230; the process of doing the Model-Driven iteration puts our data modelers and our ETL developers in a favorable position. We&#8217;ve eliminated the exhaustive data modeling process, because we already have our logical model in the Business Model and Mapping layer (BMM).</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Full-Logical-Model.png"><img class="alignnone size-large wp-image-9976" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Full-Logical-Model-1024x559.png" alt="" width="614" height="335" /></a></p>
<p>But we have more than that. We also have our source-to-target information documented in the semantic metadata layer. We can see that information using the Admin Tool, as depicted below, or we can also use the &#8220;Repository Documentation&#8221; option to generate some documented source-to-target mappings.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png"><img class="size-full wp-image-9883  alignnone" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Model-Driven-Map-Dimension.png" alt="" width="671" height="219" /></a></p>
<p>When embarking on ETL development, it&#8217;s common to do SQL prototyping before starting the actual mappings to make sure we understand the particulars of granularity. However, we already have these SQL prototypes in the nqquery.log file&#8230; all we have to do is look at it. The combination of the source-to-target mapping and the SQL prototypes provides all the artifacts necessary to get started with the ETL.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Query-Log.png"><img class="alignnone size-large wp-image-9982" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Query-Log-1024x598.png" alt="" width="645" height="377" /></a></p>
<p>When using ETL processing to &#8220;instantiate&#8221; our logical model into the physical world, we can&#8217;t abandon our Agile imperatives: we must still deliver the new content, and corresponding rework, within a single iteration. So whether the end user is opening the user story because the data quality is abysmal, or because the performance is just not good enough, we must vow to deliver the ETL Iteration time-boxed, in exactly the same manner that we delivered the Model-Driven Iteration. So, if we imagine that our user opens a story about data quality in our Customer and Product dimensions, and we decide that all we have time for in this iteration are those two dimension tables, does it make sense for us to deliver those items in a vacuum? With the image below depicting the process flow for an entire subject area, can we deliver it piecemeal instead of all at once?</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Piecemeal-Process-Flow.png"><img class="alignnone size-full wp-image-9968" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Piecemeal-Process-Flow.png" alt="" width="636" height="348" /></a></p>
<p>The answer, of course, is that we can. We&#8217;ll develop the model and ETL exactly as we would if our goal was to plug the dimensions into a complete subject area. We use surrogate keys as the primary key for each dimension table, facilitating joining our dimension tables to completed fact tables. But we don&#8217;t have completed fact tables at this point in our project&#8230; instead we have a series of transaction tables that work together to form the basis of a logical fact table. How can we use a dimension table with a surrogate key to join to our transactional &#8220;fact&#8221; table that doesn&#8217;t yet have these surrogate keys?</p>
<p>We fake it. Along with surrogate keys, the long-standing best practice of dimension table delivery has been to include the source system natural key, as well as effective dates, in all our dimension tables. These attributes are usually included to facilitate slowly-changing dimension (SCD) processing, but we&#8217;ll exploit them for our Agile piecemeal approach as well. So in our example below, we have a properly formed Customer dimension that we want to join to our logical fact table, as depicted below:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Partial-Hybrid-Model-e1327470743307.png"><img class="alignnone size-full wp-image-9995" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Partial-Hybrid-Model-e1327470743307.png" alt="" width="596" height="200" /></a></p>
<p>We start by creating aliases to our transactional &#8220;fact&#8221; tables (called POS_TRANS_HYBRID and POS_TRANS_HEADER_HYBRID in the example above), because we don&#8217;t want to upset the logical table source (LTS) that we are already using for the pure transactional version of the logical fact table. We create a complex join between the customer source system natural key and transaction date in our hybrid alias, and the natural key and effective dates in the dimension table. We use the effective dates as well to make sure we grab the correct version of the customer entity in question in situations where we have enabled Type 2 SCDs (the usual standard) in our dimension table.</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline.png"><img class="alignnone size-large wp-image-10007" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-1024x869.png" alt="" width="574" height="486" /></a></p>
<p>This complex logic of using the natural key and effective dates is identical to the logic we would use in what Ralph Kimball calls the &#8220;surrogate pipeline&#8221;: the ETL processing used to replace natural keys with surrogate keys when loading a proper fact table. Using Customer and Sales attributes in an analysis, we can see the actual SQL that&#8217;s generated:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-SQL.png"><img class="alignnone size-large wp-image-10025" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Surrogate-Pipeline-SQL-1024x510.png" alt="" width="645" height="321" /></a></p>
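<p>The natural-key-plus-effective-dates lookup described above can be sketched outside the RPD as well. What follows is only a minimal illustration in Python, not the SQL the BI Server generates: the class, the column names and the sample rows are all hypothetical, and it assumes the effective-from date is inclusive and the effective-to date is exclusive.</p>

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerVersion:
    """One Type 2 SCD row in a hypothetical customer dimension."""
    customer_key: int      # surrogate key
    customer_id: str       # source system natural key
    effective_from: date   # first day this version is valid (inclusive)
    effective_to: date     # first day it is no longer valid (exclusive)
    segment: str


def resolve_version(dimension, customer_id, transaction_date):
    """Return the dimension row valid for a natural key on a given date."""
    matches = [
        row for row in dimension
        if row.customer_id == customer_id
        and transaction_date >= row.effective_from
        and row.effective_to > transaction_date
    ]
    # A well-formed Type 2 dimension has exactly one version per date.
    if len(matches) != 1:
        raise LookupError(f"expected one version, found {len(matches)}")
    return matches[0]


# Made-up sample data: the same customer changes segment on 1 Sep 2011.
dimension = [
    CustomerVersion(101, "C-17", date(2011, 1, 1), date(2011, 9, 1), "Retail"),
    CustomerVersion(102, "C-17", date(2011, 9, 1), date(9999, 12, 31), "Wholesale"),
]

early = resolve_version(dimension, "C-17", date(2011, 6, 15))
late = resolve_version(dimension, "C-17", date(2011, 12, 24))
print(early.customer_key, early.segment)
print(late.customer_key, late.segment)
```

<p>Each transaction resolves to exactly one surrogate key, which is the same outcome the surrogate pipeline produces at load time; the hybrid join simply defers that resolution to query time.</p>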
<p>We can view this hybrid approach as an intermediate step, but there is also nothing wrong with this as a long-term approach if the users are happy and Exadata makes our queries scream. If you think about it&#8230; a surrogate key is an easy way of representing the natural key of the table, which is the source system natural key plus the unique effective dates for the entity. A surrogate key makes this relationship much easier to envision, and certainly easier to code in SQL, but when we are insulated from the ugliness of the join with Extreme Metadata, do we really care? If our end users ever open a story asking for rework of the fact table, we may consider manifesting that table physically as well. Once complete, we would need to create another LTS for the Customer dimension (using an alias to keep it separate from the table that joins to the transactional tables). This alias would be configured to join directly to the new Sales fact table across the surrogate key&#8230; exactly how we would expect a traditional data warehouse to be modeled in the BMM. The physical model will look nearly identical to our logical model, and the generated SQL will be less interesting:</p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Fact-LTS.png"><img class="alignnone size-full wp-image-10033" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Fact-LTS.png" alt="" width="221" height="226" /></a></p>
<p><a href="http://www.rittmanmead.com/wp-content/uploads/2012/01/Star-Schema-SQL.png"><img class="alignnone size-large wp-image-10029" src="http://www.rittmanmead.com/wp-content/uploads/2012/01/Star-Schema-SQL-1024x420.png" alt="" width="645" height="265" /></a></p>
<p>Now that I&#8217;ve described the Model-Driven and ETL Iterations, it&#8217;s time to discuss what I call the Combined Iteration, which is likely what most of the iterations will look like when the project has achieved some maturity. In Combined Iterations, we work on adding new or refactored RPD content alongside new or refactored ETL content in the same iteration. Now the project really makes sense to the end user. We allow the user community&#8211;those who are actually consuming the content&#8211;to dictate to the developers with user stories what they want the developers to work on in the next iteration. The users will constantly open new stories, some asking for new content, and others requesting modifications to existing content. All Agile methodologies put the burden of prioritizing user stories squarely on the shoulders of the user community. Why should IT dictate to the user community where priorities lie? If we have delivered fabulous content sourced with the Model-Driven paradigm, and Exadata provides the performance necessary to make this &#8220;real&#8221; content, then there is no reason for the implementors to dictate to the users the need to manifest that model physically with ETL when they haven&#8217;t asked for it. If whole portions of our data warehouse are never implemented physically with ETL&#8230; do we care? The users are happy with what they have, and they think performance is fine&#8230; do we still force a &#8220;best practice&#8221; of a physical star schema on users who clearly don&#8217;t want it?</p>
<p>So that&#8217;s it for the Extreme BI methodology. At the outset of this series&#8230; I thought it would require five blog posts to make the case, but I was able to do it in four instead. So even when delivering blog posts, I can&#8217;t help but rework as I go along. Long live Agile!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces</title>
		<link>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/</link>
		<comments>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/#comments</comments>
		<pubDate>Wed, 28 Dec 2011 19:39:56 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9637</guid>
		<description><![CDATA[In the previous post, I laid the groundwork for describing Extreme BI: a combination of Exadata and OBIEE delivered with an Agile spirit. I discussed that the usual approach to Agile data warehousing is not Agile at all due to the violation of its main principle: working software delivered iteratively. If you haven&#8217;t already deduced [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a title="Agile Data Warehousing with Exadata and OBIEE: Introduction" href="http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/" target="_blank">previous post</a>, I laid the groundwork for describing Extreme BI: a combination of Exadata and OBIEE delivered with an Agile spirit. I discussed that the usual approach to Agile data warehousing is not Agile at all due to the violation of its main principle: working software delivered iteratively.</p>
<p>If you haven&#8217;t already deduced from my first post &#8212; or if you haven&#8217;t already seen me speak on this topic &#8212; what I am recommending is bypassing, either temporarily or permanently, the inhibitors specific to data warehousing projects which limit our ability to deliver working software quickly. Specifically, I&#8217;m recommending that we wait to build and populate physical star schemas until a later phase, if at all. Remember the two reasons that we build dimensional models: model simplicity and performance. With our Extreme BI solution, we have tools to counter both of those reasons. We have OBIEE 11g, with a rich metadata layer that presents our underlying data model, even if it is transactional, as a star schema to the end user. This removes our dependency on a simple physical model to provide a simple logical model to end users. We also have Exadata, which delivers world-class performance against any type of model, and can make up for the performance advantage that star schemas normally provide. With these tools at our disposal, we can postpone the long process of building dimensional models, at least for the first few iterations. This is the only way to get working software in front of the end user in a single iteration, and, as I will argue, this is the best way to collaborate with an end user and deliver the content they are expecting.</p>
<p>Of the puzzle pieces we need to deliver this model, the first is the <a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/058925.pdf" target="_blank">Oracle Next-Generation Reference DW Architecture</a> (we need an acronym for that), which Mark has already written about in-depth <a title="Drilling Down in the Oracle Next-Generation Reference DW Architecture" href="http://www.rittmanmead.com/2009/07/drilling-down-in-the-oracle-next-generation-reference-dw-architecture/" target="_blank">here</a>. As you browse through this post, pay special attention to his formulation of the foundation layer, which is the most important layer for delivering Extreme BI.</p>
<div id="attachment_9672" class="wp-caption aligncenter" style="width: 673px"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/next-gen.png"><img class="size-large wp-image-9672    " src="http://www.rittmanmead.com/wp-content/uploads/2011/12/next-gen-1024x627.png" alt="" width="663" height="407" /></a><p class="wp-caption-text">Oracle Next-Generation Reference DW Architecture</p></div>
<h2>Foundation Layer</h2>
<p>This is our &#8220;process-neutral&#8221; layer, which means simply that it isn&#8217;t imbued with requirements about what users want and how they want it. Instead, the foundation layer has one job and one job only: tracking what happened in our source systems. Typically, the foundation layer logical model looks identical to the source systems, except that we have a few additional metadata columns on each record such as commit timestamps and Oracle Database system change numbers (SCNs). There are other, more complex solutions for modeling the foundation layer when the 3NF from the source system or systems is not sufficient, such as <a title="Data Vault Modeling" href="http://en.wikipedia.org/wiki/Data_Vault_Modeling" target="_blank">data vault</a>. Our foundation layer is generally &#8220;insert-only&#8221;, meaning we track all history so that we are insulated from changing user requirements in the near and distant futures.</p>
<p><strong>UPDATE: </strong> Kent Graziano, a major data vault evangelist, has started <a title="Oracle Data Warrior" href="http://kentgraziano.com/" target="_blank">blogging</a>. Perhaps with some pressure from the public, we could &#8220;encourage&#8221; him to blog on what data vault would look like in a standard foundation layer.</p>
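<p>To make the &#8220;insert-only&#8221; idea concrete, here is a small Python sketch of a foundation table that only ever appends change records and can rebuild the source state as of any SCN. It is an illustration under assumptions, not a description of any Oracle feature: the class names, the single-letter operation codes and the sample order rows are all hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    """One captured source change plus its metadata columns (hypothetical)."""
    scn: int         # system change number of the source commit
    operation: str   # "I" insert, "U" update, "D" delete
    order_id: int    # source primary key
    status: str      # source column value after the change


class FoundationTable:
    """Append-only store: history is never updated or deleted."""

    def __init__(self):
        self._rows = []

    def apply(self, change):
        # Every source change, including a delete, becomes a new row.
        self._rows.append(change)

    def row_count(self):
        return len(self._rows)

    def as_of(self, scn):
        """Rebuild the source table as it stood at the given SCN."""
        state = {}
        for row in sorted(self._rows, key=lambda r: r.scn):
            if row.scn > scn:
                break
            if row.operation == "D":
                state.pop(row.order_id, None)
            else:
                state[row.order_id] = row.status
        return state


foundation = FoundationTable()
for change in [
    ChangeRecord(100, "I", 1, "NEW"),
    ChangeRecord(110, "U", 1, "SHIPPED"),
    ChangeRecord(120, "D", 1, "SHIPPED"),
]:
    foundation.apply(change)

print(foundation.as_of(105))
print(foundation.as_of(115))
print(foundation.as_of(125))
```

<p>Because nothing is overwritten, a requirement that surfaces later (&#8220;what did this order look like before it shipped?&#8221;) can still be answered from the rows already captured.</p>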
<h2>Capturing Change</h2>
<p>Also required for delivering Extreme BI is a process for capturing change from the source systems and rapidly applying it to the foundation layer, which I described briefly in one of my posts on <a title="Real-time BI: An Introduction" href="http://www.rittmanmead.com/2011/05/real-time-bi-an-introduction/" target="_blank">real-time data warehousing</a>. We have a bit of a tug-of-war at this point between Oracle Streams and Oracle GoldenGate. GoldenGate is the stated platform of the future because it’s a simple, flexible, powerful and resilient replication technology. However, it does not yet have powerful change data capture functionality specific to data warehouses, such as easy subscriptions to raw changed data, or support for multiple subscription groups. You can, in general, work around these limitations using the INSERTALLRECORDS parameter and some custom code (perhaps fodder for a future blog post). Regardless of the technology, Extreme BI requires a process for capturing and applying source system changes quickly and efficiently to the foundation layer on the Exadata Database Machine.</p>
<h2>Extreme Performance</h2>
<p>Although I&#8217;ll drill into more detail in the next post, the reason we need Extreme Performance is to offset the performance gains we usually get from star schemas, since we won&#8217;t be building those, at least not in the initial iterations. Although Rittman Mead has deployed a variant of this methodology sans Exadata using a powerful Oracle Database RAC instead, there is no substitute for Exadata. Although the hardware on the Database Machine is superb, it&#8217;s really the software that is a game-changer. The most extraordinary features include <a title="Smart Scans Meet Storage Indexes" href="http://www.oracle.com/technetwork/issue-archive/2011/11-may/o31exadata-354069.html" target="_blank">smart scan and storage indexes</a>, as well as hybrid columnar compression, which Mark talks about <a title="Hybrid Columnar Compression in Oracle Exadata v2" href="http://www.rittmanmead.com/2010/01/hybrid-columnar-compression-in-oracle-exadata-v2/" target="_blank">here</a> and references an article by Arup Nanda found <a title="Compressing Columns" href="http://www.oracle.com/technetwork/issue-archive/2010/10-jan/o10compression-082302.html" target="_blank">here</a>. For years now, with standard Oracle data warehouses, we&#8217;ve pushed the architecture to its limits trying to reduce IO contention at the cost of CPU utilization, using database features such as partitioning, parallel query and basic block compression. But Exadata Storage can eliminate the IO boogeyman using combinations of these standard features plus the Exadata-only features to elevate the query performance against 3NF schemas on par with traditional star schemas and beyond.</p>
<p style="text-align: center"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/Terabytes-to-Gigabytes.png"><img class="aligncenter size-full wp-image-9739" src="http://www.rittmanmead.com/wp-content/uploads/2011/12/Terabytes-to-Gigabytes.png" alt="" width="617" height="352" /></a></p>
<h2>Extreme Metadata</h2>
<p>Extreme performance is only half the battle&#8230; we also need Extreme Metadata to provide us the proper level of abstraction so that report and dashboard developers still have a simplistic model to report against. This is what OBIEE 11g brings to the table. We have also delivered a variant of this methodology without OBIEE, using Cognos instead, which has a metadata layer called <a title="Framework Manager" href="http://www.ironsidegroup.com/2010/07/08/best-practices-in-cognos-8-framework-manager-model-design/" target="_blank">Framework Manager</a>. As with Exadata, the BI Server has no equal in the metadata department, so my advice&#8230; don&#8217;t substitute ingredients.</p>
<p>Consider, for a moment, the evolution of dimensional modeling in deploying a data warehouse. Not too long ago, we had to solve most data warehousing issues with the logical model because BI tools were simplistic. Generally&#8230; there was no abstraction of the physical into the logical, unless you categorize the renaming of columns as abstraction. As these tools evolved, we often found ourselves with a choice: solve some user need in the logical model, or solve it with the feature set of the BI tool. The use of aggregation in data warehousing is a perfect example of this evolution. Designing aggregate tables used to be just another part of the logical modeling exercise, and the aggregates were generally represented in the published data model for the EDW. But now, building aggregates is more of a technical implementation than a logical one, as either the BI Server or the Oracle Database can handle the transparent navigation to aggregate tables.</p>
<p>The metadata that OBIEE provides adds two necessary features for Agile delivery. First, we are able to report against complex transactional schemas, but still expose those schemas as simplified dimensional models. This allows us to bypass the complex ETL process at least initially so that we can get new subject areas into the users&#8217; hands in a single iteration. Second, OBIEE&#8217;s capability to map multiple Logical Table Sources (LTSs) for the same logical table makes it easy to modify &#8212; or &#8220;remap&#8221; &#8212; the source of our logical tables over time. So, in later iterations, if we decide that it&#8217;s necessary to embark upon complex ETL processes to complete user stories, we can do this in the metadata layer without affecting our reports and dashboards, or changing the logical model that report developers are used to seeing.</p>
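<p>The remapping idea can be pictured with a toy model in Python. This is not how the BI Server is implemented or configured: the classes, the numeric priority attribute and the column names are all made up for illustration. The point is only that the logical table keeps its name and columns while the physical source behind it changes.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSource:
    """A physical source mapped to a logical table (toy model)."""
    name: str
    columns: frozenset
    priority: int   # lower is preferred; a hypothetical attribute


class LogicalTable:
    def __init__(self, name):
        self.name = name
        self.sources = []

    def add_source(self, source):
        self.sources.append(source)

    def pick_source(self, requested):
        """Choose the preferred source that maps every requested column."""
        wanted = set(requested)
        candidates = [s for s in self.sources if wanted.issubset(s.columns)]
        if not candidates:
            raise LookupError(f"no source maps {sorted(wanted)}")
        return min(candidates, key=lambda s: s.priority)


customer = LogicalTable("Customer")

# Early iterations: the logical table is mapped to transactional tables only.
customer.add_source(
    TableSource("transactional", frozenset({"name", "city", "credit_limit"}), 2)
)
before = customer.pick_source(["name", "city"]).name

# A later ETL iteration adds a physical dimension as a second source.
customer.add_source(
    TableSource("etl_dimension", frozenset({"name", "city"}), 1)
)
after = customer.pick_source(["name", "city"]).name
fallback = customer.pick_source(["name", "credit_limit"]).name

print(before, after, fallback)
```

<p>The report asks for the same logical columns before and after the ETL iteration; only the answer to &#8220;which source?&#8221; changes, and columns the new dimension does not yet carry still resolve to the transactional source.</p>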
<div id="attachment_9754" class="wp-caption aligncenter" style="width: 612px"><a href="http://www.rittmanmead.com/wp-content/uploads/2011/12/semantic-model.031.png"><img class="size-full wp-image-9754 " src="http://www.rittmanmead.com/wp-content/uploads/2011/12/semantic-model.031.png" alt="" width="602" height="378" /></a><p class="wp-caption-text">Flow of Data Through the Three-Layer Semantic Model</p></div>
<h2>More to Come&#8230;</h2>
<p>In the next post, I&#8217;ll describe what I call the Model-Driven Iteration, where we use OBIEE against the foundation layer to expose new subject areas in a single iteration. After that, I&#8217;ll describe ETL Iterations, where we transform a portion of our model iteratively using ETL tools such as ODI, OWB or Informatica. Finally, I&#8217;ll describe what I call Combined Iterations, where both Model-Driven activity and ETL activity are going on at the same time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Agile Data Warehousing with Exadata and OBIEE: Introduction</title>
		<link>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/</link>
		<comments>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/#comments</comments>
		<pubDate>Wed, 21 Dec 2011 15:48:55 +0000</pubDate>
		<dc:creator>Stewart Bryson</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Dimensional Modelling]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle BI Suite EE]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=9597</guid>
		<description><![CDATA[Over the last year, I&#8217;ve been speaking at conferences on one subject more than any other: Agile Data Warehousing with Exadata and OBIEE. Although I&#8217;ve been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last year, I&#8217;ve been speaking at conferences on one subject more than any other: Agile Data Warehousing with Exadata and OBIEE. Although I&#8217;ve been busy with client work and growing the US business, I realize I need to dedicate more time to blogging again, and this seemed like the logical subject to take up. So I&#8217;ll use the next few blog posts to make my case for what I like to call Extreme BI: an Agile approach to data warehousing using the combination of Extreme Performance and Extreme Metadata.</p>
<p>In a standard data warehouse implementation, whether we are walking in the Inmon or Kimball camps, some portion of our data model will be dimensional in nature; a star schema with facts and dimensions. So let me pose a question, which I think will lend itself well to diving into the Extreme BI discussion: Why do we build dimensional models? The first reason is simplicity. We want to model our reporting structures in a way that makes sense to the business user. The standard OLTP data model that takes two of the four walls in the conference room to display is just never going to make sense to your average business user. At the end of a logical modeling exercise, I expect the end-user to have a look at a completed dimensional model and say: &#8220;Yep&#8230; that&#8217;s our business alright&#8221;. The second reason we build dimensional models is for performance. Denormalizing highly complex transactional models into simplified star schemas generally produces tremendous performance gains.</p>
<p>So my follow-up question: can the combination of Exadata and OBIEE, or Extreme BI, <em>actually change the way we deliver projects? </em>We&#8217;ve all seen the Exadata performance numbers that Oracle publishes, and I can tell you first hand the performance is impressive. Can this Extreme Performance combined with the Extreme Metadata that OBIEE provides give us a more compelling case for delivering data warehouses using Agile methodologies?</p>
<p>To start with, I&#8217;d like to paint a picture of what the typical waterfall data warehousing project looks like. The tasks we usually have to complete, in order, are the following:</p>
<ol>
<li>User interviews</li>
<li>Construct requirement documents</li>
<li>Create logical data model</li>
<li>SQL prototyping of source transactional models</li>
<li>Document source-to-target mappings</li>
<li>ETL development</li>
<li>Front-end development (analyses and dashboards)</li>
<li>Performance tuning</li>
</ol>
<p>Raise your hand if this looks familiar. We would have to go through all these steps, which could take months, before end users can see the fruits of our labor. To mitigate this scenario, organizations will attempt to deliver data warehouses using &#8220;Agile&#8221; methodologies. What this usually means, from my experience, is a simple repackaging of the same waterfall project plan into &#8220;iterations&#8221; or &#8220;sprints&#8221;, so that the project can be delivered iteratively. So the process might look like the following:</p>
<ol>
<li>Iteration 1: Interviews and user requirements</li>
<li>Iteration 2: Logical modeling</li>
<li>Iteration 3: ETL Development</li>
<li>Iteration 4: Front-end development</li>
</ol>
<p>But this, ladies and gentlemen, is not Agile. To get an understanding of what lies at the heart of Agile development, we need to look no further than the <a title="The Agile Manifesto" href="http://agilemanifesto.org/" target="_blank">Agile Manifesto</a>, or the history of the <a title="The Agile Movement" href="http://en.wikipedia.org/wiki/Agile_software_development" target="_blank">Agile Movement</a>. When examining the different methodologies, there is one major theme that permeates all of them: working software delivered iteratively. It&#8217;s not enough to simply deliver the same old waterfall methodology in &#8220;sprints&#8221; or &#8220;iterations&#8221;, because, at the end of those iterations, we don&#8217;t have any working software&#8230; software that end users can actually use to do their jobs better or make better decisions. In the example above, we still require four iterations before we get any usable content. It doesn&#8217;t matter if we&#8217;ve written some complex ETL to load a fact table if the end user doesn&#8217;t have a working dashboard to go along with it.</p>
<p>To apply the Agile Manifesto to data warehouse delivery, the following key elements are required for us to deliver with a true Agile spirit:</p>
<ol>
<li>User stories instead of requirements documents: a user asks for particular content through a narrative process, and includes in that story whatever process they currently use to generate that content.</li>
<li>Time-boxed iterations: iterations always have a standard length, and we choose one or more user stories to complete in that iteration.</li>
<li>Rework is part of the game: there aren&#8217;t any missed requirements&#8230; only those that haven&#8217;t been addressed yet.</li>
</ol>
<p>I&#8217;ve been conscious not to prescribe any distinct Agile methodology, though I can&#8217;t help using more Scrum-like concepts in this formulation. However, I think this list is generic enough to apply to most methodologies. Over the next few posts, I&#8217;ll discuss the necessary puzzle pieces to engage in Extreme BI, as well as how we might implement new subject area content in a single iteration. Additionally, I&#8217;ll discuss how these implementations might be reworked, or &#8220;refactored&#8221;, over several iterations to produce data warehouses that respond to user stories: what users want and when they want it.</p>
<p><strong>Follow-up Posts</strong></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces" href="http://www.rittmanmead.com/2011/12/agile-exadata-obiee-puzzle-pieces/" target="_blank">Agile Data Warehousing with Exadata and OBIEE: Puzzle Pieces</a></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration" href="http://www.rittmanmead.com/2012/01/agile-exadata-obiee-model-driven/">Agile Data Warehousing with Exadata and OBIEE: Model-Driven Iteration</a></p>
<p><a title="Agile Data Warehousing with Exadata and OBIEE: ETL Iteration" href="http://www.rittmanmead.com/2012/01/agile-exadata-obiee-etl/" target="_blank">Agile Data Warehousing with Exadata and OBIEE: ETL Iteration</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/12/agile-data-warehousing-with-exadata-and-obiee-introduction/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The One Mapping Paradigm</title>
		<link>http://www.rittmanmead.com/2011/04/the-one-mapping-paradigm/</link>
		<comments>http://www.rittmanmead.com/2011/04/the-one-mapping-paradigm/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 05:00:59 +0000</pubDate>
		<dc:creator>Jon Mead</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Oracle Data Integrator]]></category>
		<category><![CDATA[Oracle Warehouse Builder]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=7746</guid>
		<description><![CDATA[Here at Rittman Mead we have been working on some new methodology and design patterns for ETL. We have long realised that the bottleneck in Business Intelligence and Data Warehousing projects is ETL, so we have been prototyping new techniques for approaching this and trialling them at client sites. Taking a step back and looking [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: left">Here at Rittman Mead we have been working on some new methodology and design patterns for ETL. We have long realised that the bottleneck in Business Intelligence and Data Warehousing projects is ETL, so we have been prototyping new techniques for approaching this and trialling them at client sites.</p>
<p style="text-align: left">Taking a step back and looking at the ETL process we felt there was a lot of complexity unnecessarily created by decomposing the process into a number of program units or mappings. In our view this process creates the following problems:</p>
<ul>
<li>A large amount of processing time is wasted on the inter-communication of these mappings.</li>
<li>Unnecessary temporary storage objects are created and populated in the database.</li>
<li>A separate technology is required to orchestrate all the mappings.</li>
<li>It encourages multiple developers to work on the ETL process, thereby increasing the risk of miscommunication and misaligned interfaces.</li>
</ul>
<p>In response to this Rittman Mead have developed the One Mapping Paradigm. We believe that you should put all your ETL code into one mapping, and as such have called this approach the One Mapping Paradigm (OMP). The goal of this approach is to encapsulate your entire ETL routine into one mapping or program unit.</p>
<p>We feel this approach adheres to some of the fundamental tenets of software development: encapsulation (everything is in the one mapping) and decoupling (there are no external dependencies). Further, it completely negates the need for that old bugbear, re-usability: you no longer even need to re-use code, just use it once, all in the same mapping. Most importantly, OMP also provides a reduction in development costs: you now only need one developer.</p>
<p>Our extensive research has also developed a series of steps you can follow to deliver your One Mapping. You should note that the One Mapping that OMP generates will be extremely complex; only by following these steps can you address that complexity.</p>
<p>OMP follows a black hole development approach where it is crucial for the developer to do as much development as possible without any outside interference from either peers or the business. This allows the developer to focus solely on the development task in hand, which is a must when developing extremely complex code. It is also essential that the developer is allowed to proceed as far through the process as possible without stopping for other distracting activities like testing. To illustrate the OMP, I have built the following example using Oracle Warehouse Builder.</p>
<ul>
<li><strong>Step 1:</strong> source objects &#8211; create a new mapping and drag all your source objects onto the canvas &#8211; it is important to arrange these in a straight line on the left hand side of the canvas.</li>
<li><strong>Step 2:</strong> add all your join operators to combine the data. A couple of tips here: (1) add predicates into the join conditions to avoid using filter operators; (2) keep the data transition lines as straight as possible for performance reasons.</li>
<li><strong>Step 3:</strong> add any expression or transformational operators required &#8211; these should really be added to the middle of the canvas.</li>
<li><strong>Step 4: </strong>add all your target tables &#8211; these are added to the right hand side of your canvas. You are in the home straight now, but you may find this the trickiest part and we recommend using at least a 29&#8243; monitor to complete this process.</li>
<li><strong>Step 5:</strong> unit test &#8211; note there is no orchestration or integration required, as you only have One Mapping.</li>
<li><strong>Step 6:</strong> release to production &#8211; you can just release your mapping straight into production, overwriting whatever was there before. There is no system or integration testing required as there is only one piece of code. UAT is further bypassed as your unit testing verifies whether the entire ETL process works or not.</li>
</ul>
<p>We are looking for beta testers for this concept, so if you want to try the OMP for your ETL code, please contact me at omp@rittmanmead.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2011/04/the-one-mapping-paradigm/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>To BICC Or Not To BICC (Part 3)</title>
		<link>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-3/</link>
		<comments>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-3/#comments</comments>
		<pubDate>Mon, 21 Dec 2009 09:00:35 +0000</pubDate>
		<dc:creator>Mike Vickers</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=3816</guid>
		<description><![CDATA[In this, my final post on the BICC subject (well, for now at least&#8230;), I take a look at the softer side of BI management&#8230; The Balancing Act Put simply, if BI is delivered well it should generate two things&#8230;..firstly, answers [no pun intended] to the questions that the business know need to be answered&#8230;.but [...]]]></description>
			<content:encoded><![CDATA[<p>In this, my final post on the BICC subject (well, for now at least&#8230;), I take a look at the softer side of BI management&#8230;</p>
<p><strong><em>The Balancing Act</em></strong></p>
<p>Put simply, if BI is delivered well it should generate two things&#8230;..firstly, <em>answers</em> [no pun intended] to the questions that the business know need to be answered&#8230;.but then quickly followed by <em>questions</em> (the silent tidal wave of all the things that the business always wanted to do/know/ask, but just never had the capability). The up side to this is that, given a data model that withstands the test, the toolsets allow end-users to deal with the tidal wave themselves. The down side is that this runs the risk of opening up some sort of Pandora’s box&#8230;..badly written reports, measures calculated differently by different people, a back catalog of reports written but no longer used, inefficiency, inefficiency, inefficiency&#8230; The whole thing just needs managing.</p>
<p>And this is where the tricky balancing act arises. For me, it is essential that the BICC takes responsibility for implementing a set of policies, processes, roles and responsibilities, all moulded into a Governance Framework that lays out the ‘rules of engagement’. Policy should cover the main building blocks which will feed into the solution design (such as security and access, data retention, SLAs, metric ownership, report accreditation etc.).</p>
<p>The processes required to underpin your BI usage will naturally fall into three main buckets: those that are internal to the BICC; those that relate to end-users and their relationship with the BICC; and those that relate to how the BICC interacts with IT in the delivery of ongoing change. Again, the processes can’t really be prescribed but should be drawn up based on what is going to work and be adopted (factoring in BICC scope, resource profiles, organisational culture and so on). However, the types of area that might be considered are, in the first bucket, how data quality is monitored, usage tracking and catalog housekeeping; in the second bucket, report creation, change requests, 1st line support and master data correction; and in the third bucket, work take-on, operational schedule management, 2nd/3rd line support and so on.</p>
<p>Further to this, it is essential that the users of the system know what they are able to do and how they should be doing things&#8230;..in other words, they need to be trained and understand best practice!  And all I’ll say is that training needs to be seen as something more than a one-time session, scheduled at some point just before the system is implemented.  To my mind, the single most important difference between BI solutions and other transactional systems is that BI solutions are organic &#8211; they grow and change shape over time and along with the business.  For this reason, when the BICC thinks about its training strategy, it should be thinking about how it communicates out change and best practice in order to maintain effectiveness.  And whilst we&#8217;re on the subject of communication&#8230;.</p>
<p><strong><em>Getting Passionate</em></strong></p>
<p>I believe that the best BI solutions (again technology set to one side) are where the business has a level of enthusiasm, whether natural or developed, for information and what can be done with it.</p>
<p>Therefore, and in addition to all the other strands of activity that might fall under your BICC, it is incumbent on the BICC to <em>evangelise</em>. To advocate the usage of information for the benefit of the business. And to fight the corner for investment in BI (regardless of whether the BICC actually <em>owns</em> a budget or not). If the BI solution is well defined, designed and delivered, it will start its life highly thought of by the business and will be used heavily. If it is not then protected, guided and grown in the right way, it runs the risk of slowly becoming delinquent: less useful, less usable and returning a reducing benefit.</p>
<p>So, one of the key roles of the BICC has to be to communicate.  It has to develop methods of communicating out strategy, policy, process, change and best practice.  It has to develop methods of capturing feedback, to understand what works and what doesn’t, what is being used and what isn’t, what new data is needed and what existing data isn’t.  It has to develop methods of communication which ensure that change is defined, designed, delivered and managed efficiently <strong>and</strong> within the timeframes that the business needs.  Finally the BICC needs to develop methods of communicating with the outside world.  With solution vendors, so that the technology roadmaps can continually be compared and re-evaluated.  With 3rd Party BI partners, who might be engaged for implementation, support or ongoing consultancy.  With other, like-minded user organisations, in order to support innovation.  And, if the strategy is such, with suppliers, customers and others in the value chain.</p>
<p>On the bright side, there are usually people scattered around your organisation who are natural ‘data’ people.  It’s always a good idea to engage these people in what you are doing, not least as they may feel the most threatened by your BICC organisation.  They should be seen as potential &#8216;BI champions&#8217; and future advocates of BI within their given departments and may prove a useful vehicle for communicating out the BI message.</p>
<p><strong><em>Back To The Top</em></strong></p>
<p>To summarise, BI technology comes in all sorts of shapes and sizes, but the organisations using it come in infinitely more shapes and sizes. I would argue that, regardless of shape or size, to make best use of BI technology, a mechanism is required to manage its usage and to coordinate any associated efforts and decisions. However, it is important to recognise that there is a continuum between a full-blown BICC on one hand, and nothing on the other. For any <em>BICC-like</em> initiative to be successful, the specifics need to be guided by the size, shape, culture and objectives of your organisation. If you are competing in a fast moving environment, where information is the life-blood of your business, then your BI is likely to need greater coordination and may, therefore, warrant a heavy BICC investment. If your organisation sees reporting as a necessity rather than a differentiator, then the management and coordination will be completely different.</p>
<p>The historical challenges for the acceptance of the ‘BICC ideal’ revolved largely around justification of the costs, especially where the benefits tend to be seen as indirect and somewhat intangible. However, an ever increasing number of organisations are looking towards their information asset as a source of competitive advantage and, as a result, are seeing the value not just in BI per se, but also in the coordination of their BI efforts.</p>
<p>So, if you are embarking on a strategic BI initiative, then it’s important that you look beyond the technology and the development lifecycle.  Start thinking in terms of how your BI is going to be used and how it is going to be managed as it evolves over time.  And, if you have implemented BI but seem to be seeing a reducing benefit over time, then it might be worth looking at whether your BI is becoming delinquent.  If it is, it might be time to do something about it.</p>
<p>If you have already implemented, are in the middle of implementing, or are planning to implement a BICC-style organisation, I’d love to hear from you to understand some of your experiences.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>To BICC Or Not To BICC (Part 2)</title>
		<link>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-2/</link>
		<comments>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-2/#comments</comments>
		<pubDate>Wed, 16 Dec 2009 09:00:14 +0000</pubDate>
		<dc:creator>Mike Vickers</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=3810</guid>
		<description><![CDATA[In my last post, I opened up some of my thoughts about how much BICC activity is happening in the real world.  Continuing on from there, I&#8217;m now going to turn my attention to some of the important things that I think need to be addressed in the management of BI (whether under a BICC umbrella or [...]]]></description>
			<content:encoded><![CDATA[<p>In my last post, I opened up some of my thoughts about how much BICC activity is happening in the real world.  Continuing on from there, I&#8217;m now going to turn my attention to some of the important things that I think need to be addressed in the management of BI (whether under a BICC umbrella or not).</p>
<p><strong><em>BI Strategy</em></strong></p>
<p>The starting point for any non-tactical BI initiative should be the development of a BI strategy map.  And it’s worth noting that this statement holds, regardless of the circumstance&#8230;whether a greenfield project, a migration to a new, pre-selected technology stack or where heritage applications are remaining in place.  That’s to say &#8211; the BI Strategy should be technology agnostic, not technology driven.  The form that your strategy takes is likely to be heavily influenced by the type of BI organisation that is planned or already in place.  Where a full-blown BICC is being built, then it is likely that the strategy will be a more formal mission statement.  Where something more subtle is being planned (say, a sub-team within an existing department), then we may be talking about some slide-ware which is used to disseminate the message. </p>
<p>Regardless of form, the BI Strategy should be all about laying down the foundations for:</p>
<p>➡  How information will be utilised by the business to drive competitive advantage</p>
<p>➡  How the business is going to make best use of its information and technology resources</p>
<p>➡  What ‘genetic makeup’ the BICC will have (i.e. will it be business driven, IT driven or, most realistically, blended?)</p>
<p>➡  How the BICC is going to enable the business to leverage maximum value from the investments made in BI</p>
<p>➡  How strategic delivery will be resourced (internally, externally or blended)</p>
<p>Clearly, the BI Strategy must be aligned with the organisation’s wider strategic objectives, and it will often help in communicating the message if a) a high level vision of the delivery roadmap and b) the BICC’s roles and responsibilities can also be developed at this stage.</p>
<p> <strong><em>Information Delivery</em></strong></p>
<p>The second thing to be recognised is the importance of delivering information in a consistent, coherent and accurate way.  To me, this is central in building the business’ confidence in the BI they use.  Confidence will increase usage which will result in reliance and then dependency.  So, ultimately, ‘information delivery’ will factor heavily in the success or failure of the BICC itself.  Unfortunately, it is also probably the trickiest thing for the BICC to achieve.  And here’s why.</p>
<p>Under the ‘Information Delivery’ umbrella I would group the following activities: Solution &amp; Data Architecture; Data Quality Management; Data Stewardship; Metadata Management; Presentation &amp; User Experience. In a nutshell, then, this is where the BICC gets its most technical and so (if predominantly a business-led team) it is where the BICC will be least knowledgeable and least self-sufficient. This is why the ‘genetic makeup’ of the BICC, as I have described it previously, is so important. It also seems to be the area of most debate&#8230;.should the BICC be 100% business focussed? Should it be an extension of the IT department? Should it be a blend and, if so, what is the balance of power? The answer will inevitably vary from organisation to organisation (and, again, there is probably no “one size fits all” answer), but my view aligns with the consensus &#8211; that a BICC should be a blended team, but business led. As a minimum, senior people to represent solution and data architecture should be incorporated (unless you are lucky enough to have someone who can do both!). Such a composition should help to ‘sell’ the BICC at senior levels, smooth its inception and also ensure that the right architectural decisions are made along the way. As vendor solutions and product portfolios continue to grow, the number of ways of ‘skinning the cat’ can increase exponentially. A bad ‘technology’ decision could constrain future opportunities, whilst a decision based purely on the commercials will likely result in the same outcome. Decisions in BI, therefore, need to be balanced and require the representation of both business and technology expertise.</p>
<p>Data architecture is essential in the information delivery equation.  A data warehouse with missing data will not be capable of supporting user requirements.  Even where there is built in latency to allow for those unexpected requests, if the data warehouse is poorly modeled your users will be equally constrained and frustrated.  The same is true of your metadata design &#8211; build your semantic layer to support the reporting requirements alone and your users will not be able to easily fulfill their ad-hoc analyses.  DW and Metadata design need to go hand in hand and, if not, will result in an increased overhead for IT and BICC respectively.  Data quality is important in protecting the integrity of the information being consumed and user experience will have a large say in how comfortable the user community feel about actually using the BI.</p>
<p>Putting all this together makes ‘Information Delivery’ certainly the most tangible and probably the most complex aspect of the BICC’s remit.</p>
<p>Next time, I’ll look at some of the less tangible aspects, maintaining some level of control and generating passion for information.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>To BICC Or Not To BICC (Part 1)</title>
		<link>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-1/</link>
		<comments>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-1/#comments</comments>
		<pubDate>Fri, 11 Dec 2009 22:38:32 +0000</pubDate>
		<dc:creator>Mike Vickers</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/?p=3803</guid>
		<description><![CDATA[If your organisation has been using Business Intelligence in a strategic way for some time, then you will have already seen your BI solution evolve, grow and change from its original incarnation.  You will probably also have an appreciation of the importance of somehow taking ownership of this process of evolution.  Maybe you understood this [...]]]></description>
			<content:encoded><![CDATA[<p>If your organisation has been using Business Intelligence in a strategic way for some time, then you will have already seen your BI solution evolve, grow and change from its original incarnation.  You will probably also have an appreciation of the importance of somehow taking ownership of this process of evolution.  Maybe you understood this before your project implementation or maybe you have learnt through experience.  If you are about to embark on a strategic BI initiative, then it’s worth thinking about your project delivery as the start of the journey, not the end.  And if you think this way, you’ll need to think about how you ensure that you maintain the maximum value of your solution as the needs and direction of your business change over time.  </p>
<p>In this series of blog posts (my first since joining Rittman Mead in October), I’m going to take a look at some of the ways in which BI can be managed effectively, by considering the place of the Business Intelligence Competency Centre (BICC) in the real world.</p>
<p>Much has been written about BICCs over recent years and, although there are one or two variations on the theme, the general consensus as to the purpose of a BICC and the reasons why enterprises would implement a BICC are pretty well established. I’m not going to pick over the debate; suffice it to say that my crystallisation would be this:</p>
<p>A BICC, or its equivalent, should exist to achieve four key fundamentals:</p>
<p>➡  To architect and then champion the enterprise BI Strategy</p>
<p>➡  To ensure that information is delivered to the business consistently, coherently and accurately</p>
<p>➡  To manage the tricky balancing act between end-user self-sufficiency, freedom and flexibility on the one hand and business control and alignment of efforts on the other</p>
<p>➡  To generate and maintain a level of passion for information within the business&#8230;..or (as I appreciate that this may be a tall order!) at least a level of awareness and interest. </p>
<p>Beneath each of these pillars sits a number of key elements and core activities, which I’ll address in subsequent postings over the next few days. But before diving in, there’s an important question that I think needs addressing&#8230;&#8230;</p>
<p><strong><em>So, Where Are All These BICCs, Anyway?</em></strong></p>
<p>It occurs to me that despite all of the hype and the discussion underpinning BICC theory, the anecdotal evidence suggests that the uptake and establishment of BICCs has, at very best, been patchy. There is certainly no obvious correlation between the uptake of enterprise BI and the implementation of BICCs&#8230;..the “every good BI project must have a BICC” maxim definitely doesn’t hold true. And talking to people on the subject, it becomes obvious that a fair number of people are aware of the BICC concept, but know of few who have implemented (or at least attempted to implement) one.</p>
<p>But I wonder whether this view is valid or not. Based on my experiences, I’d suggest that there is already a significant amount of <em>BICC-like</em> activity happening in organisations&#8230;.it just isn’t being driven under a ‘BICC banner’ and invariably does not adhere strictly to the theory. In fact, I suspect that a lot of this activity is happening without consciousness of the BICC phrase &#8211; it’s just being done because it makes sense. Indeed, I have been in organisations where the suggestion of something called a ‘Business Intelligence Competency Centre’ would have gone down like a lead balloon. However, talk to people about the principles, imperatives and benefits behind the BICC and I would receive near-universal agreement. And this is principally down to the fact that, for any business serious about using information to drive competitive advantage, the benefits case for a BICC is, in the main, very compelling.</p>
<p>I suspect that the biggest challenge facing most historical BICC initiatives has been in the quantification of their value proposition. Aside from efficiency gains arising from the delivery of BI systems, the benefits are largely intangible (how you put a figure against a benefits line of “Making Better Decisions” may have to wait for a future blog!). And when compared to the costs associated with setting up and then running a BICC, in most cases, the equation just couldn’t be balanced. It invariably depends on the business understanding the value that can be unlocked through its information assets.</p>
<p>Importantly, in my example above, the organisation had a near unstoppable thirst for information and also wore the psychological scars from a history of loosely controlled information (you know the story&#8230;..a proliferation of MS Access, departmental reports which were irreconcilable with each other, let alone back to the core systems, board meeting disputes over whose sales figure was correct etc. etc.)</p>
<p>So, if businesses are undertaking <em>BICC-like</em> activities (whether structured or not, consciously or not, whether called a &#8216;BICC&#8217;, &#8216;MIS Team&#8217;, &#8216;Group Reporting function&#8217; or whatever), what are the things that they are trying to achieve?  And if you are about to embark on a strategic BI initiative, what are the non-technology things that you really should be thinking about? </p>
<p>In my next post, I’ll return to look at the four fundamental areas in more depth, starting with <strong><em>BI Strategy</em></strong> and <strong><em>Information Delivery</em></strong>&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2009/12/to-bicc-or-not-to-bicc-part-1/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>What is Methodology Governance?</title>
		<link>http://www.rittmanmead.com/2009/06/what-is-methodology-governance/</link>
		<comments>http://www.rittmanmead.com/2009/06/what-is-methodology-governance/#comments</comments>
		<pubDate>Mon, 01 Jun 2009 15:03:12 +0000</pubDate>
		<dc:creator>Jennifer Albu</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/2009/06/01/what-is-methodology-governance/</guid>
		<description><![CDATA[After my last blog “Introducing the RittmanMead Delphi Methodology”, one of you posted the comment below. My answer ended up being so long that I decided to turn my response into my next blog. Here is the comment: “What about iterations within the Delphi 18 stages? Do you have a structure for incorporating feedback from [...]]]></description>
			<content:encoded><![CDATA[<p>After my last blog “Introducing the RittmanMead Delphi Methodology”, one of you posted the comment below. My answer ended up being so long that I decided to turn my response into my next blog. Here is the comment:</p>
<p>“What about iterations within the Delphi 18 stages? Do you have a structure for incorporating feedback from later stages into earlier stages i.e. learnings from the build phase being incorporated into and refining the design?”</p>
<p>What Damien was describing is really what governance is all about. Governance is how you go about improving your methodology.</p>
<p>Before you can get to that stage, you have to have a documented methodology and you have to use it. Sadly, not every project and/or IT shop can attest to talking the talk AND walking the walk.</p>
<p align="center"><img src="http://www.rittmanmead.com/wp-content/uploads/2009/06/cavemen2.jpg" alt="" /></p>
<p>Generally speaking, when an IT shop is operating without a well defined methodology, and it completes a project on time, it is due to heroic efforts and long hours on the part of the development team. </p>
<p>In this kind of environment, some of your best developers will get fed up and leave you to work in better-organized IT shops. Over time your development pool will degrade in quality and you will be less and less likely to bring projects in on time.</p>
<p>Sometimes the business gets so fed up that it outsources development, but that’s not going to fix anything if the team it outsources to doesn’t use a methodology either.</p>
<p>But let’s assume that you have a methodology and it is standard business process to use it on all your data warehouse projects. </p>
<p>The next step is for you to take measurements. This gets a bit tricky for us IT folks. It’s pretty easy to figure out that a widget factory should count how many widgets are produced/hour/day, but what should a data warehouse team be measured on? </p>
<p>Lines of code written/hour? Mappings deployed/week? </p>
<p>For data quality measurements – perhaps the number of reference data rows rejected/batch?</p>
<p>I was on a project once, where under pressure from the business a decision was made to skip the data profiling phase to “save time”. They ended up spending 18 months in UAT. There’s a new phrase for what happens to your project plan when you build a data warehouse without performing data profiling – they call it “Extract, Load and Explode”. </p>
<p>If the source data is really bad, IMHO it’s better for the company to spend this year’s budget fixing the data at source in the OLTP, and re-call the data warehouse team next year when the data is ready for prime time, the data stewards are empowered and a data governance policy is in place – but I digress. </p>
<p>Here are some industry standard warehouse measurements, but they are at a high level outside of the actual development processes:</p>
<ul>
<li>Do your data warehousing initiatives succeed more often than they fail?</li>
<li>Is there a successful data warehouse implementation within your company?</li>
<li>Is the successful data warehouse sustainable?</li>
<li>Does your company know how much it is spending on data warehousing?</li>
<li>Does your company have centralized IT groups?</li>
<li>Does your company have an enterprise-wide meta data repository that supports the data warehouse and the operational systems?</li>
<li>Has the quality of the data within your data warehouse improved over time?</li>
<li>Do your programmers understand their role in helping the company reach its goals?</li>
</ul>
<p>I think that a good DW development measurement might be hours in UAT per deliverable. If the design was really good, then theoretically the developers would build exactly the right thing, error free, the first time and the code and data should pass UAT and move into deployment quickly. </p>
<p>If the code is kicked back (fails UAT) that’s not a good thing. So the number of mappings that failed UAT or deployment would be good measurements as well. </p>
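<p>As a rough illustration of the two measurements above (hours in UAT per deliverable, and the share of deliverables kicked back), here is a minimal Python sketch. The Deliverable record, its field names and the sample figures are all made up for the example; a real team would pull these numbers from its own test and time-tracking logs.</p>
<pre><code class="language-python">from dataclasses import dataclass


# Hypothetical record of one deliverable's trip through UAT (field names are assumptions).
@dataclass
class Deliverable:
    name: str
    uat_hours: float   # hours testers spent on this deliverable in UAT
    failed_uat: bool   # True if it was kicked back at least once


def uat_metrics(deliverables):
    """Return average UAT hours per deliverable and the share kicked back."""
    count = len(deliverables)
    if count == 0:
        return {"hours_per_deliverable": 0.0, "failure_rate": 0.0}
    total_hours = sum(d.uat_hours for d in deliverables)
    failures = sum(1 for d in deliverables if d.failed_uat)
    return {
        "hours_per_deliverable": total_hours / count,
        "failure_rate": failures / count,
    }


# Made-up sample data for one project.
project = [
    Deliverable("load_customers", 6.0, False),
    Deliverable("load_orders", 14.0, True),
    Deliverable("sales_dashboard", 10.0, False),
    Deliverable("load_products", 2.0, False),
]
metrics = uat_metrics(project)
print(metrics)
</code></pre>
<p>Computing the same two numbers for each project gives you something concrete to compare from one project to the next.</p>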
<p>Another measurement might be project plan accuracy. We all know how incredibly difficult it is to estimate how long it’s going to take to build these data warehouses and data marts. Good methodologies include processes to aid the project leader during the planning phase, but I have to say, having been in this position myself, if I have worked with the team previously, my plan is going to be much more accurate than if I am new to the team. </p>
<p>So now let’s assume you have finished your project, the code is deployed, the batch runs so flawlessly at night that the support &amp; maintenance team is bored, your well trained business users are happily referring to their data mart documentation and you have gathered some meaningful measurements.</p>
<p>Additionally, you find yourself with time and resources available to support the governance phase. </p>
<p>The first step is to gather the lessons learned from the project. You need to document what went well and what didn&#8217;t and get everyone to agree. Next the team should brainstorm what changes should be made to the methodology to improve things for next time.</p>
<p>These changes to the methodology have to be well documented and introduced carefully so that everyone understands the differences between the old way and the new way and why it changed. Sometimes the new way involves new tasks that aren’t much fun like going to meetings to gain consensus, documentation, change control procedures, coding standards, losing access to production, business prioritization of tasks, issue log maintenance and time tracking. So your team has to buy into them.</p>
<p>Finally, you have to wait for the next project to complete with your changes in place and compare that project’s measurements to the first project’s to see if your methodology changes were effective.</p>
<p>Whew! Governance is hard. . . </p>
<p>That’s why so few IT shops are at that level of maturity. It is estimated that less than 2% of IT projects include a governance or methodology improvement phase. </p>
<p>To improve a business process such as widget production you can change the process and see results in hours or days. But changing IT methodology is harder. Each project takes longer, the team members are always changing and we never build the same thing twice. </p>
<p>Ideally, to be sure our methodology improvements worked, we should put a data warehouse team on a deserted island (well, somewhere isolated with really good connectivity) and make the team build a data mart. Then you’d have to improve the methodology, and have a similar team come along and build the exact same data mart. When they were finished you’d have to compare the measurement results. </p>
<p>That would be the only way to ensure that the process had indeed been improved, but even then I’m sure you’re thinking of how many parameters wouldn’t have been properly controlled the second time around.</p>
<p>Difficulties in measuring IT development efforts notwithstanding, for about 40 years now, businesses have wanted a way to determine if we IT folks are any good and figure out how to make us faster and cheaper. (I understand that this desire developed a few weeks after the first computer was installed.)</p>
<p>Back in 1989 Watts Humphrey first wrote about measuring how well IT projects were run in order to be able to assess the ability of IT government contractors. He called it the Process Maturity Framework. Then in 1995 Mark Paulk, Charles Weber, Bill Curtis, and Mary Beth Chrissis published their book on the Capability Maturity Model (CMM) which is based upon Humphrey’s work.</p>
<p>The book outlines exactly how to determine what level of maturity your IT organization is at, and how to improve things and move IT up from one level to the next.</p>
<p>There are five levels of maturity defined:</p>
<ol>
<li>Initial (chaotic, ad hoc, heroic) the starting point for use of a new process.</li>
<li>Repeatable (project management, process discipline) the process is used repeatedly.</li>
<li>Defined (institutionalized) the process is defined/confirmed as a standard business process.</li>
<li>Managed (quantified) process management and measurement takes place.</li>
<li>Optimizing (process improvement) process management includes deliberate process optimization/improvement.</li>
</ol>
<p>Jennifer’s easier to remember CMM definitions:</p>
<ol>
<li>Dilbert cartoons are posted everywhere – (keep your CV/resume dusted off)</li>
<li>Talking the talk – (the methodology is written down)</li>
<li>Everyone is walking the walk – (using it)</li>
<li>Managers are measuring how far you can walk</li>
<li>Everyone’s getting predictably better at walking</li>
</ol>
<p>The good news is that if you are at level 3 (everybody is walking the walk), you are working for a world class IT shop!</p>
<p align="center"><img src="http://www.rittmanmead.com/wp-content/uploads/2009/05/cmm.gif" alt="" /></p>
<p>Figure 1 &#8211; Applying CMM Levels to Data Warehousing, David Marco</p>
<p>So to answer Damien’s comment, yes we have a structure for incorporating feedback from later stages into earlier stages. The last tasks on our project plan template are all about gathering the lessons learned and modifying the methodology for next time.</p>
<p>Sadly, so far I have always been asked to remove these tasks from my final signed off versions of the project plan in order to save money.  – irony</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2009/06/what-is-methodology-governance/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Introducing the Rittman Mead &quot;Delphi&quot; Methodology</title>
		<link>http://www.rittmanmead.com/2009/04/introducing-the-rittman-mead-delphi-methodology/</link>
		<comments>http://www.rittmanmead.com/2009/04/introducing-the-rittman-mead-delphi-methodology/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 10:13:43 +0000</pubDate>
		<dc:creator>Jennifer Albu</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Methodology]]></category>
		<category><![CDATA[BI Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/2009/04/26/introducing-the-rittman-mead-delphi-methodology/</guid>
		<description><![CDATA[I&#8217;d like to start off my first blog by introducing myself. I&#8217;ve been working in the IT field for 25 years, the last 12 in data warehousing. I&#8217;ve analyzed, designed and coded all aspects from OLTP extracts and ETL mappings to stars schemas and OLAP reports. Looking back, I guess I&#8217;ve always had an interest [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like to start off my first blog by introducing myself. I&#8217;ve been working in the IT field for 25 years, the last 12 in data warehousing. I&#8217;ve analyzed, designed and coded all aspects from OLTP extracts and ETL mappings to star schemas and OLAP reports. Looking back, I guess I&#8217;ve always had an interest in methodology and have studied both IT and business approaches to &#8220;Doing it Better&#8221;. My favourite methodology so far is Six Sigma, which can be applied equally well to IT projects and to improving business processes. I like it best, because it emphasizes measurable improvements instead of management just coming away with a warm and fuzzy feeling that things are “better” now.</p>
<p>I was lucky enough to join Rittman Mead last autumn as a Principal Consultant, and one of my responsibilities is to refine our approach and methodology. I&#8217;ll be writing about BI and data warehousing methodologies over the next few months and I&#8217;d like to talk about Delphi, a methodology I&#8217;ve championed within RMC for those projects when we take responsibility for the delivery, or those when we support members of the client team who are responsible for project leadership.</p>
<p>A solid methodology provides a standardized step-by-step plan for approaching projects consistently, clearly defined roles and responsibilities and a framework that enables client and project leadership to maintain control. But at the same time it shouldn&#8217;t be so detailed and overly encumbered with so many checks and balances that it slows delivery down or even brings it to a stand still.</p>
<p>I look at methodology as a way of reducing the frustration that everyone feels during IT projects:</p>
<p>•    The business users hate it when we build something they didn&#8217;t ask for<br />
•    The developers hate it when they have to re-do things over and over again because the instructions they got weren&#8217;t clear or detailed enough<br />
•    The project leadership hates it when the business keeps asking for more and more “new features” to be thrown in at the last minute<br />
•    The business management hate wondering what we are up to and hate weeks or months of no news<br />
•    The business sponsors don&#8217;t like worrying that they may have wasted money<br />
•    The support team hate having business critical applications thrown at them with little or no documentation and inadequate training</p>
<p>Many of you will have seen the famous “Tire and Tree” cartoon (apparently it has been in circulation since the mid ‘60s!), but it’s well worth including here because it so well exemplifies the dangers of poor communication – which IMHO characterizes all verbal communication:</p>
<p><img src="http://www.rittmanmead.com/wp-content/uploads/2009/04/tireswing1.jpg" alt="" /></p>
<p>Often when we work on contract engagements we are asked to use our client&#8217;s methodology. In this case, I always look for ideas and suggestions that I can recommend to them to improve their methodology and this advice has always been well received.</p>
<p>Other times the client either has no methodology, or does not practice using one effectively. In these projects it is really important that you try to at least apply a bare bones minimum set of methodology deliverables, not only for them but also for your own protection.</p>
<p>If you can minute important design meetings and send confirmation emails of all design changes as you go along, at least you&#8217;ll have something to fall back on if all £$%&amp; breaks loose. If I had to choose one deliverable only, it would be the functional or design specification. This document not only encourages discussion about what is going to be developed, but by its very nature it helps to set scope.</p>
<p>If I had to choose one communication device it would be a weekly status update email that copies in everyone who might be wondering what you are up to and why it’s taking so long.</p>
<p>The RMC Delphi Methodology enables project leaders to pick and choose which deliverables and what level of status reporting are applicable for each engagement. It&#8217;s important to tailor the scope of deliverables and communication for each project.</p>
<p>To aid in a smooth progression through project phases, the Delphi Methodology incorporates exit reviews at the end of each phase. Prior to closing the phase on the project plan, the project manager is required to complete an exit review to ensure all required steps have been taken and that open items are addressed. The review includes development of plans for resolving open issues, as well as a review of project schedule and budget, including managerial approval of changes.</p>
<p>The Delphi Methodology defines standardized responsibilities of logical roles in the data warehouse development for both the Rittman Mead Consulting team and the client team. There are 18 phases within Delphi, which can be fine tuned and/or cut down in scope to fit even the smallest project. The 18 phases are:</p>
<p>1. Project Startup<br />
2. BI requirements definition<br />
3. Data analysis requirements definition<br />
4. Data architecture<br />
5. ETL architecture<br />
6. BI architecture<br />
7. Infrastructure architecture<br />
8. Design data model<br />
9. Design ETL<br />
10. Design BI<br />
11. Design Infrastructure &amp; proof of concept<br />
12. Deployment &amp; Testing Preparation<br />
13. Build ETL<br />
14. Build BI<br />
15. System Test<br />
16. User Acceptance Test<br />
17. Train and deploy<br />
18. Maintenance &amp; Support</p>
<p>It&#8217;s important to go through each of the phases, even if vastly reduced in scope, to ensure that you deliver a quality product every time, whether you are working on a two-year enterprise-wide data warehouse or just adding a new level of aggregation to an existing data mart.</p>
<p>Each phase of the methodology contains a list of deliverables and working documents with links to document templates, such as project plans, statements of work, issues logs, and checklists.</p>
<p>Using templates ensures quality documentation across projects and faster, easier development of documentation based on quality documents from past successful projects. As we complete each project, we try to take the time to update our templates with any improvements we have discovered. I admit, though, that this non-funded phase of project work is the most likely to be neglected.</p>
<p>In conclusion, I&#8217;d like to sum up by saying that the key to a successful data warehouse project is a comprehensive methodology that applies best practices and proven experience to guide a data warehouse project team from project launch through to deployment.</p>
<p>I&#8217;d like to also say to all the die-hard brilliant developers out there who&#8217;d rather walk over hot coals than spend a week doing documentation, try to think of methodology as a way to ensure that no one gets frustrated – especially you!</p>
<p>I intend to come back to this over the next few weeks and months and add detail to the 18 phases I’ve listed. I’m also keen to get feedback from you as your comments will help me to improve the Delphi Methodology.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2009/04/introducing-the-rittman-mead-delphi-methodology/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>End-to-end data quality</title>
		<link>http://www.rittmanmead.com/2008/10/end-to-end-data-quality/</link>
		<comments>http://www.rittmanmead.com/2008/10/end-to-end-data-quality/#comments</comments>
		<pubDate>Sat, 25 Oct 2008 10:56:14 +0000</pubDate>
		<dc:creator>Peter Scott</dc:creator>
				<category><![CDATA[BI (General)]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/</guid>
		<description><![CDATA[One of our customers is about to embark on a significant BI project; but being in the &#8220;public sector&#8221; they have to (by EU law) publish tender documents so that qualified suppliers throughout the EU can bid to do the work. This means they have a gap of almost a year before the, yet to [...]]]></description>
			<content:encoded><![CDATA[<p>One of our customers is about to embark on a significant BI project; but being in the &#8220;public sector&#8221; they have to (by EU law) publish tender documents so that qualified suppliers throughout the EU can bid to do the work. This means they have a gap of almost a year before the, yet to be selected, BI infrastructure can be implemented and work on building the solution can start.</p>
<p>In the interim, the customer can work on data quality; they know what they need to report on (it&#8217;s in the project mandate!) and they know the sources of information (their operational systems) so they can start to verify that all of the required facts can be found in the source systems and more importantly look at the data content and assess &#8220;fitness for purpose&#8221;. If data defects are found then it may be possible to get them fixed before the serious construction of the ETL layer starts. Besides, the knowledge of source and target gives a good head start in the specification of ETL interfaces.</p>
<p>One particular issue they might meet, and one that is sadly far too common across many business sectors, is the use of operational systems that do not enforce data integrity. For whatever reasons there is just too much freedom in data entry and although it may not affect the operational system much it really can cause problems when you try to aggregate information on the BI system.</p>
<p>But how do we deal with this? Recently I joined in on a thread in one of the LinkedIn BI groups where it was proposed that a &#8220;receive garbage, store garbage&#8221; strategy be adopted &#8211; in my opinion this might be OK for a mature BI system where users understand that the reporting accurately reflects the source, but for a new venture into BI? To me, this seems to be too much of a risk; it might be that the new BI users do not have sufficient exposure to the source systems to realise that the data is at fault on the source. We could prevent data that fails a quality threshold from loading on the BI system, but then we would show <em>incomplete</em> results which, although correctly aggregated, are misleading because of omission; at the end of the day load policy is a business choice. If we go with the &#8220;reject poor data&#8221; route we should seriously think about providing a data quality dashboard on the reporting system to indicate the numbers of records that failed to be loaded, with drill-down to the reasons why they failed.</p>
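<p>To make the &#8220;reject poor data&#8221; route a little more concrete, here is a minimal Python sketch of a load step that keeps the rejected rows together with a reason, so that a dashboard could show counts per reason and drill down to the rows. The two rules in check_row, the column names and the sample batch are hypothetical; real rules would come from the business.</p>
<pre><code class="language-python">from collections import Counter


# Hypothetical quality rules: return a reason string, or None if the row passes.
def check_row(row):
    if not row.get("customer_id"):
        return "missing customer_id"
    if row.get("country") is None:
        return "missing country"
    return None


def load_with_rejects(rows):
    """Split rows into loaded and rejected, counting rejects by reason."""
    loaded, rejected = [], []
    reasons = Counter()
    for row in rows:
        reason = check_row(row)
        if reason is None:
            loaded.append(row)
        else:
            # Keep the row with its reason so a report can drill down to it.
            rejected.append((row, reason))
            reasons[reason] += 1
    return loaded, rejected, reasons


# Made-up sample batch.
batch = [
    {"customer_id": "C1", "country": "UK"},
    {"customer_id": "", "country": "UK"},
    {"customer_id": "C3", "country": None},
    {"customer_id": None, "country": "France"},
]
loaded, rejected, reasons = load_with_rejects(batch)
print(len(loaded), dict(reasons))
</code></pre>
<p>Only the first failing rule is recorded per row here, which keeps the counts simple; recording every failed rule would be an equally reasonable choice.</p>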
<p>So what do we do with data that fails the quality standard? Ideally, we should get it fixed at source. Auto-fixing on load is possible, but then we need to think about data governance and the possible &#8216;trust&#8217; problems of the data not being aligned with the source. Maybe you could &#8216;standardise&#8217; country names and other columns on loading; I&#8217;ve seen systems with &#8216;USA&#8217;, &#8216;U.S.A&#8217;, &#8216;U S A&#8217;, &#8216;US of A&#8217;, &#8216;America&#8217;, and &#8216;US&#8217; in the country data feed and that&#8217;s before we get to the mis-keying of &#8216;United&#8217; to get &#8216;Untied&#8217;! But maybe that sort of improvement in quality should also be available to operational systems users.</p>
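<p>As a sketch of what standardising country names on load might look like, the Python below maps the variants mentioned above onto one name and also reports whether the value was changed, so the original can be kept for audit. The alias table and the chosen standard name &#8216;United States&#8217; are assumptions for illustration, not a complete reference list.</p>
<pre><code class="language-python"># Hypothetical alias table; keys are compared after normalisation.
COUNTRY_ALIASES = {
    "USA": "United States",
    "U S A": "United States",
    "US": "United States",
    "US OF A": "United States",
    "AMERICA": "United States",
    "UNITED STATES": "United States",
    "UNTIED STATES": "United States",  # common mis-keying of "United"
}


def normalise_key(raw):
    """Upper-case, drop full stops and collapse runs of whitespace."""
    return " ".join(raw.upper().replace(".", "").split())


def standardise_country(raw):
    """Return (standard_name, was_changed); unknown values pass through."""
    standard = COUNTRY_ALIASES.get(normalise_key(raw), raw)
    return standard, standard != raw


for value in ["USA", "U.S.A", "U S A", "US of A", "America", "US", "France"]:
    print(value, standardise_country(value))
</code></pre>
<p>Returning the was_changed flag alongside the cleaned value is one way to keep the BI data traceable back to what the source system actually holds.</p>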
<p>For this customer, I have suggested that they construct a source to BI target matrix and include some basic traffic light measures on the source data:</p>
<ul>
<li>How good is it?</li>
<li>What sort of errors are present: missing items, typographical errors, missing or incorrect parents, inconsistent use of names, even data entered in the wrong fields?</li>
<li>How important is it to be correct in the BI system? For example, street address cannot be aggregated in reporting and we may not be going to use BI to create mailing lists, but postal code (or a substring of it) can be used to aggregate people by location areas.</li>
<li>How important is it to be correct on the operational source &#8211; do we need to apply the corrections at source to improve the operational use of the system?</li>
</ul>
<p>But this type of quality review may not tackle the data problem that is probably hardest to deal with: what is a correct fact? How do I know if a house value of £20,000 is reasonable (it could be in a shared ownership scheme), or £2,000,000, or £20,000,000? We could set a validation range, but where is the point at which one penny more is obviously wrong but the current value is OK?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rittmanmead.com/2008/10/end-to-end-data-quality/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

