<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: End-to-end data quality</title>
	<atom:link href="http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/</link>
	<description>Delivered Intelligence</description>
	<lastBuildDate>Wed, 10 Mar 2010 13:01:07 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Emil</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6601</link>
		<dc:creator>Emil</dc:creator>
		<pubDate>Wed, 29 Oct 2008 06:57:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6601</guid>
		<description>Two things always pull against each other: on one side the customer wants to be happy with nice, correct reports, and on the other side those correct reports have to come from dirty data. The goal is to find the intersection point. How do I do this? Sometimes I use the IFRS concept of materiality: is the error material (does it lead to wrong decisions being taken)? If the answer is yes, then the question is how the customer has corrected it up to now; either the same approach carries over to the new system, or some automation can be proposed to ease the process (e.g. the OBIEE writeback option, see my previous post). If the answer is no, then they can live happily with the error. But supplying erroneous reports, even if the customer has accepted them, is not good from a professional point of view. It is also good practice to request the latest IT audit reports for the source systems; if audit control findings and conclusions are mentioned anywhere, that is a good starting point for discussing this issue. In all cases, well-prepared UAT (user acceptance testing) criteria must be signed off even before the project kicks off.</description>
		<content:encoded><![CDATA[<p>Two things always pull against each other: on one side the customer wants to be happy with nice, correct reports, and on the other side those correct reports have to come from dirty data. The goal is to find the intersection point. How do I do this? Sometimes I use the IFRS concept of materiality: is the error material (does it lead to wrong decisions being taken)? If the answer is yes, then the question is how the customer has corrected it up to now; either the same approach carries over to the new system, or some automation can be proposed to ease the process (e.g. the OBIEE writeback option, see my previous post). If the answer is no, then they can live happily with the error. But supplying erroneous reports, even if the customer has accepted them, is not good from a professional point of view. It is also good practice to request the latest IT audit reports for the source systems; if audit control findings and conclusions are mentioned anywhere, that is a good starting point for discussing this issue. In all cases, well-prepared UAT (user acceptance testing) criteria must be signed off even before the project kicks off.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Scott</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6600</link>
		<dc:creator>Peter Scott</dc:creator>
		<pubDate>Tue, 28 Oct 2008 15:09:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6600</guid>
		<description>@Tim - broadly the same as my suggestion to the customer.
@Dave - I understand where you are coming from, and in some organisational cultures you can do this (it is the best way!), but in organisations that currently view data through spreadsheets backed by an extensive suite of macros to &#039;fix&#039; the data, reports of bad data can be a trickier prospect to sell... data users are sometimes protected from problems for too long.</description>
		<content:encoded><![CDATA[<p>@Tim &#8211; broadly the same as my suggestion to the customer.<br />
@Dave &#8211; I understand where you are coming from, and in some organisational cultures you can do this (it is the best way!), but in organisations that currently view data through spreadsheets backed by an extensive suite of macros to &#8216;fix&#8217; the data, reports of bad data can be a trickier prospect to sell&#8230; data users are sometimes protected from problems for too long.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Berry</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6599</link>
		<dc:creator>Tim Berry</dc:creator>
		<pubDate>Tue, 28 Oct 2008 14:35:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6599</guid>
		<description>Given the time they have, they should set up a central store and report, MI-fashion, on how the content of the systems is spread across what is valid, invalid and grey (potential for rule-based cleanup).
Usually there are placeholders in warehouses for non-conformant data, but these usually fill up more than expected, and the cleanup investigation is left until the warehouse is populated.
I think it is better to assess the source data before loading any target, and that assessment needs to be planned so that volumes and actions can be considered and reassessed against one another.
Therefore a central store that applies comparison and rule-based checks is the way to go. Usually there isn&#039;t enough time, but as these systems save more time once the extent of corruption is ascertained, they are in a good position to instigate the action outside of the target system.</description>
		<content:encoded><![CDATA[<p>Given the time they have, they should set up a central store and report, MI-fashion, on how the content of the systems is spread across what is valid, invalid and grey (potential for rule-based cleanup).<br />
Usually there are placeholders in warehouses for non-conformant data, but these usually fill up more than expected, and the cleanup investigation is left until the warehouse is populated.<br />
I think it is better to assess the source data before loading any target, and that assessment needs to be planned so that volumes and actions can be considered and reassessed against one another.<br />
Therefore a central store that applies comparison and rule-based checks is the way to go. Usually there isn&#8217;t enough time, but as these systems save more time once the extent of corruption is ascertained, they are in a good position to instigate the action outside of the target system.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Katz</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6595</link>
		<dc:creator>Dave Katz</dc:creator>
		<pubDate>Mon, 27 Oct 2008 21:41:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6595</guid>
		<description>If the people who want the reports don&#039;t know much about the source, then you have to show them the &#039;garbage out&#039;. I have found that showing the business users a few examples of reports based on dirty data tends to make the problem more concrete for them.  Otherwise, discussions of data quality can seem very abstract.</description>
		<content:encoded><![CDATA[<p>If the people who want the reports don&#8217;t know much about the source, then you have to show them the &#8216;garbage out&#8217;. I have found that showing the business users a few examples of reports based on dirty data tends to make the problem more concrete for them.  Otherwise, discussions of data quality can seem very abstract.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Scott</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6592</link>
		<dc:creator>Peter Scott</dc:creator>
		<pubDate>Mon, 27 Oct 2008 16:38:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6592</guid>
		<description>@Emil - no need to apologise!

You are right, there is no 100% solution and certainly not a one-size-fits-all one.
The idea of write back is interesting, and I can see it working for some cases. In the example I was thinking of, we should really get the data fixed up in the sources - for various reasons of data governance (and third parties keeping their intellectual property) we do not have direct (SQL) access to the application data structures. The other problem is that the people who want the reports do not know much about the source. I guess this is quite common where business users are interested in highly aggregated information and not raw facts.

The best we can do is report back the problems and hope they get fixed.</description>
		<content:encoded><![CDATA[<p>@Emil &#8211; no need to apologise!</p>
<p>You are right, there is no 100% solution and certainly not a one-size-fits-all one.<br />
The idea of write back is interesting, and I can see it working for some cases. In the example I was thinking of, we should really get the data fixed up in the sources &#8211; for various reasons of data governance (and third parties keeping their intellectual property) we do not have direct (SQL) access to the application data structures. The other problem is that the people who want the reports do not know much about the source. I guess this is quite common where business users are interested in highly aggregated information and not raw facts.</p>
<p>The best we can do is report back the problems and hope they get fixed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Emil</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6591</link>
		<dc:creator>Emil</dc:creator>
		<pubDate>Mon, 27 Oct 2008 14:24:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6591</guid>
		<description>Apologies Peter...

I just failed to look at the author of the post.</description>
		<content:encoded><![CDATA[<p>Apologies Peter&#8230;</p>
<p>I just failed to look at the author of the post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Emil</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6590</link>
		<dc:creator>Emil</dc:creator>
		<pubDate>Mon, 27 Oct 2008 12:17:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6590</guid>
		<description>Hi Mark,

From my experience in dealing with data quality matters, there is no 100% solution, but I will gladly share what I have done in my projects. Extending the Ralph Kimball idea to include a control dimension, I arrived at a solution that logs all errors encountered in the dimensional objects (facts and their respective dimensions) and then shows those error reports to the BI administrator or to the business users responsible for checking and correcting errors. The error reports are not just reports but form-like pages where, using the writeback option of BI, procedures can be run to correct the errors and even to re-run the ETL. Full automation is not possible, but giving users a nice interface in which to correct some or most of the errors is acceptable to most end users. This is the practical solution I have implemented in OBIEE... Of course all the analysis must be completed, and there are lots of nice data cleansing methodologies, but at the end of the day something simple and understandable must be presented to the end user, and in most of the cases I have had, this approach was accepted...

Best regards,
Emil.

PS Hope to attend your training days once again... if closer to EEC</description>
		<content:encoded><![CDATA[<p>Hi Mark,</p>
<p>From my experience in dealing with data quality matters, there is no 100% solution, but I will gladly share what I have done in my projects. Extending the Ralph Kimball idea to include a control dimension, I arrived at a solution that logs all errors encountered in the dimensional objects (facts and their respective dimensions) and then shows those error reports to the BI administrator or to the business users responsible for checking and correcting errors. The error reports are not just reports but form-like pages where, using the writeback option of BI, procedures can be run to correct the errors and even to re-run the ETL. Full automation is not possible, but giving users a nice interface in which to correct some or most of the errors is acceptable to most end users. This is the practical solution I have implemented in OBIEE&#8230; Of course all the analysis must be completed, and there are lots of nice data cleansing methodologies, but at the end of the day something simple and understandable must be presented to the end user, and in most of the cases I have had, this approach was accepted&#8230;</p>
<p>Best regards,<br />
Emil.</p>
<p>PS Hope to attend your training days once again&#8230; if closer to EEC</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: illiyaz</title>
		<link>http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/comment-page-1/#comment-6578</link>
		<dc:creator>illiyaz</dc:creator>
		<pubDate>Sat, 25 Oct 2008 20:43:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.rittmanmead.com/2008/10/25/end-to-end-data-quality/#comment-6578</guid>
		<description>Data quality has been one of the oldest problems, and I think that if the users are well versed in the source system, they should be able to correct the data on the fly (using write back features) where possible. Also, if a log of corrections is maintained (maybe in a table), inexperienced users can use the log table to some extent in ascertaining the right values.</description>
		<content:encoded><![CDATA[<p>Data quality has been one of the oldest problems, and I think that if the users are well versed in the source system, they should be able to correct the data on the fly (using write back features) where possible. Also, if a log of corrections is maintained (maybe in a table), inexperienced users can use the log table to some extent in ascertaining the right values.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
