In his recent mini-series, Jon wrote about the importance of effective data governance, how to go about initiating a data governance framework and building a data culture. One thing that is certain about data governance and, dare I suggest, one major factor in why it is often overlooked or actively avoided…is that it is complex. By necessity, its reach is broad and it is therefore bound to impact many people in all areas of a business. These types of things are traditionally difficult to do well…but that should not be a reason not to tackle them.
In my mind, the key thing to address at the outset is something Jon touched on in the first post…in his first sentence, no less! And that is value. The only real way we can justify any investment in data governance is through a discussion (and agreement) on the value of data.
Sounds simple, right? Having worked in the industry for a few decades now, I have seen the position of data and analytics grow from being a nice-to-have, to a necessity, to a source of competitive advantage. Now, data is increasingly seen as having the potential for strategic differentiation and market disruption…in fact, there are more and more examples where data pretty much is the business.
So, I'd be surprised if there wasn't collective agreement that your data has value beyond supporting line of business operations. But is there agreement on, or even any understanding of, how valuable the data is? Before we go down that rabbit hole, let’s hold on to what we have for a moment: data is valuable.
And if your data is valuable, then it should be treated like an asset.
And if it is treated like an asset, then your data needs management.
Let’s take a moment to just think about how we might manage tangible assets, say, a fleet of company cars. As a minimum, you would want to record information about each asset (make, model, fuel type, engine size, registration date etc.) and have some means of identification (registration plate number or VIN). You would know the value of the car and would have a mechanism for depreciating the value over its lifetime so that, at any point in the future, you would be able to calculate your asset's current worth. You would probably also want to know which employee is driving which car and would want to record information about its usage. When appropriate, you would use this usage information to service the cars to ensure they are kept in working condition and fit for purpose. If a warning light appeared on the dashboard, you would expect the driver to call it in, so that the car can be checked and any defects fixed. You would do these things, at least in part, to ensure the value of the asset is protected.
Whilst data is intangible and so is not exactly analogous to my example, the same principles still largely hold true. So let’s think through how this translates into the world of Data Management.
Firstly, we need a unified definition and record of significant data entities that can be used and understood across the enterprise. This is master data management and involves processes and systems through which core data entities are defined and managed, including consistent key identifiers, important attributes and expected value ranges. In an integrated data architecture, master data management is important in ensuring uniformity of data across the enterprise which, in turn, allows for data to be worked with more efficiently and accurately.
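To make this a little more concrete, here is a minimal sketch of what a master data definition might look like in Python, sticking with the company car example. Everything in it is an assumption for illustration only: the `MasterEntity` and `AttributeRule` names, the `vin` key and the engine size range are hypothetical and do not come from any particular MDM product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class AttributeRule:
    """Expected value range for one numeric attribute (hypothetical)."""
    name: str
    minimum: float
    maximum: float

    def accepts(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


@dataclass
class MasterEntity:
    """A core data entity: its name, its key identifier and its attribute rules."""
    name: str
    key_identifier: str
    rules: list[AttributeRule] = field(default_factory=list)

    def violations(self, record: dict) -> list[str]:
        """Return a description of every way the record breaks the definition."""
        problems = []
        if record.get(self.key_identifier) in (None, ""):
            problems.append(f"missing key identifier '{self.key_identifier}'")
        for rule in self.rules:
            value = record.get(rule.name)
            if value is None or not rule.accepts(value):
                problems.append(f"'{rule.name}' outside expected range")
        return problems


# Illustrative definition only; the range below is an assumed example.
vehicle = MasterEntity(
    name="Vehicle",
    key_identifier="vin",
    rules=[AttributeRule("engine_size_litres", 0.6, 8.0)],
)
```

Calling `vehicle.violations(record)` on any record, whichever system it came from, applies the same definition every time, which is really the point of having a single agreed one.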
Secondly, we need to be able to catalogue the data we hold through its lifecycle and record important information about it, such as the type, source, domain, last load date/time and more. We may also find it useful to record lineage details about where, how and when the data is used. This is metadata management. Part of metadata management should also be about finding a way to determine and record the value of the data. (Sadly, this is not as simple as knowing the cost of a car and depreciating it over 3 years.) Data will have a natural lifetime, but sometimes its value will appreciate as a function of time and/or volume (for example, a single event message is almost worthless, but when used in conjunction with millions of other event messages it could be invaluable).
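As a rough illustration, a catalogue entry holding the kind of metadata described above could be as simple as the sketch below. The `CatalogueEntry` class and its field names are hypothetical, chosen to mirror the list in the paragraph rather than any real catalogue tool, and I have deliberately left out any attempt to record value.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogueEntry:
    """One catalogued dataset and the metadata we hold about it (hypothetical)."""
    dataset: str
    data_type: str
    source: str
    domain: str
    last_loaded: datetime
    used_by: list = field(default_factory=list)  # lineage: downstream consumers

    def record_usage(self, consumer: str) -> None:
        """Note a downstream consumer once, however often it reads the data."""
        if consumer not in self.used_by:
            self.used_by.append(consumer)

    def hours_since_load(self, now: datetime) -> float:
        """How stale the dataset is, in hours, relative to 'now'."""
        return (now - self.last_loaded).total_seconds() / 3600


# Illustrative values only.
entry = CatalogueEntry(
    dataset="vehicle_usage_events",
    data_type="event stream",
    source="telematics",
    domain="fleet",
    last_loaded=datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
)
entry.record_usage("servicing_report")
```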
Next, we must take an active interest in data quality. In the same way that our cars have warning lights that indicate faults that need fixing, mechanisms should be in place to monitor data quality. Bad data should be quarantined until it can be addressed, and processes should be in place to review and fix data quality issues when they arise. In an ideal world, much of the monitoring and quarantining would be automated, but this is not essential: business process and human intervention can achieve the same outcomes.
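For those who like to see the idea in code, automated monitoring and quarantining might look something like this minimal sketch. The check names and the record fields (`registration`, `mileage`) are assumptions borrowed from the car analogy, not a prescribed rule set.

```python
def split_by_quality(records, checks):
    """Separate records that pass every check from those that fail any.

    Quarantined records are returned together with the names of the
    checks they failed, so someone can review and fix them later.
    """
    clean, quarantined = [], []
    for record in records:
        failed = [name for name, check in checks.items() if not check(record)]
        if failed:
            quarantined.append((record, failed))
        else:
            clean.append(record)
    return clean, quarantined


# Hypothetical checks for illustration only.
checks = {
    "has_registration": lambda r: bool(r.get("registration")),
    "mileage_non_negative": lambda r: isinstance(r.get("mileage"), (int, float))
    and r["mileage"] >= 0,
}

records = [
    {"registration": "AB12 CDE", "mileage": 42000},
    {"registration": "", "mileage": -5},
]
clean, quarantined = split_by_quality(records, checks)
```

Keeping the names of the failed checks alongside each quarantined record is the equivalent of knowing which warning light came on, rather than just knowing the car is off the road.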
Finally, all data assets should have an owner. This is the person who takes responsibility for the data, its definition and for making decisions regarding its access and usage. The data owner should also have a perspective on its security, the risk the data presents to the organisation through loss or theft and any ethical implications presented by the data. Data ownership is critical to successful data management; without it, most data governance initiatives will be destined to fail.
In our view of the world, these four pillars (master data management, metadata management, data quality and data ownership) are all essential to good data management. How they are addressed, and the roles people, process and technology play in supporting data management activities, become a question of maturity (something to be picked up in due course).
So, if we can agree that data has value (beyond helping to fulfil line of business operations), then we can hopefully agree that data management is worthwhile.
However, it is not everything…
Thinking back to our company cars, it is insufficient just to manage the fleet. We need to have policies in place that underpin the company car scheme. We need to ensure that employees have the necessary skills and training to use the asset at their disposal (i.e. they hold a valid driving licence). We will probably also have rules that define how the car can be used and for what (business vs personal usage) and rules around what constitutes proper or improper usage (e.g. no driving after a drink).
In a nutshell, it is one thing to manage the asset, but it is something else to translate this into a service enabling the asset to add value to its users.
Again, this is analogous to our data assets. Data Management activity allows us to control the data we have, but this is only one aspect of the overall Data Governance framework that allows our data to start delivering value.