In my last article I described the benefits for businesses of employing a data governance framework. We looked at how data governance improves the quality of the data you have; gets it to the people who can gain most value from it; and so drives better outcomes from using it.
These benefits can be so profound that data governance should be a matter of ‘the sooner the better’. But in reality, most businesses won’t be able to move immediately. Smaller companies will need persuading before they redirect money and people away from core activities. Larger businesses will typically have layers of decision-making to get through before green-lighting any major new project.
But fast progress is possible. Data governance does not need to begin life fully fledged. There are some quick wins that most organisations should be able to implement within six months, without significant additional budget or board-level agreement. In fact, there are advantages to starting small. It allows you to test and rethink plans, and to collect evidence of positive outcomes, so your case is more compelling when the business is ready to move forward properly.
Here are three modest but significant steps you can take now to build the blocks for a data governance programme.
1) Decide your domains
Companies generate and collect an overwhelming array of data. Your first job is to bring structure to it all by separating data into different domains. There are a variety of ways to define each domain. At a high level, you could separate data into business functions, such as finance, legal and HR. You could go another level down; your finance domain could include subsets for assets, expenses and cash, while your HR domain could be split into employee personal records, compensation and performance.
At this stage, what’s key is that the data domains are recognisable and relevant to your business; and that the data of a domain or subdomain is unique enough to warrant it being separate. There’s a balance to be had. The more domains you choose, the easier it should be to locate specific data, but the more complexity you will be taking on.
Also bear in mind that each domain should have an owner who is an expert in the data residing there. Context is key here. Without functional knowledge, data can be opaque or misinterpreted, resulting in classification errors. For example, finance data is likely to include different tax codes relating to VAT, payroll and corporation tax, possibly across multiple jurisdictions. Classifying these codes into the right domains requires knowledge of current tax systems and rules.
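To make the idea concrete, here is a minimal sketch of how a domain, its subdomains and their owners could be written down. The `Domain` class, the `find_owner` helper and the owner role titles are all hypothetical; only the finance subdomains (assets, expenses, cash) come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Domain:
    """A data domain with one accountable owner and optional subdomains."""
    name: str
    owner: str  # hypothetical role title, used for illustration only
    subdomains: List["Domain"] = field(default_factory=list)

    def paths(self, prefix: str = "") -> List[str]:
        """Return the path of this domain followed by every subdomain path."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        found = [path]
        for sub in self.subdomains:
            found.extend(sub.paths(path))
        return found


def find_owner(domain: Domain, path: str) -> Optional[str]:
    """Return the owner of the domain at `path`, or None if no such domain exists."""
    parts = path.split("/")
    if parts[0] != domain.name:
        return None
    current = domain
    for part in parts[1:]:
        matches = [s for s in current.subdomains if s.name == part]
        if not matches:
            return None
        current = matches[0]
    return current.owner


# Finance domain split one level down, as in the example above.
finance = Domain("finance", "Financial Controller", [
    Domain("assets", "Asset Accountant"),
    Domain("expenses", "Accounts Payable Lead"),
    Domain("cash", "Treasury Manager"),
])

print(finance.paths())
print(find_owner(finance, "finance/cash"))
```

Writing the hierarchy down like this, even informally, makes it easy to spot a domain without an owner or a subdomain that does not justify being separate.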
2) Meta your data
The next job is to decide how you’re going to represent that data. As with domains, there are different levels of detail you can apply to recording data. Information about what software or system the data is stored in, and when it was last used or amended, helps to provide context and makes the data more searchable. We call this metadata. The more you have, the more helpful it can be. But again, think about the level of complexity you are willing to handle, especially as your data governance programme expands.
As a basic start, three types of metadata are important.
- Business: defines business terms and explains how the data is used in the business.
- Technical: describes storage, such as the name of the database the data is part of, the column or row it appears in and any indexes referencing it; and format, such as the file type and whether the data is structured or unstructured.
- Operational: records how and when the data was created or transformed, such as timestamps, location, job execution logs, and data owners.
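As an illustration, the three types of metadata above could be captured in a single record per data item. This is only a sketch: the class names, field names and sample values (`ledger_db`, `purchase_invoices`, the dates and the owner title) are assumptions chosen for the example, not a prescribed schema.

```python
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class BusinessMetadata:
    definition: str  # what the term means in the business
    usage: str       # how the data is used


@dataclass
class TechnicalMetadata:
    database: str
    table: str
    column: str
    file_format: str
    structured: bool


@dataclass
class OperationalMetadata:
    created_at: datetime
    last_transformed_at: datetime
    owner: str


@dataclass
class CatalogueEntry:
    name: str
    domain: str
    business: BusinessMetadata
    technical: TechnicalMetadata
    operational: OperationalMetadata


# Hypothetical entry for a VAT code held in a finance system.
entry = CatalogueEntry(
    name="vat_code",
    domain="finance/expenses",
    business=BusinessMetadata(
        definition="Tax code that determines the VAT treatment of a purchase.",
        usage="Used when preparing VAT returns.",
    ),
    technical=TechnicalMetadata(
        database="ledger_db",
        table="purchase_invoices",
        column="vat_code",
        file_format="SQL table",
        structured=True,
    ),
    operational=OperationalMetadata(
        created_at=datetime(2024, 1, 15, 9, 30),
        last_transformed_at=datetime(2024, 3, 1, 18, 0),
        owner="Accounts Payable Lead",
    ),
)

# Flatten to a nested dict, ready to export to a spreadsheet or catalogue tool.
record = asdict(entry)
print(record["technical"]["database"])
```

Starting with a handful of fields per type keeps the record manageable; more can be added as the programme expands.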
3) Implement a data catalogue or dictionary
Now you need somewhere to store all this information. We call this a data catalogue, or sometimes a data dictionary. At its simplest, this could be a shared spreadsheet. But as you add more data, a spreadsheet can become unwieldy, both in terms of storage capacity and searchability. What’s more, when you inevitably outgrow your spreadsheet, you will need to migrate all the data to a new solution, which can be onerous and prone to error.
It is better to begin as you mean to go on and start with a specialist data catalogue product. You won’t need to break the bank, and some (e.g. Oracle Data Catalog) are free when you subscribe to the vendor’s cloud database, BI or infrastructure service.
But ‘free’ shouldn’t be your only criterion. Bear in mind that the data catalogue you start with should ideally serve you well into the future, so consider the capabilities you will eventually need. Prioritise discovery features; for example, a ‘good’ search function can return data by who’s used it, provenance, quality, and context, to name a few. Lineage could also be something you want to capture. Understanding where data originated can be key in knowing how to use it.
Also think about how data is ingested into the system. The more automation, the better, because manual tasks tend to get forgotten or overlooked. However, automation does need maintaining, so something like an existing API or web service may be preferable to building complex integrations.
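To show what discovery means in practice, here is a rough sketch of a search that combines free text with filters such as domain or provenance. The entries, the field names and the `search` function are hypothetical, and a real catalogue product will offer far richer search than this.

```python
from typing import Dict, List

# Hypothetical catalogue entries; "source" stands in for provenance.
catalogue: List[Dict[str, str]] = [
    {
        "name": "vat_code",
        "domain": "finance",
        "source": "ledger_db",
        "description": "Tax code for the VAT treatment of a purchase",
    },
    {
        "name": "salary_band",
        "domain": "hr",
        "source": "hr_system",
        "description": "Compensation band for a role",
    },
    {
        "name": "corporation_tax_code",
        "domain": "finance",
        "source": "tax_filing_export",
        "description": "Corporation tax category",
    },
]


def search(entries: List[Dict[str, str]], text: str = "", **filters: str) -> List[Dict[str, str]]:
    """Return entries whose name or description contains `text`
    (case-insensitive) and whose fields equal every given filter."""
    needle = text.lower()
    results = []
    for entry in entries:
        haystack = f"{entry['name']} {entry['description']}".lower()
        matches_filters = all(entry.get(key) == value for key, value in filters.items())
        if needle in haystack and matches_filters:
            results.append(entry)
    return results


# Find tax-related data, then narrow by where it came from.
print([e["name"] for e in search(catalogue, "tax")])
print([e["name"] for e in search(catalogue, "tax", source="ledger_db")])
```

Trying a few searches like these against sample data is a quick way to judge whether a product's discovery features will suit the way your people actually look for data.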
Though you’re looking for quick progress in this initial stage of your data governance programme, it’s still important to shop around for the best-fit solution. Devise an evaluation exercise that measures your shortlisted solutions against your key decision criteria. Get your hands on the product, and test it with samples of your data. And think beyond technical capabilities, to include the service levels of the vendor. You must be confident in the vendor’s ability to provide a long-term partnership, so also appraise the quality of its customer support and its innovation roadmap.
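One simple way to run such an evaluation is a weighted scoring matrix. The sketch below is illustrative only: the criteria, weights, product names and scores are invented for the example, and you would replace them with your own decision criteria and hands-on test results.

```python
from typing import Dict, List

# Hypothetical decision criteria and their relative importance.
weights: Dict[str, int] = {
    "discovery": 3,
    "lineage": 2,
    "automated_ingestion": 2,
    "vendor_support": 1,
}

# Hypothetical scores from 1 (poor) to 5 (excellent) for each shortlisted product.
products: Dict[str, Dict[str, int]] = {
    "Product A": {"discovery": 4, "lineage": 3, "automated_ingestion": 5, "vendor_support": 2},
    "Product B": {"discovery": 3, "lineage": 4, "automated_ingestion": 3, "vendor_support": 5},
}


def weighted_score(scores: Dict[str, int], weights: Dict[str, int]) -> float:
    """Return the weighted average score across all criteria."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank(products: Dict[str, Dict[str, int]], weights: Dict[str, int]) -> List[str]:
    """Return product names ordered from highest to lowest weighted score."""
    return sorted(products, key=lambda name: weighted_score(products[name], weights), reverse=True)


for name in rank(products, weights):
    print(name, weighted_score(products[name], weights))
```

Agreeing the weights with stakeholders before scoring helps keep the exercise honest, and the resulting table is a useful proof point when you later make the case for investment.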
Finally, be sure to collect learnings and proof points as you go, and keep stakeholders informed. Consider what resources you’ll need to extend the project in six months, who you’ll need to persuade, and what they'll want to know.
In my next article we'll take a look at how to build momentum and secure investment to scale and mature your governance plan.