Friday, October 25, 2019

Data Governance vs. Information Governance

A lot of folks have heard the terms "data governance" and "information governance", and the two terms are often used interchangeably. This, though, is a mistake as there are significant differences between the two initiatives. In this post, we will explore those differences and explain why it is important for companies to implement both.

To get started, it makes sense to spell out a few definitions. In the world of data, there is a progression from data to information to knowledge, and it is important to understand each of these.

Data is the raw material: the records held in sources such as databases and spreadsheets. It is obviously necessary for transactional purposes, as it underlies every platform out there, whether that is an ERP, CRM, or WMS. Without the system on top of it, though, the data is not very useful, because you typically cannot learn anything from data in its raw state. Data can flow among multiple systems and can be aggregated into a single location like a data lake.

Information is the cleansed and organized form of the aforementioned data. This is a very important step in the process that is often minimized by people looking in from the outside of a Data and Analytics implementation. You cannot go directly from data to knowledge without a significant amount of work spent turning the data into information. You can expect that 50% or more of the effort for the overall program will go into producing and organizing this information.
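To make the step from data to information a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample rows, the field names, the two date formats, and the helper names (`parse_date`, `cleanse`, `total_by_customer`) are all hypothetical. It only shows the kind of cleansing and organizing work described above, not a recommended implementation.

```python
from datetime import datetime

# Hypothetical raw rows, as they might arrive from two different systems.
RAW_ROWS = [
    {"customer": " Acme Corp ", "amount": "1,200.50", "order_date": "2019-10-01"},
    {"customer": "acme corp", "amount": "300", "order_date": "10/03/2019"},
    {"customer": "Globex", "amount": "", "order_date": "2019-10-04"},
    {"customer": "Globex", "amount": "75.25", "order_date": "2019-10-07"},
]

# Assumed date formats used by the hypothetical source systems.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Try each known format; return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def cleanse(rows):
    """Split raw rows into standardized rows and rejected rows."""
    clean, rejected = [], []
    for row in rows:
        # Normalize whitespace and casing so one customer has one name.
        name = " ".join(row["customer"].split()).title()
        amount_text = row["amount"].replace(",", "").strip()
        order_date = parse_date(row["order_date"].strip())
        try:
            amount = float(amount_text)
        except ValueError:
            amount = None
        if not name or amount is None or order_date is None:
            # Keep rejected rows so someone can fix them at the source.
            rejected.append(row)
            continue
        clean.append({"customer": name, "amount": amount, "order_date": order_date})
    return clean, rejected


def total_by_customer(rows):
    """Organize cleansed rows into a simple per-customer total."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


if __name__ == "__main__":
    clean_rows, rejected_rows = cleanse(RAW_ROWS)
    print(total_by_customer(clean_rows))
    print(len(rejected_rows), "row(s) need attention at the source")
```

Even in this toy case, most of the code deals with inconsistent names, number formats, and dates rather than with the final total, which mirrors the point about where the effort goes.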

Knowledge is the set of insights gained from analyzing the information. That analysis might be descriptive analytics, where you simply look at what happened in the past to help you plan for the future, or a fully executed neural network that churns through huge volumes of data to prescribe actions to take. The key here is that these insights need to be actionable. Simply knowing something is not particularly useful if the knowledge cannot be acted upon to optimize or automate a process, create a new revenue channel, change a business model, etc.

This leads to my first argument for having separate data and information governance initiatives.

1) Data and information are very different from each other, and this difference is important to highlight and fully understand as it requires significant effort to move from data to information. Treating them the same tends to lessen the importance of the information in the eyes of the key stakeholders across the company while giving the very false impression that simply collecting data will lead quickly to knowledge.

Now that we understand the differences between data and information, we can take a look at who is responsible for each. Every company is different, but, generally, data governance is primarily the responsibility of IT. IT manages the platforms and databases, creates the integration pipelines to move data around, pulls data into analytics platforms, etc. We will hold off on discussing the details of implementing a data governance initiative and save that for a future post. At a high level, though, data governance is about privacy, security, risk management, compliance, and auditing. The work involved in such an effort mainly falls into the realm of technical solutions put together by IT. Of course, there is collaboration with the business (e.g. Finance), particularly around compliance and audit, but even in those cases technical solutions are typically required.

Note: Certain industries, like health care, banking, and pharmaceuticals, tend to be very heavily driven by risk and compliance and are likely to have entire teams dedicated to these areas. This might mean that a team outside of IT is actually driving those aspects of the project, but it is still appropriate to consider them part of the data governance initiative because the focus is on the data itself and not the information and knowledge that can be derived from the data.

Just about every company out there knows about the aforementioned aspects of data governance. They don't want people within the company accessing data they should not. They don't want data leaving the company and falling into the wrong hands. They must fulfill specific governmental and regulatory requirements in order to stay in business. Though not every company dedicates the time, effort, and funding required, the need is generally quite obvious.

Less obvious, though, is the need for information governance. Companies assign value to data by default when they decide that it is important to secure it. That is part of risk management: how much is it worth in dollars, reputation, etc. to have specific data stolen? However, there is also a huge amount of potential value in information that tends to go unquantified. A lot of companies understand that they should use the data they produce to do something, but they have a difficult time building the business case and the large-scale implementation plan required to achieve what they need.

This is where information governance helps. Though the initiative still requires IT support via the appropriate tools, information governance is much more business-centric. It focuses on data classification and metadata, data architecture and availability, data quality via both business and technical processes, data awareness and literacy, information life cycle management, stewardship, and value creation. There is some overlap with data governance, and there are certainly technical aspects typically handled by IT. However, the initiative is really about business colleagues helping to produce clean data by using their platforms correctly, and then having access to that data with a high level of literacy and a sense of ownership, so that they can ensure that it is of high quality and use it to create information, knowledge, and eventually value.

Information governance is the framework that allows a company to assess its current state, create a plan for the future, and track progress towards the final goals. It is also a great basis for creating a business case for undertaking something like building a Data and Analytics program.
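As one hedged illustration of what assessing the current state and tracking progress could look like for data quality, the Python sketch below computes a simple completeness score per field. The records, the field names, the 90% target, and the function names (`completeness`, `below_target`) are all hypothetical, and completeness is only one of many possible quality measures.

```python
# Hypothetical customer records exported from a CRM.
RECORDS = [
    {"email": "ann@example.com", "region": "EMEA", "owner": "Sales"},
    {"email": "bob@example.com", "region": "", "owner": "Sales"},
    {"email": None, "region": "APAC", "owner": "Support"},
    {"email": "dee@example.com", "region": None, "owner": "Sales"},
]


def is_filled(value):
    """Treat None and blank strings as missing; anything else counts."""
    return value is not None and str(value).strip() != ""


def completeness(records, fields):
    """Return the share of records with a usable value for each field."""
    total = len(records)
    scores = {}
    for field in fields:
        filled = sum(1 for record in records if is_filled(record.get(field)))
        scores[field] = filled / total if total else 0.0
    return scores


def below_target(scores, target=0.9):
    """List the fields whose completeness falls short of the assumed target."""
    return sorted(field for field, score in scores.items() if score < target)


if __name__ == "__main__":
    scores = completeness(RECORDS, ["email", "region", "owner"])
    for field, score in scores.items():
        print(f"{field}: {score:.0%} complete")
    print("Needs a steward's attention:", below_target(scores))
```

Running a measure like this on a schedule gives business data stewards a number to own and improve, which is the sort of progress tracking the framework is meant to support.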

This leads to my second and third arguments for having separate data and information governance initiatives.

2) Information governance is often overlooked. Most companies have some sort of structured plan for addressing security, risk management, etc. around their data, but they frequently do not have a structured plan to address the actual use of that data to turn it into information and knowledge so that they can derive value. Splitting the initiatives ensures that ample focus can be applied to both.

3) Governing data and governing information require a separation of duties. That separation is best achieved by having IT run data governance while folks on the business side run information governance. Of course, there is some overlap, and significant collaboration is required to be successful.
