Governing Consumer Data to Improve Quality and Enforce Privacy
Consumer data is at the center of today’s digital economy. Arguably, it has become the world’s most valuable resource. Alphabet (Google’s parent company), Amazon, Apple, Facebook and Microsoft are the world’s most valuable companies, thanks to the enormous clout and control that their consumer data gives them.
Data fuels a widening circle in organizations that systematically use data-driven insights to inform and influence their business strategy. As The Economist puts it, “By collecting more data, a firm has more scope to improve its products, which attracts more users, generating even more data, and so on.” Managing consumer data well is therefore an absolute must, and this is true for all companies, not just for large global businesses.
This article explores two crucial aspects of consumer data management: improving its quality and enforcing its privacy. In particular, we will:
- Examine common challenges;
- Show how master data management (MDM) and data privacy technology can help;
- And make the case for applying common data governance processes and tools.
Challenges to Consumer Data Quality and Privacy
Bad data quality has long been recognized as the main challenge facing organizations that want to become data-driven. It is the root cause of a litany of problems, from wrong business decisions to missed opportunities to lack of regulatory compliance.
On the other hand, privacy of consumer data has become the hot topic of the day. Cybersecurity attacks are on the rise, consumer confidence is eroding, and regulators are reacting. After GDPR and CCPA, this report and others cite numerous regulations that are yet to come to the U.S., both at the state level and perhaps even at the federal level. Undoubtedly, new laws will expand the scope of data elements worthy of, and subject to, privacy protections.
Both quality and privacy of consumer data are tough problems to solve. What makes them even more difficult is that organizations today have scattered consumer data landscapes. Consumer data is captured from a variety of front-end applications (call center software, social media, web registrations, and so on) and from external sources. Having scattered consumer data elements that are worthy of privacy protections but don’t benefit from proper cleansing or standardization is in itself a big challenge.
Integration can be helpful. It may seem easy to integrate consumer data sets in a Single-Source-of-Truth (SSoT) data warehouse. After all, consumer data is held in about a dozen fields, right? Not really.
First, many businesses naturally get to know quite a bit about their customers. Second, leaders now realize the strategic importance of customer data in fueling the circle we mentioned. They want to listen to their customers, through data, in every possible way. At Persistent, our B2C customers have a plethora of sources, each supporting a specific functional area, including data lakes (where the quality and privacy problem may be exacerbated, as they can ingest data without conforming to any schema). Finally, the cloud is adding yet another dimension of heterogeneity to their data landscapes.
How Technology Can Help
Of course, not all companies are equal, and they all have different requirements for consumer data management in the following key areas:
- Number of sources storing (and applications managing) consumer data;
- Number and volume of consumer records;
- Variety of formats and data elements;
- Data quality gap between current state and state fit for purpose;
- Diversity of foreseen treatments needed to request consumer consent;
- And sensitivity of data elements.
Depending on how stringent your requirements are, you may benefit from master data management (MDM) technology and from data privacy tools.
Even with only a few consumer data sources, in a variety of formats and of questionable quality, MDM technology will improve quality and distribute the data to consumer management applications and/or to your downstream analytics platform, supporting SSoT-style integration for structured consumer data. Solutions that build a 360-degree view of the customer also need non-master and often unstructured consumer data (e.g., from website and social activity); they often enhance MDM programs to obtain trusted, accessible and linked information.
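To make the idea of improving quality across sources more concrete, here is a minimal sketch of a standardize-and-merge step of the kind MDM tools automate. It is an illustration under simplifying assumptions, not how any particular MDM product works: the field names, the cleansing rules and the choice of email as the only match key are all hypothetical, and real tools use far richer matching and survivorship rules.

```python
from collections import defaultdict


def standardize(record):
    """Normalize the fields used for matching (illustrative rules only)."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
        "source": record.get("source", ""),
    }


def merge_by_email(records):
    """Group standardized records by email; keep the first non-empty value per field."""
    groups = defaultdict(list)
    for i, rec in enumerate(map(standardize, records)):
        # Records without an email cannot be matched here, so each stays on its own.
        key = rec["email"] or f"__unmatched_{i}"
        groups[key].append(rec)

    golden = []
    for recs in groups.values():
        merged = {
            "email": recs[0]["email"],
            "sources": sorted({r["source"] for r in recs}),
        }
        for field in ("name", "phone"):
            merged[field] = next((r[field] for r in recs if r[field]), "")
        golden.append(merged)
    return golden


# Hypothetical records captured by two front-end applications.
records = [
    {"name": "  jane  DOE", "email": "Jane.Doe@Example.com ",
     "phone": "(555) 010-1234", "source": "call_center"},
    {"name": "Jane Doe", "email": "jane.doe@example.com",
     "phone": "", "source": "web_registration"},
    {"name": "Sam Lee", "email": "sam@example.com",
     "phone": "555.010.9999", "source": "web_registration"},
]

golden_records = merge_by_email(records)
for rec in golden_records:
    print(rec)
```

The two Jane Doe records collapse into one merged record that lists both sources, which is the kind of trusted, linked record that can then be distributed to downstream applications.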
When there are multiple sources of consumer data in various formats, privacy regulations to comply with, and diverse consent request requirements, data privacy technology will help automate the handling of consumer privacy rights, set up data security policies and support compliance with regulatory audits and protection requirements. When sensitive consumer data is stored, the consumer rights, policies and processes required by regulations such as GDPR become more complex.
Why You Need Data Governance
However, and this is the main point we want to stress, in all cases you need data governance (DG). Unfortunately, DG is sometimes viewed as synonymous with excessive control. In reality, DG is about enabling the business with valuable information from data, while complying with regulations affecting the organization, notably privacy regulations.
DG is a program that confers rights and accountabilities that make data trustworthy and secure. Policies and standards define the “what” – what the organization does to improve data quality and enforce data privacy (and data management, in general).
Then, governance controls, technical or organizational, flesh out the “how” to carry out DG. Technical controls are implemented by technology, such as MDM or data privacy technology mentioned above. Authentication and access control tools are also needed to secure the data against unauthorized access. Data protection techniques such as encryption and masking also protect consumer data from unauthorized visibility.
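As an illustration of such technical controls, the sketch below shows masking of an email address for display, plus keyed hashing as one common way to pseudonymize an identifier so records can still be joined without exposing the raw value. Note that keyed hashing is a swapped-in example and is not encryption; the function names and the hard-coded key are hypothetical, and a real deployment would take the key from a managed secret store.

```python
import hashlib
import hmac


def mask_email(email):
    """Hide most of the local part of an email address for display purposes."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable address: reveal nothing
    return local[0] + "***@" + domain


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed SHA-256 digest (HMAC).

    The same value and key always give the same token, so data sets can
    still be linked, but the original value is not visible in the token.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; never hard-code real secrets.
DEMO_KEY = b"demo-key-for-illustration"

masked = mask_email("jane.doe@example.com")
token = pseudonymize("jane.doe@example.com", DEMO_KEY)
print(masked)
print(token)
```

Which technique applies to which data element, and who may see unmasked values, is exactly the kind of decision that DG policies are meant to settle.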
Successful DG programs have well-thought-out organizational controls. They define rights and accountabilities among people, drive the definition of policies, coordinate process changes across functional areas, and support the necessary change to the organization’s culture through communication and training.
DG should not be seen as a “nice to have.” It addresses the root cause of why organizations struggle with data quality, namely the absence of ownership or accountability when enforcing quality policies. As for privacy, with the expanded definition of what constitutes personal information in laws such as GDPR, ownership, control and management of personal data is an ever-expanding governance endeavor.
Data Governance Tools
Nowadays, data governance is also supported by tools for cataloging data assets from data sources and for registering policies and standards for managing these assets. Some pure-play data catalog tools cover more ground. They support discovery of the right data assets for business initiatives, they can publish properties about these assets (purpose and lineage, i.e., where it comes from), and they can audit ad-hoc data requests and usage, as well as policy changes.
Catalog solutions are now quite popular, which conceals a danger, since some are tool-specific. Indeed, newer tools for data lakes, data integration, BI, MDM and data privacy may have catalog functionality to improve data usability, trust and shareability in the context of that tool. For instance, an MDM tool’s catalog can publish the current state of quality of the assets under management, with respect to the defined quality policies, and the data privacy tool can communicate the privacy exposure of its consumer assets with respect to the privacy policies.
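A rough sketch of what such a catalog entry might record is shown below. The structure, field names and metrics are assumptions made for illustration and do not reflect any specific catalog product: quality is reduced to a simple completeness score against a list of required fields, and privacy exposure to the sensitive fields that are not yet protected.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record for one data asset."""
    asset: str
    owner: str
    lineage: list            # where the data comes from
    required_fields: list    # quality policy: fields that must be populated
    sensitive_fields: list   # privacy policy: fields that need protection
    protected_fields: list = field(default_factory=list)  # already masked or encrypted


def completeness(entry, rows):
    """Share of rows in which every required field is populated."""
    if not rows:
        return 0.0
    complete = sum(all(row.get(f) for f in entry.required_fields) for row in rows)
    return complete / len(rows)


def privacy_exposure(entry):
    """Sensitive fields that are not yet protected."""
    return sorted(set(entry.sensitive_fields) - set(entry.protected_fields))


entry = CatalogEntry(
    asset="consumer_master",
    owner="customer-data-steward",
    lineage=["call_center", "web_registration"],
    required_fields=["name", "email"],
    sensitive_fields=["email", "phone", "birth_date"],
    protected_fields=["birth_date"],
)

rows = [
    {"name": "Jane Doe", "email": "jane.doe@example.com"},
    {"name": "Sam Lee", "email": "sam@example.com"},
    {"name": "", "email": "unknown@example.com"},
    {"name": "Ana Ruiz", "email": "ana@example.com"},
]

quality_score = completeness(entry, rows)
exposed = privacy_exposure(entry)
print(quality_score, exposed)
```

Publishing both figures side by side for every asset, with a named owner, is one way a catalog can make quality and privacy status visible beyond the tool that produced them.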
We are not against tactical catalog deployments associated with a tool. However, to support global strategic data and analytics needs and privacy regulations, we believe organizations must follow the best practice of having a single source of reference (SSoR) through a global data catalog supporting enterprise-wide DG. There should be a capability to scale beyond tactical catalog deployments to provide such an SSoR.
About the author: Fernando Velez is a software technologist with an extensive career in data management, spanning the database systems, analytics and information management domains. After a period as a researcher in France, Fernando built and delivered products to each of these domain markets, in companies such as DoubleClick (now part of Google), Business Objects and SAP. He is currently the Chief Data Technologist at Persistent Systems.