Five Tips for Winning at Data Governance
By now, you’ve hopefully realized that having a great data analytics strategy also requires having a good data governance strategy. After all, if your data is ungovernable, then the analytics you’re running atop it will be untrustworthy at the end of the day.
So what does a winning data governance strategy look like? There are many elements that go into having a successful data governance strategy, but here are five aspects that should not be overlooked:
1. Embrace Data Silos
For many years, data silos were enemy #1 to data analysts and data scientists. When data is spread out across many repositories, the thinking went, designing and building data analytics applications becomes so much more difficult.
Okay, so it’s true: data silos do make life more challenging for data scientists and data analysts. But here’s the thing: There’s not much anybody can do about it. We’ve been instructed to build data lakes atop big data stores like Hadoop and centralize as much of the data as possible, and while that is theoretically a superior approach, in practice – not so much.
“Certainly, everybody wants to get to a single point of reference for the enterprise across all their data sources,” says Stephanie McReynolds, vice president of marketing at Alation, a provider of data catalog and governance software. “But as you well know, there’s been such a proliferation of data sources across the past five to seven years that it’s impossible to start by consolidating everything into one place.”
But it’s not all bad. Here’s some good news: Modern data governance tools from Alation and other vendors don’t really care where the data is, and they help to streamline the process of cataloging and then accessing the data, no matter where it resides.
McReynolds is seeing continued adoption of Hadoop data lakes, but she’s also seeing organizations move data from on-premises repositories into cloud-based stores. Data governance tools like Alation’s track that data wherever it goes.
“Alation is the consistent interface, no matter where that data lies,” she says. “That reduces the boundaries or the barriers for analysts and data scientists to access that data. They don’t have to be concerned about, on a day by day basis, where data sets have been moved to. They can find it no matter where it lies physically.”
2. Plan for Bigness
Data analytics projects usually start small, but they can grow quickly if the returns are positive. Organizations that have that scale in mind when they take projects into production are better equipped to solve data governance challenges down the line, says Emily Washington, senior vice president of product management at Infogix, a company that makes data analytics and data management software.
“When organizations start collecting data, it’s often on a small scale that is manageable,” Washington tells Datanami. “But as decisions are made based on analytical results, organizations start to see the real value in data collection and, as you guessed it, start collecting more data. But as they amass more data, organizations often lose control of their data’s quality, origin, ownership — all key components to a successful data governance program.”
The first step to solving this common hurdle is to ensure that everyone is speaking the same language. “Data dictionaries, business glossary, and data lineage not only define data and terms across varied business units, but also provide key information about the source, age, and interdependencies of data,” Washington says.
Having a data governance strategy in place that addresses the entire spectrum of concerns is critical for getting stakeholder buy-in, she says. That spectrum includes not just defining data and terms but also laying out the sources of data, how it’s used, the relationships between the data sources, data quality dimensions and scores, and who the owners and stewards of the data are.
“If users don’t know where data originated, they probably won’t trust it, and they generally won’t use it,” Washington says.
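To make that concrete, here is a minimal Python sketch of what one record in a data dictionary might hold, along with a check for missing governance metadata and a walk back through lineage to a data set's origins. The class, field names, and sample data sets are hypothetical illustrations, not the data model of Infogix, Alation, or any other product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """One data set's governance record. All field names are hypothetical."""
    name: str
    definition: str = ""                     # business glossary definition
    source_system: str = ""                  # where the data originated
    owner: str = ""
    steward: str = ""
    quality_score: Optional[float] = None    # 0.0 to 1.0, however you score it
    upstream: List[str] = field(default_factory=list)  # lineage: direct inputs


def missing_metadata(entry: CatalogEntry) -> List[str]:
    """List the governance fields that are still blank for this entry."""
    gaps = [
        name
        for name in ("definition", "source_system", "owner", "steward")
        if not getattr(entry, name)
    ]
    if entry.quality_score is None:
        gaps.append("quality_score")
    return gaps


def origins(name: str, catalog: Dict[str, CatalogEntry]) -> List[str]:
    """Walk lineage upstream and return the root data sets that feed `name`."""
    seen, roots, stack = set(), [], [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = catalog[current].upstream if current in catalog else []
        if not parents:
            roots.append(current)
        stack.extend(parents)
    return sorted(roots)


# Made-up sample catalog: one derived report built from two source data sets.
catalog = {
    "crm.accounts": CatalogEntry(
        "crm.accounts", "Active customer accounts", "CRM",
        owner="sales-ops", steward="j.doe", quality_score=0.97,
    ),
    "web.clicks": CatalogEntry(
        "web.clicks", "Raw clickstream events", "web logs",
    ),
    "churn_report": CatalogEntry(
        "churn_report", "Monthly churn by segment", "derived",
        owner="analytics", steward="a.lee", quality_score=0.80,
        upstream=["crm.accounts", "web.clicks"],
    ),
}

print(origins("churn_report", catalog))         # where the report's data came from
print(missing_metadata(catalog["web.clicks"]))  # gaps that would erode trust
```

A report built on a source with no named owner or quality score is the kind of gap Washington warns about: the lineage walk shows users where the data came from, and the metadata check shows what they still can’t verify about it.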
3. Reconsider Self-Serve Analytics
One of the hallmarks of the big data analytics industry has been the desire to push data and analytics out to as many people as possible. The more people within an organization who have access to data and self-service analytics, the better, the thinking goes.
While self-service analytics is an admirable long-term goal, it has caused some data governance challenges in the near-term, McReynolds says.
“There are two different angles to getting your data figured out,” she tells Datanami. “There’s how you organize the raw source data. The second angle, which is related but maybe not as obvious, is what’s happening on the consumption side, and what a mess a self-service analytics has actually created in the short term.”
Organizations are realizing competitive advantages by putting data in front of more employees, but it’s taking a toll, McReynolds says. “It was great that we allowed so many individuals in the organizations to have access to data and we encouraged them to make data-driven decisions,” she says. “But the implementation of Tableau and Qlik and some of the more modern self-service analytics tools have also created a little bit of a hairball of insights that don’t link cleanly to that raw source data.”
Implementing a data governance strategy can help smooth out some of the rough edges created on both fronts. “On both of those ends — the final mile and the first mile — we have data governance challenges to address this year.”
4. GDPR Is Just the Start
There are 113 days until the General Data Protection Regulation (GDPR) goes into effect. If you haven’t started a remediation program yet, the odds of finishing by May 25 are slim to none. But don’t fret, says Felix Van de Maele, the CEO of Collibra, which makes software for governing and cataloging data.
“May 25 is the deadline for GDPR, but it’s not the end. It’s really the start,” Van de Maele says. “I think what you’ll see is there’s going to be a couple of examples that regulators will want to set to make sure that companies understand that it is important and they expect them to be compliant.”
That doesn’t mean you should delay thinking about data governance until European data protection regulators fine Facebook or Google (or whoever it might be). Instead, customers should view GDPR compliance and data governance as an ongoing challenge that will change and evolve over time.
“I think what we’ll see after the [GDPR] deadline is another phase of adoption of technology to help make those programs more sustainable,” Alation’s McReynolds says. “The focus will not just be on inventorying data…but being able to tag and manage that data as the requests come in from consumers to pull their own personal consumer data out of a training algorithm or a live algorithm. We’ll start to see systems put in place to manage to those requirements over time.”
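As a rough illustration of the tagging McReynolds describes, the Python sketch below marks which data sets in an inventory hold personal data and which of those feed a model, then works out what a consumer’s removal request would touch. The tag name, fields, and sample inventory are assumptions made for the example; it is not a compliance tool, and it does not reflect how any vendor’s product works.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

PERSONAL = "personal_data"  # hypothetical tag name


@dataclass(frozen=True)
class TaggedDataset:
    """An inventoried data set with governance tags. Fields are hypothetical."""
    name: str
    tags: FrozenSet[str] = frozenset()
    feeds_model: bool = False  # used by a training or live algorithm


def erasure_plan(inventory: List[TaggedDataset]) -> Dict[str, str]:
    """Map each data set holding personal data to the follow-up it needs."""
    plan = {}
    for dataset in inventory:
        if PERSONAL not in dataset.tags:
            continue  # no personal data tagged here, nothing to remove
        if dataset.feeds_model:
            plan[dataset.name] = "remove records and review dependent models"
        else:
            plan[dataset.name] = "remove records"
    return plan


# Made-up inventory for illustration.
inventory = [
    TaggedDataset("crm.accounts", frozenset({PERSONAL})),
    TaggedDataset("reco.training_set", frozenset({PERSONAL}), feeds_model=True),
    TaggedDataset("finance.gl_totals", frozenset({"financial"})),
]

for name, action in erasure_plan(inventory).items():
    print(f"{name}: {action}")
```

The point of the sketch is that an inventory alone is not enough: without the tags, there is no way to answer a removal request short of searching every data set by hand.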
5. Empower Your CDO
One of the most overlooked aspects of succeeding with a big data governance strategy is having the right personnel in place. While there’s no single combination of players that automatically leads to success, the odds will be stacked in your favor if you have somebody in the chief data officer (CDO) role.
“Ideally, there should be a CDO to oversee and facilitate the execution of a data governance program to ensure executive sponsorship,” Infogix’s Washington says. “Other direct participants should include executive leadership, project management, data stewards (which may reside in IT or within a more federated model across lines of business and functions) and subject matter experts. But given the objectives of data governance, all roles within the organization have a part to play in its implementation success. The key to data governance is to ensure collaboration across the enterprise.”
The CDO will be responsible for balancing the competing priorities that modern data analytic teams will eventually have to deal with – specifically, making sure that data analytics aren’t too locked down with overly regimented processes on the one hand, and that they don’t proliferate too wildly without enough processes on the other.
“I think the CDO is the right role for this to sit under,” McReynolds says. “There’s a balance that you want to get between top down enforcement and bottoms up grassroots usage of the data. If you move that responsibility outside of the CDO, I think that you risk moving too aggressively in one direction, or the other. Locked-down analytic environments don’t lend themselves well to finding new algorithms and new strategic analytic insights.”
There are obviously many more aspects to succeeding in data governance than we covered here, but hopefully these insights give you something to think about as you evolve your data governance strategy.