Data Governance: Simple and Practical
In 2022, the global data governance market grew to $3.2 billion, a figure that’s dwarfed by the significant cost to operate data governance programs. In spite of the spend, many programs fail to generate high business value because they lack practicality and simplicity: the art of maximizing the amount of work not done.
Data governance programs often get off track right before they get on, by defining the charter and roadmap too broadly. The key question is which elements of a data governance program you should chase out of the gate: ownership and stewardship? Committees and councils? Policies and standards? Data quality? Metadata or master data management? Data security and access? Issue management? Technology?
Choose the single element that is fundamental to fulfilling the primary purpose of the program. To do this, answer two questions: which outcomes does the business need, and which of them are the most urgent? From there, codify the primary purpose in your data governance charter and establish its pursuit as the first leg of your roadmap.
Focus on the Purpose and Postpone the Rest
Purpose-driven data governance programs narrow their focus to deliver urgent business needs and defer much of the rest, with a couple of caveats. First, data governance programs are doomed to fail without senior executive buy-in and continuous engagement of key stakeholders. Without them, no purpose can be fulfilled. Second, data governance programs must identify and gain commitment from relevant (but perhaps not all) data owners and stewards, but that doesn’t necessarily mean roles and responsibilities need to be fully fleshed out right away.
Identify the primary purpose then focus on it – sounds like a simple formula, but it’s not obvious. Many data governance leaders are quick to define and pursue their three practices or five pillars or seven elements, and why shouldn’t they? They need those capabilities, but wanting it all comes at the sacrifice of getting it now. Generate business value with your primary purpose before expanding.
I’ve consulted with multiple customers whose different symptoms trace back to the same blight: a tech company that couldn’t get the trust in trusted data, an insurer that struggled to improve data quality, and a multinational financial services firm that couldn’t remain compliant.
Each took a big-bang approach to data governance instead of a simple, practical one.
A tech company described their struggles to me this way: in quarterly business reviews (QBRs), cross-functional leaders couldn’t agree on the numbers, what the metrics meant, where to source them, or how to interpret them. The primary purpose of their data governance program needed to be: to improve trust in the data. Their pathway looked like:
- Identify the critical data elements in dispute
- Negotiate and agree to data definitions; codify them in business glossaries
- Make those data elements available in source of truth datasets (if you show up to a QBR with a different number, you’re wrong)
- Repipe widely used dashboards to the source of truth
Simplicity. The data governance elements missing from this pathway were everything that didn’t swiftly move the needle on trusted data. What use are policies without trust? Why improve the quality of data that no one understands?
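The "different number" rule in the pathway above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glossary shape, the metric name `active_customers`, the team names, the values, and the function `find_disputed_numbers` are all hypothetical and are not drawn from the tech company's actual implementation.

```python
def find_disputed_numbers(glossary, reported, source_of_truth, tolerance=0.0):
    """Return, per governed metric, the teams whose number differs from the source of truth.

    glossary: {metric: agreed definition} from the business glossary
    reported: {team: {metric: value}} as presented in the review
    source_of_truth: {metric: value} read from the agreed dataset
    """
    disputed = {}
    for team, metrics in reported.items():
        for metric, value in metrics.items():
            if metric not in glossary or metric not in source_of_truth:
                continue  # not yet a governed data element; defer it
            expected = source_of_truth[metric]
            if abs(value - expected) > tolerance:
                disputed.setdefault(metric, []).append((team, value, expected))
    return disputed


# Illustrative values only (hypothetical metric, teams, and numbers).
glossary = {
    "active_customers": "Customers with at least one paid transaction in the quarter",
}
reported = {
    "sales": {"active_customers": 1200},
    "finance": {"active_customers": 1150},
}
source_of_truth = {"active_customers": 1150}

disputed = find_disputed_numbers(glossary, reported, source_of_truth)
```

Keeping the glossary as the gate means only agreed data elements are checked, which mirrors the order of the pathway: definitions first, then the source of truth, then the dashboards.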
An insurer explained to me that their dashboards weren’t always refreshed, and when they were, wide fluctuations in values made it impossible to make informed decisions. Their primary purpose needed to be: to improve the quality of the data. Their pathway looked like:
- Identify unreliable dashboards and critical data elements
- Uncover the most adversely impacting data quality dimensions, for instance:
  - Data Timeliness – Frequently failing pipelines without service-level agreements (SLAs) mean users can’t gauge data staleness
  - Data Accuracy – Improper tooling to exchange data over a hybrid multi-cloud environment leads to data interpretation inconsistency
- Benchmark data quality measurements; put profiling and dashboarding in place
- Begin the search for the right hybrid multi-cloud technology vendor
- Improve data quality: define SLAs, reschedule pipelines, standardize data assets
Simplicity. Missing from the pathway were committees and councils, which might improve quality, but not as directly or quickly as attacking the root cause head on.
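To make the timeliness dimension in the pathway above concrete, here is a minimal Python sketch of a staleness check against per-dataset SLAs. The dataset names, SLA durations, timestamps, and the function `check_timeliness` are assumptions made for illustration; the insurer's actual tooling is not described in this article.

```python
from datetime import datetime, timedelta, timezone


def check_timeliness(last_refreshed, slas, now):
    """Return datasets that breach their freshness SLA.

    last_refreshed: {dataset: datetime of last successful refresh}
    slas: {dataset: maximum allowed age as a timedelta}
    Result maps each breaching dataset to its age, or to None when
    no successful refresh has been recorded at all.
    """
    breaches = {}
    for dataset, max_age in slas.items():
        refreshed_at = last_refreshed.get(dataset)
        if refreshed_at is None:
            breaches[dataset] = None  # never refreshed: staleness unknown
            continue
        age = now - refreshed_at
        if age > max_age:
            breaches[dataset] = age
    return breaches


# Illustrative values only (hypothetical datasets and SLAs).
now = datetime(2023, 12, 7, 12, 0, tzinfo=timezone.utc)
last_refreshed = {
    "claims_daily": now - timedelta(hours=30),
    "policies_hourly": now - timedelta(minutes=30),
}
slas = {
    "claims_daily": timedelta(hours=24),
    "policies_hourly": timedelta(hours=1),
    "quotes_daily": timedelta(hours=24),
}

breaches = check_timeliness(last_refreshed, slas, now)
```

Once an SLA is written down, staleness becomes something a dashboard can display rather than something users have to guess at, which is the gap the timeliness bullet describes.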
For a financial services firm, new regulations were inconsistently interpreted and wastefully implemented across the globe. Their primary purpose needed to be: to become truly compliant. Their pathway looked like:
- Establish data governance committees, councils, and working groups with clear responsibilities and accountabilities
- Document policies and standards around compliance definitions and implementation methods
- Implement and enforce
About the author: Shayde Christian is the Chief Data & Analytics Officer for Cloudera, where he guides data-driven cultural change to generate maximum value from data. He enables Cloudera customers to get the absolute best from their Cloudera products such that they can generate high value use cases for competitive advantage. Previously a principal consultant, Shayde formulated data strategy for Fortune 500 clients and designed, constructed, or turned around failing enterprise information management organizations. Shayde enjoys laughter and is often the cause of it.