
Getting to the Heart of Governance for Today’s Data-Driven Business

(spritekiku/Shutterstock)
The phrase “big data” is probably the understatement of the century. Data today isn’t just big, it’s overwhelming. Studies show that 90% of the world’s data was created in just the past two years. If your data seems big today, wait until your organization starts drinking from the Internet of Things firehose. The flood of information keeps growing.
As the flood grows, so do the challenges of keeping this vast amount of data timely, accurate and trustworthy. Organizations can’t afford to lose control over, or access to, reliable data. This is particularly critical given the competitive pressure of meeting customer expectations and looming regulatory and legislative deadlines.
Still, most businesses approach data governance from the wrong angle. Too often, the conversation revolves around technical, system-oriented challenges and procedures instead of the business case. And it’s no surprise when IT saves the day by implementing data warehousing and data management tools that provide some metadata and technical data lineage capabilities. In reality, these tools are just quick (and limited) fixes that address only the organization’s immediate needs. If the business wants to be data-driven, what it really needs is a business capability for making sense of data.
The current fragmented approach involves integrating systems and moving data based on requirements and analysis of sources and targets, as opposed to establishing rules, standards and policies around how business users within different departments will use the data. In addition, IT typically records its findings and designs in a flurry of archaic paper documents that detail how the data will be moved and how frequently (daily vs. hourly vs. real time), which quality thresholds must be respected, which rules need to be checked, and more. After analysis and design, someone in IT builds the code, and the solution is tested before it goes into production. At each of these points the organization knows exactly where the data came from and how it moves between systems.
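For illustration, here is a minimal sketch of the facts one of those design documents typically captures, expressed as a structured, machine-readable record instead of paper. The systems, thresholds and rules are hypothetical examples, not a standard format.

```python
# An illustrative mapping specification: the same facts IT traditionally
# buries in paper design documents, captured as structured data.
# All names, thresholds and rules here are hypothetical examples.
mapping_spec = {
    "source": {"system": "crm_db", "table": "customers"},
    "target": {"system": "warehouse", "table": "dim_customer"},
    "frequency": "daily",            # daily vs. hourly vs. real time
    "quality_thresholds": {
        "completeness_pct": 99.5,    # share of non-null key fields
        "max_duplicate_rows": 0,
    },
    "rules": [
        "email must match a valid address pattern",
        "country codes conform to ISO 3166-1 alpha-2",
    ],
}
```

Captured this way, the same design knowledge that would otherwise sit in a filing cabinet can be queried, versioned and checked automatically.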
But what happens later, when there are new requests for specific information, such as a bank or healthcare organization scrambling to meet a new regulatory deadline? The staff who worked on the original project may have moved on, or the documentation of the design is misplaced, or worse, completely missing. Trying to go back into your business’s multiple databases and reconstruct where information came from and whose hands have touched it after it was created is a time-consuming, expensive and imperfect process. It’s as if your basement has flooded, and now you’re down there with a mop trying to clean up the mess. Throwing more IT staff and technology (essentially a bigger mop) at a data cleanup project isn’t going to help.

(Tashatuvango/Shutterstock)
The better approach is to proactively stop the water from flooding the basement in the first place. This can be accomplished by putting an automated, systematic control process in place right from the start, or by formalizing the process already in place so that the business case is integrated with IT’s work.
This much-needed, technology-enabled discipline takes an enterprise-wide, sophisticated and systematic approach to handling data, involving many user groups across the organization to ensure the availability, usability, integrity and security of the data. That is critical for businesses leveraging big data as an asset and staying in line with regulatory compliance demands. Most important, data governance creates an agreed-upon, collaborative and executable framework, or operating model, for determining enterprise-wide policies, business rules and assets, for the data governance team (including the chief data officer, stewardship committee and working groups) and business users to follow.
Adopting an operating model of policies and standards for governing data goes beyond IT’s paper-based trail of data movement between systems and, at a much larger scale, helps determine data inventory, data ownership, critical data elements (CDEs), data quality, information security, data lineage and data retention. It’s through this operating model, including data lineage, that critical insights can be gained: any user, at any moment, can see where each piece of data has come from, which users have interacted with it, where it’s going, and what other databases it will feed into.
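To make that concrete, here is a minimal sketch of the kind of traversal such a lineage view rests on, modeling lineage as a simple directed graph and walking it upstream to answer “where did this data come from?” The system and table names are invented for the example; a real governance platform would persist these relationships in a metadata repository.

```python
from collections import defaultdict

# Hypothetical lineage edges: data flows from the first asset to the second.
flows = [
    ("crm_db.customers", "warehouse.dim_customer"),
    ("erp.accounts", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "reporting.revenue_dashboard"),
]

# Index the edges by target so we can walk the graph backwards.
upstream = defaultdict(set)
for src, dst in flows:
    upstream[dst].add(src)

def trace_origins(asset: str) -> set[str]:
    """Return every upstream asset that feeds the given one."""
    origins: set[str] = set()
    stack = [asset]
    while stack:
        for parent in upstream[stack.pop()]:
            if parent not in origins:
                origins.add(parent)
                stack.append(parent)
    return origins

# Prints the dashboard's full upstream chain: the warehouse table
# plus both source systems it draws from.
print(trace_origins("reporting.revenue_dashboard"))
```

The same graph, walked in the other direction, answers the impact question (“what will break downstream if this source changes?”).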
Here are five important points to keep in mind when putting your data governance policies in place to ensure significant impact from your initiative:
- Technical metadata (i.e., columns, tables, processes, repositories) alone is not enough to help a DBA or data steward understand and model the data in a way that allows efficient data management. A semantic layer needs to be built on top of the metadata to give the data meaning, enabling proper data modeling and better data performance (see the semantic-layer sketch after this list). This is why it’s essential for data governance to be an integrated process, with the business side of the organization working hand-in-hand with IT.
- Lineage of data from source to target systems, along with the transformations applied and links to business metadata such as business term definitions and rules, is critical for data stewards and for technical purposes. However, traceability, a 360-degree view of data assets, is essential for business users looking to answer questions like “Where does my data come from? What policies were used? What standards are applied?” For instance, policy managers will want to see the impact of their security policy on the different data domains, ideally before they enforce the policy; analysts want a high-level overview of where the data comes from, which systems it passed through and which rules were applied; and an auditor might want to trace a data issue to the impacted systems and business processes. Traceability is essential to get the more insightful answers that straight lineage alone can’t provide.
- Use true enabling artifacts such as mapping specifications and data sharing agreements to proactively drive the process. By driving the movement of data from business needs, you create transparency and control. SLAs included in the data sharing agreements establish clear ownership and accountability between data producers and consumers, which is a cornerstone of trust and agility.
- Create system sensors: control points that scan the source and target systems for changes and automatically notify data stewards when an issue is identified (see the sensor sketch after this list). Alerting the data team to any changes made to a system gives data stewards and others in the organization time to react, either by adjusting the rules and standards for governing the data or by preventing a larger data performance issue from occurring. The business can now deal with data exceptions, rather than having to deal with exceptions as the business.
- Implement a data governance platform that not only integrates smoothly with the surrounding landscape of tools and techniques, but is also scalable and adaptable enough to quickly meet the evolving needs of the business. This will help reduce operational costs and ensure a laser focus on data quality, while eliminating the need to rely on IT to scramble for answers to data requests and exceptions.
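As a rough illustration of the first point above, the sketch below layers business meaning on top of bare technical metadata. Everything here, the column names, terms and owners, is a hypothetical example, not any particular product’s model.

```python
# Technical metadata alone: types and constraints, but no business meaning.
technical_metadata = {
    "warehouse.dim_customer.cust_email": {"type": "varchar(255)", "nullable": False},
}

# A semantic layer attaches business terms, definitions and ownership
# to otherwise opaque column names. All entries are illustrative.
semantic_layer = {
    "warehouse.dim_customer.cust_email": {
        "business_term": "Customer Email",
        "definition": "Primary email address used to contact a customer.",
        "data_owner": "Marketing",
        "critical_data_element": True,
    },
}

def describe(column: str) -> str:
    """Combine both layers into an answer a business user can read."""
    tech = technical_metadata[column]
    sem = semantic_layer.get(column, {})
    return (f"{sem.get('business_term', column)} ({tech['type']}): "
            f"{sem.get('definition', 'no business definition yet')}")

print(describe("warehouse.dim_customer.cust_email"))
```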
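And for the sensor point, here is a minimal sketch of a change detector, assuming a simple fingerprint-and-compare approach; the schema, the fingerprinting choice and the notification hook are all assumptions made for illustration.

```python
import hashlib
import json

def schema_fingerprint(columns: dict) -> str:
    """Hash a system's schema so changes are cheap to detect."""
    return hashlib.sha256(json.dumps(columns, sort_keys=True).encode()).hexdigest()

# Baseline captured when the governance process was put in place.
baseline = schema_fingerprint({"cust_email": "varchar(255)", "country": "char(2)"})

def check_for_changes(current_columns: dict, notify) -> None:
    """Compare the live schema against the baseline and alert stewards."""
    if schema_fingerprint(current_columns) != baseline:
        notify("Schema drift detected on warehouse.dim_customer: "
               "review the governing rules and standards.")

# In a real deployment, notify would post to the stewards' work queue or
# ticketing system; here it just prints. The widened email column trips it.
check_for_changes({"cust_email": "varchar(320)", "country": "char(2)"}, print)
```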
Data governance isn’t only about risk management. It’s about getting to the heart of your data and making it easier for everyone in the organization to use and trust the data for business advantage. A good data governance system will not only proactively prevent problems, but will make it easier for users throughout your company to look at your data in a more intuitive, understandable way. Data governance is a framework for setting data-usage policies and implementing controls designed to ensure that information remains accurate, consistent and accessible in a timely manner, so your company is in the driver’s seat to capitalize on the many opportunities made available through big data.
About the author: Stan Christiaens is Co-founder and Chief Technology Officer of data governance software developer Collibra.
Related Items:
Bridging the Trust Gap in Data Analytics
This Catalog Recommends Data with Machine Learning
Why Integration and Governance Are Critical for Data Lake Success