Getting to the Heart of Governance for Today’s Data-Driven Business
The phrase “big data” is probably the understatement of the century. Data today isn’t just big; it’s overwhelming. Studies suggest that 90% of the world’s data was created in just the past two years. And if your data seems big today, wait until your organization starts drinking from the Internet of Things firehose. The flood of information keeps growing.
The challenges of keeping this vast amount of data timely, accurate and trustworthy are growing just as fast. Organizations can’t afford to lose control over, or access to, reliable data. This is particularly critical given the competitive pressure of meeting customer expectations and looming regulatory and legislative deadlines.
Still, most businesses approach data governance from the wrong angle. Too often the conversation revolves around technical, system-oriented challenges and procedures instead of the business case. No surprise, then, that IT saves the day by implementing data warehousing and data management tools that provide some metadata and technical data lineage capabilities. In reality, these tools are just quick (and limited) fixes that address only the organization’s immediate needs. A business that wants to be data-driven needs a business capability to make sense of its data.
The current fragmented approach involves integrating systems and moving data based on requirements and analysis of sources and targets, rather than establishing rules, standards and policies around how business users in different departments will use the data. In addition, IT typically records its findings and designs in a flurry of static documents that detail how the data will be moved and how frequently (daily vs. hourly vs. real time), which quality thresholds must be respected, which rules must be checked, and more. After analysis and design, someone in IT builds the code, and the solution is tested before it goes into production. At each of these points the organization knows exactly where the data came from and how it moves between systems.
But what happens later, when there are new requests for specific information, such as a bank or healthcare organization scrambling to meet a new regulatory deadline? The staff who worked on the original project may have moved on, or the documentation of the design is misplaced, or worse, completely missing. Trying to go back into your business’s multiple databases and reconstruct where information came from and whose hands have touched it since it was created is a time-consuming, expensive, and imperfect process. It’s as if your basement has flooded, and now you’re down there with a mop trying to clean up the mess. Throwing more IT staff and technology, essentially a bigger mop, at a data cleanup project isn’t going to help.
The better approach is to proactively stop the water from flooding the basement in the first place. This can be accomplished by putting an automated and systematic control process in place right from the start, or formalizing the process already in place by integrating the business case with IT interaction.
This much-needed, technology-enabled approach to data governance handles data in an enterprise-wide, sophisticated and systematic way, involving many user groups across the organization to ensure the availability, usability, integrity and security of the data. That is critical for businesses that want to leverage big data as an asset while staying in line with regulatory compliance demands. Most important, data governance creates an agreed-upon, collaborative and executable framework, or operating model, for determining the enterprise-wide policies, business rules and assets that the data governance team (including the chief data officer, stewardship committee and working groups) and business users will follow.
Adopting an operating model of policies and standards for governing data goes beyond IT’s paper trail of data movement between systems, and at a much larger scale helps determine data inventory, data ownership, critical data elements (CDEs), data quality, information security, data lineage and data retention. It’s through this operating model, including data lineage, that critical insights can be gained: any user, at any moment, can see where each piece of data came from, which users have interacted with it, where it’s going, and which other databases it will feed into.
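The lineage-driven traceability described above can be sketched in a few lines of code. The sketch below is a hypothetical illustration, not any particular platform’s API: data movements are recorded as edges in a graph, and any asset can then be traced upstream to its sources or downstream to the systems it feeds. The system and table names are invented for the example.

```python
from collections import defaultdict

class LineageGraph:
    """Toy lineage registry: record how data moves between systems,
    then trace any asset upstream (origins) or downstream (consumers)."""

    def __init__(self):
        self.downstream = defaultdict(set)  # source -> targets it feeds
        self.upstream = defaultdict(set)    # target -> sources it depends on

    def record_movement(self, source, target):
        self.downstream[source].add(target)
        self.upstream[target].add(source)

    def trace(self, asset, direction="upstream"):
        """Return every asset reachable from `asset` in the given direction."""
        edges = self.upstream if direction == "upstream" else self.downstream
        seen, stack = set(), [asset]
        while stack:
            for nxt in edges[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

graph = LineageGraph()
graph.record_movement("crm.customers", "warehouse.dim_customer")
graph.record_movement("erp.orders", "warehouse.fact_orders")
graph.record_movement("warehouse.dim_customer", "reporting.churn_dashboard")

# Where did the churn dashboard's data come from?
print(graph.trace("reporting.churn_dashboard", "upstream"))
```

A real governance platform layers much more onto this graph (owners, policies, quality rules per node), but the core traceability question, “where did this come from and what does it feed?”, is a reachability query like the one above.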
Here are five important points to keep in mind when putting your data governance policies in place to ensure significant impact from your initiative:
- Technical metadata (i.e., columns, tables, processes, repositories) alone is not enough to help a DBA or data steward understand and model the data in a way that allows efficient data management. A semantic layer needs to be built on top of the metadata to give the data meaning, enabling proper data modeling and better data performance. This is why it’s essential for data governance to be an integrated process, with the business side of the organization working hand-in-hand with IT.
- Lineage of data from source to target systems, along with transformations and links to business metadata such as business term definitions and rules, is critical for data stewards and for technical purposes. But traceability, a 360-degree view of data assets, is what business users need to answer questions like “Where does my data come from? What policies were used? What standards are applied?” For instance, policy managers will want to see the impact of a security policy on different data domains, ideally before they enforce it; analysts want a high-level overview of where the data comes from, which systems it passed through and which rules were applied; and an auditor might want to trace a data issue to the impacted systems and business processes. Traceability is essential to get the more insightful answers that straight lineage alone can’t provide.
- Use true enabling artifacts such as mapping specifications and data sharing agreements to proactively drive the process. By driving the movement of data from business needs, you create transparency and control. SLAs included in the data sharing agreements establish clear ownership and accountability between data producers and consumers, which is a cornerstone of trust and agility.
- Create system sensors: control points that scan source and target systems for changes and automatically notify data stewards when an issue is identified. Alerting the data team to any change made to a system gives data stewards and others in the organization time to react, either by adjusting the rules and standards for governing the data or by heading off a larger data performance issue. The business can now deal with data exceptions, rather than dealing with exceptions as the business.
- Implement a data governance platform that not only smoothly integrates the landscape of surrounding tools and techniques, but is scalable and adaptable to quickly meet the evolving needs of a business. This will help reduce operational costs and ensure a laser focus on data quality, while also eliminating the need to rely on IT to scramble for answers when it comes to data requests and exceptions.
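The “system sensor” point above describes a change-detection mechanism that can be prototyped very simply. The following is a minimal sketch, assuming schemas are captured as table-to-column mappings; the function names and the notification step are illustrative assumptions, not any product’s API. A real sensor would read the baseline and current schemas from the systems’ catalogs and route alerts to a ticketing or messaging system.

```python
# Hypothetical "system sensor": compare a source system's current schema
# against a recorded baseline and alert data stewards when it drifts.

def detect_schema_drift(baseline: dict, current: dict) -> list:
    """Return human-readable alerts for added and removed columns.

    Both arguments map table names to lists of column names."""
    alerts = []
    for table in baseline.keys() | current.keys():
        old_cols = set(baseline.get(table, []))
        new_cols = set(current.get(table, []))
        for col in sorted(new_cols - old_cols):
            alerts.append(f"{table}: column '{col}' was added")
        for col in sorted(old_cols - new_cols):
            alerts.append(f"{table}: column '{col}' was removed")
    return alerts

def notify_stewards(alerts):
    # Stand-in for a real notification channel (email, chat, ticket).
    for alert in alerts:
        print(f"[steward alert] {alert}")

baseline = {"customers": ["id", "name", "email"]}
current = {"customers": ["id", "name", "email_address"]}
notify_stewards(detect_schema_drift(baseline, current))
```

The value of a sensor like this is timing: stewards learn about a renamed or dropped column when it happens, not weeks later when a downstream report breaks.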
Data governance isn’t only about risk management. It’s about getting to the heart of your data and making it easier for everyone in the organization to use and trust the data for business advantage. A good data governance system will not only proactively prevent problems, but will make it easier for users throughout your company to look at your data in a more intuitive, understandable way. Data governance is a framework for setting data-usage policies and implementing controls designed to keep information accurate, consistent and accessible in a timely manner, so your company is in the driver’s seat to capitalize on the many opportunities big data makes available.
About the author: Stan Christiaens is Co-founder and Chief Technology Officer of data governance software developer Collibra.