The Enlightening Side of GDPR Compliance
If the upcoming General Data Protection Regulation (GDPR) is like most government mandates, the thinking goes, it will be a tax on business and a hindrance to productivity. But if you take the right approach, some experts say, the GDPR could actually be a boon to analytics.
By now you’re probably aware that in just 38 days, the European Union begins enforcing GDPR, which was passed two years ago to harmonize a hodge-podge of privacy laws across Europe. The new regulation was designed to, among other things, give EU citizens control over their personal data by imposing stiff penalties for any organizations that use their personal data without consent. It also requires organizations to take certain measures to protect citizens’ personal data, which will (hopefully) reduce the severity and impact of major data breaches.
The law protects the data of 740 million Europeans, more than 10% of the Earth’s population. In other words, when it comes to big global corporations that sell goods and services around the planet, the days of playing fast and loose with people’s data are over. While we don’t have an American version of the GDPR (at least not yet), essentially the jig is up for the Wild West days of “anything goes” in the analytics arena.
In that regard, the GDPR is a significant wakeup call for organizations to mature their data management and analytic practices – and to do so quickly. While no corporate leader likes being ordered how to run her company, the prospect of losing up to 4% of annual revenue per GDPR violation provides a compelling incentive to accelerate data management initiatives.
The new regulation is also a boon for software companies that have experience building the types of consent-management systems that GDPR requires. The folks at MarkLogic, which develops a multi-model database management system, have been busy with GDPR remediation engagements and have noticed some patterns emerging from the work.
David Gorbet, SVP Engineering for MarkLogic, says GDPR finally caught up with people. “For a long time, they were unprepared,” he tells Datanami. “They didn’t understand how it impacted them. They thought it was a Europe-only thing. A lot of people didn’t know how to get started.”
GDPR represents a very complex set of requirements that are vastly different from other regulations that impact the collection, storage, and use of data, such as HIPAA or PCI DSS, according to Gorbet: You can’t prove compliance by generating the correct reports. “It’s not a reporting regulation,” he says. “It’s an operationalization regulation. It’s about how you change your operation in order to be compliant with GDPR.”
That makes things tougher, to say the least. Being GDPR compliant requires an organization to understand where all of its data resides and what it is used for, Gorbet says. It then needs to relate that data to potentially millions of individuals and create consent management systems to control access to it. “That’s a lot trickier than a typical compliance operation,” he says.
Over the years, MarkLogic’s professional services arm has developed many of these types of consent management systems. Its software manages consents for big publishing houses that need to understand which customers are legally entitled to specific songs and movies. The system helps answer questions like: Is this iTunes consumer entitled to play this song? Can the director’s commentary for this boxed set of Steven Spielberg movies be distributed in this particular region?
Gorbet points out that consent management was also a big part of MarkLogic’s involvement in the development of Healthcare.gov, the much-maligned Obamacare website. Specifically, the MarkLogic database was used to create a Data Services Hub that reached out to databases hosted by the IRS, the Social Security Administration, and others to see whether or not a given citizen is eligible for a specific insurance program.
This is the sort of consent system that every large organization will ultimately need to comply with GDPR, Gorbet says, but building it is not easy. In some cases, customers will move all the relevant data into MarkLogic’s NoSQL database, which features a flexible schema. In other cases, a MarkLogic system will create a virtual layer controlling access to data that lives in other physical data stores, which could include relational databases, Hadoop file stores, object file systems, and mainframe flat files.
Keeping track of which uses of their data people have consented to, and which they have not, is also a big challenge. “With GDPR, organizations can ask for consent at whatever level they want… and GDPR allows you to revoke them any time,” Gorbet says. “It’s not just a Boolean yes or no. It can be quite a complex thing.”
So for instance, a customer may give an organization the okay to use his email address to contact him about a certain product, concern, or event, but not allow the organization to use his phone number for the same purposes. This personally identifiable information (PII) is some of the most sensitive data that an organization stores on behalf of its customers, but it’s also some of the most valuable data, from an analytics perspective.
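To make the "not just a Boolean" point concrete, here is a minimal sketch of per-field, per-purpose consent with revocation. It is an illustration only, not MarkLogic's design: the names `ConsentRecord`, `ConsentRegistry`, and the identifiers `cust-1`, `email`, `phone`, and `product_updates` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class ConsentRecord:
    """One grant: a single data field may be used for a single purpose."""
    data_field: str                       # hypothetical, e.g. "email" or "phone"
    purpose: str                          # hypothetical, e.g. "product_updates"
    granted_at: datetime
    revoked_at: Optional[datetime] = None

    def is_active(self) -> bool:
        return self.revoked_at is None


class ConsentRegistry:
    """Tracks consent per (person, data field, purpose) instead of one yes/no flag."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str, str], ConsentRecord] = {}

    def grant(self, person_id: str, data_field: str, purpose: str) -> None:
        key = (person_id, data_field, purpose)
        self._records[key] = ConsentRecord(
            data_field, purpose, granted_at=datetime.now(timezone.utc)
        )

    def revoke(self, person_id: str, data_field: str, purpose: str) -> None:
        # Keep the record (with a revocation timestamp) so the history stays traceable.
        record = self._records.get((person_id, data_field, purpose))
        if record is not None and record.is_active():
            record.revoked_at = datetime.now(timezone.utc)

    def is_allowed(self, person_id: str, data_field: str, purpose: str) -> bool:
        # Anything never explicitly granted is denied by default.
        record = self._records.get((person_id, data_field, purpose))
        return record is not None and record.is_active()


registry = ConsentRegistry()
registry.grant("cust-1", "email", "product_updates")

email_ok = registry.is_allowed("cust-1", "email", "product_updates")
phone_ok = registry.is_allowed("cust-1", "phone", "product_updates")

registry.revoke("cust-1", "email", "product_updates")
email_ok_after_revoke = registry.is_allowed("cust-1", "email", "product_updates")
```

In this sketch the email address is usable for product updates until the customer revokes that grant, while the phone number is never usable because no consent was recorded for it. A production system would also need to persist these records and tie them to the data stores they govern.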
According to Gorbet, that level of complexity surrounding the individual permissions in a GDPR project actually jibes well with the capabilities of a graph or semantic database, which is one of MarkLogic’s modalities. “Semantic is a great way to traverse that graph, to figure out if a particular entity [is entitled to something],” he says. “Performance is faster. Traceability is better. But in order to apply your policy to the data, you may want to apply it through an ontology.”
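The graph-traversal idea can be sketched in a few lines. The example below stores consent as subject-predicate-object triples and walks them breadth-first to decide whether a person is entitled to a given use. It is a toy under stated assumptions, not MarkLogic's API or query language: the predicates `consentsTo` and `covers`, the tiny "ontology" of uses, and the function `is_entitled` are all made up for illustration.

```python
from collections import deque
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical data: a broad consent "covers" narrower uses, as an ontology might state.
TRIPLES: Set[Triple] = {
    ("cust-1", "consentsTo", "email_for_marketing"),
    ("email_for_marketing", "covers", "email_for_product_updates"),
    ("email_for_product_updates", "covers", "email_for_launch_invites"),
    ("cust-2", "consentsTo", "email_for_launch_invites"),
}


def is_entitled(triples: Set[Triple], person: str, use: str) -> bool:
    """Breadth-first search from the person's direct consents along 'covers' edges."""
    frontier = deque(
        o for s, p, o in triples if s == person and p == "consentsTo"
    )
    seen = set(frontier)
    while frontier:
        current = frontier.popleft()
        if current == use:
            return True
        for s, p, o in triples:
            if s == current and p == "covers" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return False


broad_covers_narrow = is_entitled(TRIPLES, "cust-1", "email_for_launch_invites")
narrow_does_not_cover_broad = is_entitled(TRIPLES, "cust-2", "email_for_marketing")
```

Here the first customer's broad marketing consent reaches the narrower launch-invite use through two `covers` edges, while the second customer's narrow consent does not extend upward to marketing in general. The path the search follows is also the kind of trace that would explain why a given use was permitted.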
GDPR is forcing organizations to get smart about how they store PII in order to protect the privacy of individuals. But that work also brings side benefits in the form of better organized data and a finer-grained view of their customers.
“Now for the first time ever, they can understand their exposure to a particular individual, which is a really big business benefit,” Gorbet says. “With a data hub they don’t have to [repeat the process]. Subsequent use cases are actually cheaper because the data is all in one place.”
The better organizations are at adhering to the new requirements under GDPR, the better organized their customer data will be, and the more analytical options will open up to them, Gorbet says.
“The smart companies are not seeing GDPR as a tax that they have to pay,” he says. “They’re seeing it as a reason to finally build that customer 360 view that they always wanted.”