Security Concerns Extend to ‘Big Data Life Cycle’
The “datafication” of modern life is now measured in the quintillions of bytes of data generated by humans each day. The struggle to store, manage and retrieve the waterfall of data has also highlighted a growing list of privacy and security issues.
Privacy concerns stemming from the rise of big data, and the resulting backlash against abuses such as those by unscrupulous data brokers, have produced a range of regulatory initiatives designed to safeguard sensitive information. Now, greater attention is being given to the security challenges raised by big data, along with security holes in open-source frameworks.
A white paper released by MIT Technology Review in collaboration with Oracle Corp. notes the growing number of “threat vectors” that expose big data to misuse. The white paper, Securing the Big Data Life Cycle, cautions that “organizations are exposing their sensitive information to increased risk as they integrate open-source Hadoop into their IT environments. For that reason, companies serious about using big data effectively need to make sure they’re doing so securely, protecting their valuable information and securing private data so that it stays private.”
The security flaws in Hadoop are well known. Apache Hadoop began as an open-source development project with little initial regard for security. As Hadoop’s security problems emerged, distributors and the Apache community began offering security add-ons for access control and authentication (Apache Knox), authorization (Apache Sentry) and encryption (Cloudera’s Project Rhino), along with security policy management and user monitoring (the proposed Apache Argus, based on Hortonworks’ XA Secure acquisition).
“Hadoop itself is very weak in security. You can be a Linux user and take all the data from Hadoop,” Manmeet Singh, co-founder and CEO of Dataguise, a provider of data masking and encryption tools for Hadoop, told Datanami last November. “The problem is the insider threat. Anybody can walk away with billions of credit card numbers.”
The Oracle-sponsored white paper argues that bulletproofing big data infrastructure requires a growing list of security controls, including:
- Authentication and authorization of users, applications and databases;
- Privileged user access and administration;
- Data encryption at rest and in motion;
- Data redaction and masking for both production and non-production environments;
- Separation of responsibilities and roles;
- Transport and application programming interface security; and, finally,
- Monitoring, auditing, alerting and reporting.
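To make the redaction and masking item concrete, here is a minimal Python sketch of masking card numbers in free text before the data is copied to a non-production environment. It is an illustration only and is not taken from the white paper or from any vendor's product. The names `CARD_PATTERN`, `luhn_valid` and `mask_card_numbers` are hypothetical, and the pattern assumes card numbers are 13 to 19 digits, optionally separated by spaces or hyphens.

```python
import re

# Assumption: a card number is 13 to 19 digits, optionally separated
# by single spaces or hyphens. Real-world detection is usually stricter.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_numbers(text: str, keep_last: int = 4) -> str:
    """Replace all but the last `keep_last` digits of likely card numbers."""

    def _mask(match: re.Match) -> str:
        raw = match.group(0)
        digits = re.sub(r"\D", "", raw)
        if not luhn_valid(digits):
            return raw  # leave digit runs that fail the checksum untouched
        return "*" * (len(digits) - keep_last) + digits[-keep_last:]

    return CARD_PATTERN.sub(_mask, text)


if __name__ == "__main__":
    print(mask_card_numbers("card=4111 1111 1111 1111 ok"))
```

The Luhn check is a design choice to reduce false positives on long digit runs such as order or tracking numbers. Masking of this kind complements, rather than replaces, the access control and encryption items in the list above.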
At the same time, agencies like the U.S. Federal Trade Commission are seeking greater authority from Congress to, for example, impose civil penalties against companies that fail to maintain “reasonable security.” Hence, warns Neil Mendelson, Oracle’s vice president for big data and advanced analytics, if companies “fail to secure the life cycles of their big data environments, they may face regulatory consequences, in addition to the significant brand damage that data breaches can cause.”
Oracle, for its part, is pitching an approach to big data security that it claims can remove barriers among Hadoop, NoSQL and relational databases, whether on-premises or in the cloud.