Big Data Security: Progress Is Made, But Is It Enough?
It’s taken a few years and more than a few major data breaches, but it would appear the tide is finally beginning to turn when it comes to awareness of the importance of protecting data stored in analytical and transactional systems alike. But don’t let your guard down yet, as major security threats—such as gaps in the Hadoop security stack, ransomware, and corporate bureaucracy—continue to threaten data sanctity.
It wasn’t that long ago that businesses didn’t even bother thinking about customer trust. Indeed, PricewaterhouseCoopers didn’t even track trust 20 years ago. But a rash of accounting frauds (Enron, WorldCom, et al.) in the early 2000s popped that bubble, and before long, hackers were weaseling their way into unsecured corporate databases to pilfer customer records on a regular basis.
In 2013, only 37% of CEOs worried that lack of trust in business would harm their company’s growth, according to PwC’s 2016 CEO Global Survey. By 2016, that number had jumped to 58%. And today, nearly nine out of 10 CEOs in the US are somewhat or extremely concerned about cyber threats, according to the report.
It’s clear that data security has finally landed on the corporate radar with a loud, sickening smack. But being aware of the threats doesn’t make them go away. In fact, the security situation may be getting worse thanks to the adoption of distributed big data platforms, like Hadoop.
Holey Hadoop, Batman!
While Hadoop continues to be the focus of many organizations’ big data analytic and IoT strategies, the distributed storage and computing platform also continues to bear bad news for those who value the security of data.
“The continuing growth of Hadoop as a platform for data analysis and, increasingly, for more operational data processing uses has created data security issues that are not being addressed,” Gartner analyst Merv Adrian wrote in a December research report titled Rethink and Extend Data Security Policies to Include Hadoop.
The report took the three major commercial Hadoop distributors to task for creating “three distinct competing stacks of security software.” What’s more, each of the stacks is immature, is not comprehensive, and appears destined to promote incompatibility and vendor lock-in, Adrian’s report says.
“Unlike DBMSs,” the respected analyst says, “Hadoop software stacks have not had built-in security capabilities and, because they increase utilization of file system-based data that is not otherwise protected, new vulnerabilities can emerge that compromise carefully crafted data security regimes.”
What’s more, the nature of Hadoop data lakes, where raw unstructured and semi-structured data of unknown quality is written and only structured when it’s read (an approach known as schema-on-read), raises other risks.
“Unlike DBMSs, which are typically used to store known data that conforms to predetermined policies about quality, ownership and standards, Hadoop creates the possibility of presenting users with ‘dark data,’” Adrian writes.
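To make the schema-on-read risk concrete, here is a minimal Python sketch. It is an illustration only, not anything from Gartner's report or from Hadoop itself: the `SCHEMA` fields, the `read_with_schema` helper, and the sample records are all hypothetical. The point is that nothing is rejected at write time, so malformed or unexpected data (including a stray sensitive field) sits in storage until someone happens to read it.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
SCHEMA = {"customer_id": int, "email": str}


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time and return (records, rejected_lines)."""
    records, rejected = [], []
    for line in raw_lines:
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)  # not parseable as JSON at all
            continue
        if isinstance(obj, dict) and all(
            isinstance(obj.get(name), typ) for name, typ in schema.items()
        ):
            # Keep only the fields the schema knows about.
            records.append({name: obj[name] for name in schema})
        else:
            rejected.append(line)  # parseable, but does not fit the schema
    return records, rejected


# All three lines were accepted at write time without any validation.
raw = [
    '{"customer_id": 1, "email": "a@example.com"}',
    '{"customer_id": "n/a", "ssn": "000-00-0000"}',
    "not json at all",
]
records, rejected = read_with_schema(raw, SCHEMA)
```

In this toy case the second line carries a field nobody planned for, and it only surfaces as a reject when a reader applies a schema. Until then it is exactly the kind of unknown content the analyst is describing.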
This gap in Hadoop security protections has created room for third-party vendors to operate and exploit. Informatica, which is attempting to pivot its long-standing dominance in ETL for data warehouses into the new distributed Hadoop world, is one of the vendors looking to make a splash.
“Despite all the time, effort and billions, maybe trillions, of dollars spent, security is not working,” says Amit Walia, executive vice president and chief product officer for Informatica. “Security breaches are still on the rise because most organizations are taking the wrong approach; they are focused on securing the end-points.”
Informatica says it has a better approach with a new release of its existing security product, Secure@Source. The Redwood City, California company, which unveiled the release Wednesday, says the software combines several important capabilities needed to protect data as it sits in Hadoop and legacy environments, including automated discovery of sensitive data, proliferation analysis, anomalous user activity detection, multi-factor data risk analytics, and automated orchestration of remediation.
Another vendor angling for a piece of the emerging Hadoop security pie, as defined by Gartner, is Dataguise.
“Gartner has again nailed the importance of broadening one’s perspective when it comes to Hadoop,” says JT Sison, VP of marketing and business development for the Fremont, California company. “As mentioned in the report, there are many threats to consider regardless of the data framework selected so it will be necessary for organizations to orchestrate a Hadoop security stack.”
Dataguise says its DGProtect offering can help companies detect, audit, protect, and monitor their sensitive data assets residing in Hadoop, in the cloud, and in other repositories, such as NoSQL databases like Apache Cassandra, Teradata warehouses, and even Microsoft SharePoint.
Meanwhile, a new report from Intel finds that corporate bureaucracy also poses a risk to effective remediation of the security threat posed by cybercriminals.
“…[C]ybercriminals have the advantage, thanks to the incentives for cybercrime creating a big business in a fluid and dynamic marketplace,” Intel’s McAfee subsidiary writes in the report, titled Tilting the Playing Field: How Misaligned Incentives Work Against Cybersecurity. “Defenders on the other hand, often operate in bureaucratic hierarchies, making them hard-pressed to keep up.”
About 75% of the 800 people Intel surveyed for the study ranked cybersecurity as the biggest risk to their organizations, followed by regulatory or liability risk and reputational risk. Despite that awareness, there are still some firms that don’t “get it.”
“It is clear that too many firms do not believe that the dangers of a breach will severely affect them,” Intel quotes Lloyd’s of London CEO Inga Beale as saying in the report.
Frank Blake, the CEO of Home Depot, which was the victim of a hack that netted data on 50 million cardholders in 2014, admits to being one of those. “There are assumptions we made and assessments of the nature of the threat that in retrospect weren’t sufficient,” he says in Intel’s report.
The basics of security haven’t changed. Having multiple rings of security and a data-centric security policy are still a good start for getting ahead of the security eight ball. Sensitive data should always be encrypted or masked on a server behind a firewall. Multi-factor authentication should be used to minimize access by unauthorized personnel. Use of powerful user profiles with root access should be minimized. Audits should be performed on a regular basis to detect unsecured end points.
These are some of the minimum standards of good security hygiene that should be enforced, but they may not be enough to counter emerging threats.
Emerging threats, like the current ransomware epidemic that uses a combination of clever social engineering and malware to encrypt victims’ hard drives, could thwart even those who check most of the other boxes on that security checklist.
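As one illustration of the "encrypted or masked" item on that checklist, the following Python sketch shows two common data-centric techniques using only the standard library. It is a minimal example under stated assumptions, not a vendor's method: the helper names `mask_card_number` and `pseudonymize` are hypothetical, and masking or keyed hashing is a complement to proper encryption and key management, not a replacement for either.

```python
import hashlib
import hmac


def mask_card_number(card_number: str) -> str:
    """Replace all but the last four digits with '*', keeping separators."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    seen = 0
    out = []
    for ch in card_number:
        if ch.isdigit():
            seen += 1
            # Only the final four digits stay readable.
            out.append(ch if seen > total_digits - 4 else "*")
        else:
            out.append(ch)
    return "".join(out)


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed HMAC-SHA256 token that stands in for the raw value.

    The same value and key always give the same token, so records can still
    be joined, but the original cannot be read back from the token.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


masked = mask_card_number("4111 1111 1111 1111")
token = pseudonymize("a@example.com", b"example-key-keep-in-a-secret-store")
```

Here `masked` keeps only the last four digits visible, which is usually enough for a support agent or an analyst. The key passed to `pseudonymize` is a placeholder; in practice it would live in a secrets manager rather than in source code.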
A surge in ransomware attacks targeting unsecured MongoDB and Elastic servers earlier this year now threatens to spill over into the Hadoop space. According to this January story on the threat geek blog, the SANS Internet Storm Center recorded a surge in port 50070 scans, as cybercriminals looked for open HDFS installations on the Internet.
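Administrators can run the same kind of check against their own hosts before an attacker does. The Python sketch below simply tests whether a TCP connection to port 50070 (the default NameNode web port in Hadoop 2.x) succeeds. The function names `is_port_open` and `exposed_hosts` are hypothetical, and a successful connection only shows that the port is reachable from where the script runs, not that data is actually exposed.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def exposed_hosts(hosts, port: int = 50070):
    """Return the subset of hosts that accept connections on the given port."""
    return [host for host in hosts if is_port_open(host, port)]
```

Run it from outside the firewall, and only against systems you are authorized to test, for example `exposed_hosts(["namenode.example.internal"])` with your own host names substituted for the placeholder.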
While properly configuring the Kerberos user authentication system could eliminate these openings, it appears few Hadoop users are taking those steps, according to the folks at security firm BlueTalon, which sells a shrink-wrapped product called BlueTalon SecureAccess for WebHDFS that takes Kerberos’ place in the stack.
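A reachable port is one thing; an unauthenticated one is another. The Python sketch below asks a NameNode's WebHDFS REST endpoint for a root directory listing without supplying credentials and classifies the answer. It rests on assumptions worth stating: that WebHDFS is enabled on the default Hadoop 2.x port, that a cluster without Kerberos will typically answer with HTTP 200, and that a Kerberized one will typically answer 401. The helper names are hypothetical, and the result is a hint for further investigation rather than proof either way.

```python
import urllib.error
import urllib.request


def webhdfs_list_url(host: str, port: int = 50070, path: str = "/") -> str:
    """Build the WebHDFS URL that lists a directory (op=LISTSTATUS)."""
    return f"http://{host}:{port}/webhdfs/v1{path}?op=LISTSTATUS"


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough finding."""
    if status == 200:
        return "open"  # listing returned with no credentials supplied
    if status == 401:
        return "auth-required"  # server demanded authentication
    return "inconclusive"


def probe_webhdfs(host: str, port: int = 50070, timeout: float = 5.0) -> str:
    """Request a root listing without credentials and classify the response."""
    url = webhdfs_list_url(host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

A result of "open" on an Internet-facing host is the situation the scans described above are hunting for. As with any probe, point it only at clusters you administer.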
“One characteristic of Hadoop is that you can choose between multiple components for storage, compute, and data access,” writes Pratik Verma, founder and chief product officer of BlueTalon. “This architecture provides unprecedented scale and performance but amplifies security risks.”
The Russians Aren’t Coming–They’re Already Here
The situation in Russia also portends continued escalation of the cybercriminal threat. A few eye-opening passages in the Intel report indicate that the West is ill-equipped to deal with the lawless bear, which condones attacks on American and European institutions and refuses to cooperate in international cybercriminal investigations.
“The Russian-speaking hacker community is highly fluid, is home to many of the world’s most sophisticated cybercriminals, and involves a higher degree of overlap between the legitimate ICT [information and communications technology] and cybersecurity industries and the criminal ecosystem,” the report states. In fact, some Russian hackers openly state their black-hat credentials on their Facebook pages, right next to their white-hat credentials, the report says.
“One expert we talked to described Russia’s approach to hacking as similar to their approach to Olympic athletes,” Intel’s report continues. “The government invests significant resources in developing an elite athlete who competes for the nation, then they go independent and make millions playing for themselves.”
Thanks to a Russian economy that offers little room for advancement, it would appear that many college grads are taking their math and computer science degrees and accepting master’s-level training in hacking from Russian intelligence services. This foretells even more sophisticated cybercriminal attacks on Western institutions in the future.