August 18, 2015

Security Survey: Huge Datasets Exposed Online

An estimated 1.1 petabytes of data related to a handful of databases, search engines and other caching technologies is exposed online, a data security startup reports. A key security flaw centers on default settings that lack basic configurations for authentication, encryption, authorization or other security controls.

BinaryEdge, a security services startup based in Zurich, Switzerland, said its Internet data exposure survey looked at four representative technologies the company regularly uses: MongoDB; the key-value cache and store technology Redis; Memcached, the distributed memory cache system; and ElasticSearch, a “full-text” search engine.

Given its ability to scale, the MongoDB database was found to have the highest data exposure: nearly 620 terabytes. “We found 39,134 MongoDB Servers instances that answered our requests and that didn’t have any type of authentication,” BinaryEdge reported in a blog post this week. It also found 7,267 instances that did have some kind of authentication enabled.

“It is worrisome the amount of MongoDB Servers that are exposed,” the security analyst noted.

Responding to the survey results, Kelly Stirman, MongoDB’s vice president of strategy, noted in an e-mail that “the potential issue is a result of how a user might configure their deployment of MongoDB without security enabled. There is no security issue with MongoDB – extensive security capabilities are included with MongoDB.” The database vendor’s security best practices can be found here.

Meanwhile, the survey found more than 35,000 Redis instances that answered requests and lacked any kind of authentication. The Redis default configuration “doesn’t set any type of authentication and listens on all network interfaces as stated on the configuration file,” BinaryEdge reported.

It estimated the amount of exposed data on Redis based on the current quantity of data available to access and “peak memory,” or the largest amount of data exposed at a given time. BinaryEdge found more than 17 terabytes of exposed data at peak memory.

The company noted that Redis is designed to be accessed by trusted clients inside trusted environments. Hence, “it is not a good idea to expose the Redis instance directly to the Internet.”
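The survey's criterion for Redis, an instance that answers requests without authentication, can be illustrated with a short Python sketch. This is not BinaryEdge's tooling; the function names `classify_reply` and `probe` are hypothetical, and the sketch assumes a server that replies `+PONG` to an unauthenticated `PING` when no password is set and with an error beginning `-NOAUTH` when one is required. It should only be pointed at hosts you operate.

```python
import socket


def classify_reply(reply: bytes) -> str:
    """Classify a raw Redis reply to an unauthenticated PING (illustrative)."""
    if reply.startswith(b"+PONG"):
        # The server answered without asking for credentials.
        return "open"
    if reply.startswith(b"-NOAUTH"):
        # The server demands AUTH before running commands.
        return "auth-required"
    # Anything else (other errors, non-Redis services) is left undecided.
    return "unknown"


def probe(host: str, port: int = 6379, timeout: float = 3.0) -> str:
    """Send an inline PING to host:port and classify the answer."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")
            return classify_reply(sock.recv(64))
    except OSError:
        # Connection refused, timed out, or otherwise not reachable.
        return "unreachable"
```

Calling `probe("127.0.0.1")` against a local test instance would report whether that instance accepts commands without a password.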

Memcached is frequently used to accelerate dynamic, database-driven websites by caching data and objects in RAM. The technique reduces the number of times a database, API or other external data source must be read. The security survey found more than 118,000 Memcached instances online. Exposed data totaled more than 11.3 terabytes, BinaryEdge reported.

ElasticSearch had the second-highest amount of exposed data, more than 531 terabytes, according to the security survey. Since the search engine is CPU intensive, the security analyst reported that the vast majority of exposed data was crunched using Intel processors.
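As a rough check on how the per-technology figures relate to the overall estimate of about 1.1 petabytes, the reported numbers can be summed. The sketch below uses only the rounded figures quoted in this article ("nearly" 620 terabytes for MongoDB, "more than" 531, 17 and 11.3 terabytes for the others) and assumes 1,000 terabytes per petabyte, so the result only approximates BinaryEdge's own total.

```python
# Rounded exposure figures, in terabytes, as quoted in this article.
exposed_tb = {
    "MongoDB": 620.0,        # "nearly 620 terabytes"
    "ElasticSearch": 531.0,  # "more than 531 terabytes"
    "Redis": 17.0,           # "more than 17 terabytes" at peak memory
    "Memcached": 11.3,       # "more than 11.3 terabytes"
}

total_tb = sum(exposed_tb.values())
total_pb = total_tb / 1000  # assumption: decimal petabytes

# Each technology's share of the summed exposure, as a percentage.
share_pct = {name: 100 * tb / total_tb for name, tb in exposed_tb.items()}

print(f"Total: {total_tb:.1f} TB (about {total_pb:.2f} PB)")
for name, pct in sorted(share_pct.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.1f}%")
```

On these rounded inputs, MongoDB and ElasticSearch together account for nearly all of the exposed data, with the two cache technologies contributing only a small remainder.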

Based on the huge amounts of data exposed to the Internet, BinaryEdge warned that the installed versions of the technologies it probed “are quite often old and not updated, which means that, in some cases, not only is data exposed but even servers can be compromised.”

It noted that installations were poorly configured in small and Fortune 500 companies alike. “Some of these technologies are used as cache servers, so its data is always changing and a multitude of client [or] company data can be looked at,” BinaryEdge reported.

Disturbingly, the analyst said at least some of its exposed data totals might be even higher since some IP addresses are “blacklisted” if a company asks not to be scanned in a security survey.
