Datanami
June 7, 2013

Big Data Big Five

In this week’s Big Data Big Five, Facebook reveals its homegrown answer to SQL on Hadoop, which is headed for open source later this year; IBM and 10gen team up for data-intensive mobile apps; risk analysis company BitSight grabs $24 million; and more…

Presto: Facebook Reveals its Answer to SQL on Hadoop

At a developer conference this week, Facebook revealed an ambitious new project aimed at querying data at the exabyte scale. Dubbed “Presto,” the new technology is said to be built from the ground up for Facebook-scale workloads, handling 250PB of data in the Facebook data warehouse (which spans thousands of machines across the globe).

Presto gets rid of some of the failings of Hive, according to a report in The Register. It has reportedly demonstrated a four-to-seven-times improvement in CPU efficiency over Hive, the notable data warehouse querying tool, and returns query results eight to 10 times faster.

According to Facebook engineer Martin Traverso, the SQL-like tool is capable of starting all the query stages at once and can stream all the data through the stages, something that MapReduce is not equipped to do.
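To make that distinction concrete, here is a minimal Python sketch. It is not Presto code, and the stage names (`scan`, `filter_stage`, `project`) and the sample table are hypothetical. It only contrasts a staged plan, where each step finishes before the next begins, with a pipelined plan, where every stage is wired up at once and rows stream through.

```python
from typing import Iterable, Iterator


def scan(rows: Iterable[dict]) -> Iterator[dict]:
    """Stage 1: emit rows one at a time (stand-in for reading a table)."""
    for row in rows:
        yield row


def filter_stage(rows: Iterable[dict], min_likes: int) -> Iterator[dict]:
    """Stage 2: keep only rows that meet the threshold."""
    for row in rows:
        if row["likes"] >= min_likes:
            yield row


def project(rows: Iterable[dict]) -> Iterator[str]:
    """Stage 3: keep a single column."""
    for row in rows:
        yield row["user"]


def batch_query(table: Iterable[dict], min_likes: int) -> list:
    """Staged style: each stage materializes its full output first."""
    scanned = list(scan(table))
    filtered = list(filter_stage(scanned, min_likes))
    return list(project(filtered))


def streaming_query(table: Iterable[dict], min_likes: int) -> Iterator[str]:
    """Pipelined style: all stages start together and rows flow through."""
    return project(filter_stage(scan(table), min_likes))


# Hypothetical sample data, for illustration only.
TABLE = [
    {"user": "alice", "likes": 12},
    {"user": "bob", "likes": 3},
    {"user": "carol", "likes": 40},
]

if __name__ == "__main__":
    stream = streaming_query(TABLE, 10)
    print(next(stream))            # first result arrives before the scan ends
    print(batch_query(TABLE, 10))  # results arrive only after every stage ends
```

In the pipelined version the first matching row is available as soon as it clears the last stage, which is why it can even run over an unbounded input; the staged version cannot return anything until every stage has finished.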

According to the report, Presto is now used by 850 internal users each day at Facebook and is performing 27,000 queries against a data universe of 320TB.

IBM taps 10gen for Mongo Mobile Collaboration

Database company 10gen announced this week that it has entered into an agreement with IBM to collaborate on a new standard that would enable businesses to create sophisticated, data-intensive apps using commonly available skills and tools.

The partnership, which is said to further extend IBM’s “MobileFirst” strategy aimed at enabling mobile apps for the enterprise, will reportedly aim to combine data managed by IBM DB2 systems with mobile apps designed using MongoDB.

MongoDB is an open source NoSQL database that stores structured data as JSON-like documents, enabling high-performance dynamic queries on large datasets.
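As a rough illustration of the document model, the Python sketch below holds two JSON-like records with different shapes and runs an ad hoc query over them. It is an in-memory stand-in, not MongoDB's API: the `find` and `get_path` helpers and the sample customer records are hypothetical, and only simple equality matching on dotted field paths is implemented.

```python
import json

# Hypothetical documents: records in one collection need not share a schema.
customers = [
    {"_id": 1, "name": "Ada", "devices": ["phone", "tablet"],
     "address": {"city": "Boston"}},
    {"_id": 2, "name": "Raj", "devices": ["phone"],
     "address": {"city": "Austin"}, "loyalty_points": 120},
]


def get_path(doc: dict, dotted: str):
    """Follow a dotted path such as 'address.city'; return None if absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def find(collection: list, query: dict) -> list:
    """Return documents whose fields equal every value in the query."""
    return [
        doc for doc in collection
        if all(get_path(doc, field) == value for field, value in query.items())
    ]


if __name__ == "__main__":
    # The query is built at run time, with no table definition up front.
    for doc in find(customers, {"address.city": "Boston"}):
        print(json.dumps(doc))
```

The query is just another dictionary, so an app can assemble it on the fly and match on nested fields or on fields that only some documents carry.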

According to company officials at IBM, the collaboration complements the company’s recently announced intent to acquire cloud computing infrastructure provider SoftLayer Technologies.

NICTA Launches $12M Natural Sciences Project

Australian information and communications technology group National ICT Australia (NICTA) announced this week that it has secured funding to launch a $12 million big data and machine learning project aimed at enabling discovery in the natural sciences.

The project aims to use data on global biodiversity and ecological processes to determine which interactions are most important in producing the existing environment. Researchers expect that through this effort they will be able to “open a window on some of the mysteries of biodiversity and showing how ecosystems will be affected by climate change and other factors.”

The project will combine publicly available geological data to build a historical model of what Australia looked like 1.5 billion years ago. According to NICTA, the project will also aim to discover new data analysis processes that can reduce the amount of raw data needed to conduct successful experiments and potentially increase the rate of discovery.

The project is being funded to the tune of $4 million from the Science and Industry Endowment Fund and $8 million from research collaborators over its three-year life span.

Dell Launches New Storage Offering

Dell has introduced a new flash-optimized storage offering that it claims drives down the cost of flash solutions by up to 75%. Aimed at big data deployments, the new addition to Dell’s storage platform will, the company says, help customers efficiently manage rapidly growing data while also addressing the needs of I/O-intensive workloads.

The offering includes version 6.4 of Dell’s Compellent Storage Center array software, which the company says is optimized for data-intensive workloads and will tie together its Compellent Flash Optimized Solution, its high-density Compellent SC280 storage rack, and its file system, FluidFS.

Dell says that its solution will provide automated tiering for flash while reducing the footprint by packing more storage into a smaller space.

BitSight Grabs $24 Million in Series A Round

Start-up risk analytics company BitSight Technologies has announced that it has secured a Series A round of funding to the tune of $24 million.

The company says it will provide a platform that collects and analyzes organizational data to deliver data-backed analysis for managing information security risk. The technology is said to combine big data analytics with evidence of outcomes, giving risk managers the ability to know where the threats are and how to respond to mitigate an attack.

The product is currently vaporware; the company says that the new round of financing will be used to expand its product development and launch the technology.

The funding round was backed by Globespan Capital Partners, Menlo Ventures, Flybridge Capital Partners, and Commonwealth Capital Ventures.

BitSight’s original seed funding came from the National Science Foundation.
