July 17, 2013

Cloudera Snaps Up Mahout Distro, Myrrix

Isaac Lopez

Cloudera signaled its intention to expand its machine learning footprint this week by acquiring London start-up Myrrix and its team. In a blog post on the Cloudera website, Myrrix founder Sean Owen said that he would be joining Cloudera as Director of Data Science, with Myrrix technology being brought over to benefit Cloudera’s CDH.

Sean Owen, Myrrix Founder

Myrrix is described as a real-time, scalable clustering and recommender system that evolved from Apache Mahout. As a contributor to Mahout, Owen is credited with developing the clustering classification functionality, a popular feature within the Mahout library. While some of the Myrrix technology is sure to find its way into Cloudera’s CDH offering, Cloudera says that the real acquisition is Owen himself, classifying the move as an “acqui-hire.”

The move tracks the trend of companies wanting to bring more innovation to the user experience as the focus in the big data space moves up the stack towards applications. As noted recently, algorithm wars are heating up as companies in the big data space pop up with their own secret sauce for enriching applications. This week, predictive analytics startup Ayasdi (a Cloudera partner) announced that it has raised $30.6 million in venture funding.

Owen says he’s witnessed renewed interest in machine learning (what he calls “Big Learning”) take off as technologies like Hadoop have made large scale data crunching more accessible. “Hadoop and cheap hardware have made big data analysis so much more feasible,” explained Owen. “With cheap disks and CPUs, and mature open-source databases and computation frameworks, startups and even individuals can afford to run terribly complex computations over terabytes.”

Owen says that as a member of Cloudera, he will continue to work on adapting machine learning to fit the parallel Hadoop world. “There is still so much to be done from these beginnings before learning on Hadoop is as accessible as it can be,” said Owen, adding that Cloudera has shown its ability to take complicated source code and turn it into a slick package that is highly accessible to a broader audience. “The same will happen for applications like Big Learning – that’s always been the Myrrix vision too, and now we’re working together within Cloudera to start building this out for…the bigger audience.”

Owen says that work is beginning on how to incorporate the Myrrix technology into Cloudera’s CDH “in just the right way.”


Datanami