September 12, 2013

Apache Mahout Streamlined and Notched Up with 0.8 Release

Isaac Lopez

The open source machine learning project Apache Mahout took another step toward a 1.0 release this summer, as the open source pattern cruncher cut fat and muscled up with its latest 0.8 release. The new release shed some of its lesser-used weight while adding lean mass in the form of a new and improved recommendation algorithm and a speedy new clustering algorithm.

The recent release added some new functions as well as a slew of bug fixes, explained contributor Ted Dunning in a recent article. “One of the big changes has to do with clustering,” he wrote. “Clustering is a form of machine learning that helps find patterns in big data. With the release of 0.8, Mahout has a new super-fast k-means clustering algorithm that I find is attracting a lot of attention.”

Also in the 0.8 release, Dunning says, Mahout now offers better performance through extensive improvements to the vectors and matrices in the Math Library and to the recommender implementation.

Further expanding Mahout’s usefulness, the release also included a variety of integration code that makes Mahout work more smoothly with popular packages such as Apache Hadoop, Apache Lucene, Apache HBase, Apache Cassandra and more.

The group behind Mahout says that there are still parts to clean up as they move towards a 1.0 release, pointing at some of the under-supported or underperforming aspects of the machine learning framework. The group says the removal of these will help them “to better focus the energy and contributions on key algorithms that are proven to scale in production and have seen wide-spread adoption.”

According to the group, the 1.0 release isn’t actually that far off. “Our plans as a community are to focus 0.9 on cleanup of bugs and the removal of the code mentioned above and then to follow with a 1.0 release soon thereafter,” wrote contributor Grant Ingersoll in the release notes.

Here’s hoping there’s enough pizza and beer to help these guys power through and make it happen.
