GraphLabs Wises Up Machine Learning Platform
GraphLabs Inc., a machine learning startup, this week rolled out a new software platform that it says will bring expanded machine-learning capabilities to enterprises struggling to find data scientists.
The GraphLab Create 1.0 platform is said to automate key aspects of advanced big data analytics so a diverse range of companies can use machine learning to squeeze more business intelligence out of operational and other data.
Seattle-based GraphLabs also claims its Create platform is the first to apply advanced machine learning on a single platform to most types of data sets, including graph, tabular and text data.
The company said its platform allows developers to prototype on a single machine, then move to production with the same code. That, along with the GraphLab engine, is said to cut weeks or months from the time needed to reach production because less of the prototype has to be re-coded.
While machine learning is widely seen as a promising solution for big data analytics, developers have found it difficult to use research prototypes of these smart algorithms because they have proven difficult to scale to growing volumes of data. The prototypes have also proven “difficult to use in practice,” Johnnie Konstantas, vice president of marketing at GraphLabs, noted in a blog post.
Konstantas said GraphLabs introduced the Create platform as a way to enable large-scale machine learning. Distributed machine learning enables faster analysis of large graphs as well as tasks such as statistical inference, clustering and finding influential nodes, Konstantas added.
The company also said GraphLab is being integrated with Hadoop to make it easier for GraphLab customers to use Hadoop Distributed File System data sources. GraphLab is now being made available as part of the Hadoop distribution platform, Pivotal HD.
GraphLab is also certified for Cloudera’s distribution of Apache Hadoop, the company said.
GraphLab CEO Carlos Guestrin noted in a statement the growing need to provide data scientists with access to an integrated environment for transforming raw data into business insights. Guestrin said the Create platform seeks to commercialize machine learning as a way to boost the productivity of data scientists.
GraphLabs was launched by Carnegie Mellon University in 2009 as an open project under Guestrin. Its software was initially intended for applying machine learning to graph analysis. Since then, its functionality has been expanded to include tables and text, and the software is now used to deliver predictive analytics for service providers and Fortune 500 companies.
The company said GraphLab Create 1.0 would be available on July 21, 2014. The release coincides with a company-sponsored conference in San Francisco for data scientists, software engineers and data analytics researchers. Among the conference sponsors are Adobe, Google, Oracle and Rackspace.