February 23, 2012

Startup Claims One Giant Leap for Machinekind

Robert Gelber

A San Jose analytics startup claims that it can achieve speeds up to ten thousand times faster than existing approaches in the big data marketplace.

Skytree is an analytics vendor that advertises the ability to perform machine learning along with other advanced analytics methods on big data. Their Skytree Server is touted as the first extreme speed and scale analytics processing engine for big data and the only enterprise-grade, high performance machine learning/advanced analytics solution that supports big data requirements.

Machine learning is the science of pattern discovery and predictive analysis from massive datasets. Skytree draws a distinction between technology that is told what to do and technology that learns what it should do. Think of the difference between a supercomputer and IBM’s Watson.

The supercomputer will process data according to your rules, while Watson will continue to analyze and learn from outcomes until it finally arrives at the best answer. Martin Hack, co-founder and CEO of Skytree, says, “Today’s SQL-based approach of analyzing data using handcrafted rules is essentially a human-based approach. Automatically discovering the rich and subtle patterns in data, letting the data speak for itself, is going to be a game changer going forward.”

The software also includes the following machine learning algorithms:

  • Recommender systems – provide profile-based targeted recommendations (e.g., products)
  • Anomaly/outlier identification – finding unusual or ‘special case’ data records in big data sets
  • Predictive analytics – making predictions based on similar historic data
  • Clustering and market segmentation – finding natural groups within data
  • Similarity search – find the closest existing data matching a record of interest
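
To make the last item concrete, here is a minimal sketch of similarity search as a brute-force nearest-neighbor scan. It is not Skytree's method, which the article does not describe; the function name `nearest_record`, the Euclidean distance metric, and the sample points are all assumptions chosen for illustration.

```python
import math

def nearest_record(query, records):
    """Return (index, distance) of the record closest to `query`.

    Hypothetical helper: a plain linear scan using Euclidean distance,
    not the indexing approach of any particular product.
    """
    best_index, best_distance = -1, math.inf
    for i, record in enumerate(records):
        d = math.dist(query, record)  # Euclidean distance (Python 3.8+)
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance

# Made-up two-dimensional records, for illustration only.
records = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
index, distance = nearest_record((0.9, 1.2), records)
print(index, round(distance, 3))
```

A scan like this touches every record for every query, which is why running it for all records at once becomes expensive on large datasets.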

The company says it is working on another set of machine learning algorithms, which will be “coming up soon.”

Skytree claims that new algorithms coupled with its system design enable its offering to run up to tens of thousands of times faster than traditional solutions. Test results in the company’s whitepaper show head-to-head comparisons against R and WEKA, run on Amazon’s EC2 cloud using public data from the Sloan Digital Sky Survey (SDSS). Skytree was faster in the tests displayed, including an all-neighbors query in which it completed indexing in 4.2 seconds, compared to 2,272 seconds for WEKA and more than 72,000 seconds for R.
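
As a quick check on how those timings relate to the headline speedup claim, the ratios can be worked out directly. The sketch below uses only the three figures quoted above; the `speedup` helper is a hypothetical name, and because the R time is reported as "more than" 72,000 seconds, its ratio is a lower bound.

```python
# Reported times in seconds for the all-neighbors query.
# The R figure is a lower bound ("more than 72,000 seconds").
timings = {"Skytree": 4.2, "WEKA": 2272.0, "R": 72000.0}

def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds

for tool in ("WEKA", "R"):
    ratio = speedup(timings[tool], timings["Skytree"])
    print(f"Skytree vs. {tool}: about {ratio:,.0f}x faster")
# Skytree vs. WEKA: about 541x faster
# Skytree vs. R: about 17,143x faster
```

On these numbers alone, the gap is roughly 540 times against WEKA and at least 17,000 times against R for this one query.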

The startup doesn’t seem to run light on brainpower. CEO Martin Hack is the former head of the Trusted Solaris OS and CTO Alexander Gray is a professor at Georgia Tech, where he heads FASTlab, an academic lab for scalable machine learning. Along with the co-founders, the company has a technical advisory board whose members all belong to the National Academy of Engineering. Skytree will be presenting at the Strata Conference on March 1st.

The introduction of machine learning into research and enterprise analytics could prove to be a big step for the industry.
