Get Your Big Data Analytic Competition On
Are you interested in proving your big data analytic worth to a doubtful world? There may be no better way than participating in an officially sanctioned analytic competition, which will let you see how your skills stack up in pursuit of big data glory and cash prizes.
AlgoMost is putting $15,000 on the table in a competition to see who can most accurately predict which publicly traded companies will get acquired through the end of the year. The Russian data mining company is providing 24 parameters of financial data for about 3,500 publicly traded companies, and it’s up to the participants to come up with the predictive algorithms that can most effectively tease the M&A signal out of all the noise.
The competition runs through the end of August, at which point AlgoMost will dole out $5,000 to the data scientists with the most subjectively elegant algorithms, as determined by a body of experts. The rest of the prize money will be handed out in early 2015, when all the objective data regarding M&A activity for 2014 is finally in.
The pursuit of riches routinely attracts the best and brightest analytic minds to Wall Street, which is arguably the biggest patron and consumer of big data analytics. An algorithm that could correctly predict which companies will get bought or sold based on publicly available data would be very valuable to its owner.
Previous attempts to use predictive analytics to foretell M&A activity failed, the company says, largely because they were limited to the financial data from just a handful of companies. The hope is that widening the number of companies and their associated data will provide a richer test bed for a winning algorithm to emerge. For more information on the competition, which is being funded by an unnamed client of AlgoMost, see http://algomost.com/en/tasks/predict-acquisition.
Next month marks the start of Texata’s Big Data World Championship, which is sponsored by this publication and backed by the likes of Amazon, Palantir Technologies, and Thomson Reuters. Hundreds are expected to take part in the event, which will pit data scientists against each other in the areas of statistics, machine learning, programming, and visualization. Round 1 of the Big Data World Championship kicks off August 30 (or August 31, depending on your time zone). For more info or to register, check out http://www.texata.com.
Data scientists can also match their wits against others at TopCoder, which runs competitions in a variety of fields, including data science. Nearly 300 teams are currently competing in two data science competitions. Later this year, the TopCoder finals will take place in San Francisco, where $300,000 will be doled out to the winners.
No story on data analytic competitions is complete without a mention of Kaggle, the Tough Mudder of analytic competitions. More than $800,000 in prize money is currently at stake across 20-plus competitions, involving everything from using machine learning to identify the Higgs boson and finding better ways to detect seizures to applying sentiment analysis to movie reviews and, of course, “Random Acts of Pizza.”
Kaggle has run successful analytic competitions for numerous Fortune 100 companies, and sports a community with 180,000 members, including some of the world’s top data scientists. In addition to giving established and emerging data scientists a chance to win tens of thousands of dollars, Kaggle competitions help client companies refine their predictive models and algorithms to provide real-world analytic solutions for tough problems, such as improving the accuracy of estimates of customer claims costs (Allstate), helping detect driver drowsiness (Ford), delivering more accurate airline departure and arrival times (GE), delivering better predictability of drug targets (Merck), and improving lifetime customer value (Tesco).
Yesterday the credit scoring company FICO announced it will be teaming up with Kaggle to run a series of data science competitions. The contests are slated to begin later this summer, and will involve the FICO Analytic Cloud, which the company launched last week.
FICO is looking forward to this “crowdsourced innovation” at Kaggle, which Doug Clare, vice president for cloud analytics for FICO, says is an outgrowth of “the democratization of analytics.” “Together with Kaggle,” Clare says, “we’re eager to see what happens when some of the world’s greatest minds get their hands on some of the world’s most powerful analytic tools and applications in the FICO Analytic Cloud.”
While FICO has its feet firmly grounded in the world of determining individuals’ creditworthiness, the company is launching a new line of cloud-based predictive solutions that use some of the same tools it uses for credit scoring. The FICO Analytic Cloud is built on Apache Hadoop and includes a variety of add-ons for manipulating data, including the language R, PMML (an XML-based language for predictive modeling), the Apache Lucene search engine, and Apache Tika, a content analysis toolkit.