Crowdsourcing Used to Augment Machine Learning
As unstructured data continues to pile up, different approaches and platforms are emerging to help businesses make greater analytical use of what otherwise might amount to clutter.
Among the latest approaches is an “intelligent crowdsourcing platform” from Spare5, a Seattle-based startup specializing in, among other things, data “cleanup.” Spare5’s proprietary platform, unveiled Wednesday (April 13), uses a crowdsourcing approach that draws on the experience of domain specialists to perform “custom micro-tasks.”
Once “filtered for quality,” the task results can be used for applications ranging from training artificial intelligence models and improving searches to augmenting directories.
The crowdsourcing platform combines human insights with machine learning techniques to untangle unstructured data and promote its wider use in analytics. The need for new tools to massage unstructured data is growing, with one industry estimate forecasting that images, video, social media content and text messages would account for as much as 93 percent of all data by 2022. Unstructured data is currently growing at a 62-percent annual clip, according to market watcher IDG.
Spare5 claims its platform “leverages a secure network of qualified individuals, and the ability to engage the right human in the right loop to deliver the best insights into unstructured data.”
The company said it drew on factors such as individual expertise, specific interests and even the free time available to participate to build a library of “game-like” tasks. Matching on individual skills, interests and demographics, the company said, ensures that each task goes to the right individual.
To improve accuracy, the company’s “Reputation Engine” applies machine learning techniques to rate each individual’s performance by domain. Proprietary machine learning algorithms then filter task results for accuracy. “As customers use the platform over time, the process becomes faster, smarter and better,” the company claims.
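Spare5's algorithms are proprietary and not described in detail, so the following is only a minimal sketch of one common way such a scheme could work: a reputation-weighted vote over crowd answers, with a confidence cutoff acting as the quality filter. The worker names, scores, the `min_share` threshold and the update rule are all hypothetical and are not drawn from Spare5's platform.

```python
from collections import defaultdict

# Hypothetical per-domain reputation scores in [0, 1], keyed by (worker, domain).
reputation = {
    ("ana", "fashion"): 0.9,
    ("ben", "fashion"): 0.6,
    ("cai", "fashion"): 0.4,
}


def aggregate(answers, domain, reputation, default=0.5, min_share=0.6):
    """Combine crowd answers with a reputation-weighted vote.

    answers: list of (worker, label) pairs for a single task.
    Returns (label, share). label is None when the winning label's share
    of the total weight is below min_share, i.e. the result is filtered out.
    """
    weights = defaultdict(float)
    for worker, label in answers:
        # Workers with no track record in this domain get a neutral default.
        weights[label] += reputation.get((worker, domain), default)

    total = sum(weights.values())
    if total == 0:
        return None, 0.0

    label, weight = max(weights.items(), key=lambda item: item[1])
    share = weight / total
    return (label if share >= min_share else None), share


def update_reputation(reputation, worker, domain, correct, rate=0.1):
    """Nudge a worker's domain score toward 1 (correct) or 0 (incorrect)."""
    old = reputation.get((worker, domain), 0.5)
    target = 1.0 if correct else 0.0
    reputation[(worker, domain)] = old + rate * (target - old)
```

Under these assumptions, calling `aggregate([("ana", "dress"), ("ben", "dress"), ("cai", "skirt")], "fashion", reputation)` accepts "dress", while a split in which only the highest-rated worker says "dress" falls below the cutoff and is rejected. Updating scores after each verified task is one way a process like this could improve with use.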
The resulting combination of human insights and machine learning can then be used to organize unstructured data into “clean,” labeled data. Spare5 asserted that limitations in current data quality tools leave much unstructured data unused.
The crowdsourcing platform also includes APIs and software development kits designed to help integrate the resulting data into existing workflows or export it to business intelligence reports.
The platform is available now on a subscription basis. Spare5 said Expedia, Getty Images, GoPro, Sentient Technologies and other customers are already using it. Sentient said it has incorporated the Spare5 platform into its AI-powered “shopping assistant.” The company uses the tool to validate its AI-generated models by comparing how shoppers perceive differences in retail products.
Getty Images, which maintains an archive of over 80 million images, uses the Spare5 platform to apply human insights to its massive photo collection while scaling its current image search capabilities.
Spare5 is betting its application of “the right human in the right loop” will help boost the utility of machine learning in making sense of unstructured data.