GPU-Powered Tagging Service ‘Gets’ the Big Picture
Picture sharing has blossomed on the Internet, thanks to the ubiquitous camera-phone, 4G cell service, and social media sites like Pinterest and Instagram. However, the ability of organizations to interpret and understand those publicly shared image files en masse has not kept up with the volume. Now, AlchemyAPI is giving new tools to organizations that want to incorporate image recognition into their big data analytic workflows.
Millennials are at the forefront in using pictures not only to share information, but to express how they feel and share all kinds of details of their lives (“selfie,” anyone?). A good percentage of the billion or so pictures that are uploaded to the Internet every day are from millennials and other highly sought-after demographic groups, but the attempts of marketers to capitalize on this treasure trove of unstructured data have been limited.
Unveiled last week, AlchemyVision is a Web service that automatically interprets and correctly tags hundreds of millions of images a day. The service builds on other data processing services that AlchemyAPI makes available through a simple REST API, including its natural language processing (NLP)-based text analytics service used by companies like Simply Measured, the social media analytics firm that serves a good portion of the Fortune 100.
But instead of interpreting the meaning behind words, AlchemyVision finds the “meaning” of pictures. In other words, it does its best to figure out what a picture is about, and then marks it up, or tags it, with metadata that describes the image. If you feed the system a picture of a man holding a fish, or a woman next to a car, the system will correctly identify the key constituents of the picture, including the type of fish and the type of car. This metadata can be extremely useful for companies that are performing sentiment analysis on social media data.
The system doesn’t rely on any pre-existing metadata, and takes its clues entirely from the pixel data itself, says Elliot Turner, founder and CEO of AlchemyAPI.
“We’re able to do this because our system is continually crawling the Web and using a technology we refer to as deep neural networks,” Turner tells Datanami. “It’s actually teaching itself about different visual concepts and things as it finds them online. We didn’t tell our system to identify fish or people. It actually learned on its own through analysis of hundreds of millions of photos at scale that these are concepts that are visually important, and ultimately incorporated it into its own vernacular.”
In addition to tagging pictures, the new service also allows users to execute image-based searches against existing libraries or the Internet, similar to Google Image Search. This is useful for organizations that are looking for copyright violators or otherwise trying to clean up large libraries of pics. While AlchemyVision shares some similarities with Google Image Search, there are differences. For starters, it’s more accurate, Turner claims. It’s also available for private use via an API, he says.
The new service is built on the Colorado company’s own neural network, which runs on a GPU-based computer cluster. The Nvidia-powered cluster was programmed using the CUDA parallel computing model, machine learning algorithms, and AlchemyAPI’s own proprietary intellectual property.
“We initially started using CPUs for some of our neural network work, but to build a system that can scale up to understanding arbitrary photos of arbitrary real-world 3D objects, we had to run simulations at a scale that just was not practical to normal processing cores,” Turner says. “So we switched to using Nvidia cards with thousands of cores per card so we can run simulations that have tens of millions of neurons and many more synaptic connections between the neurons.”
The system is currently able to process millions of images per hour, or up to about 200 million images per day at the current scale, Turner says. As demand picks up over the next few months, more GPUs will be added to the cluster. “As we see an uptick over the course of the next few months, we’ll be rapidly expanding the core capacity of our server grid to keep up with the expected growth that we think we’ll likely see from our customers who are involved in these high volume content analysis applications,” Turner says.
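To put the quoted daily figure in perspective, the short sketch below converts it to hourly and per-second rates. It assumes the load is spread evenly across a 24-hour day, which the article does not state, and the function names are purely illustrative.

```python
# Back-of-the-envelope conversion of the quoted throughput figure.
# Assumption: images arrive evenly over a 24-hour day.

IMAGES_PER_DAY = 200_000_000  # "about 200 million images per day"
HOURS_PER_DAY = 24
SECONDS_PER_DAY = 86_400


def per_hour(images_per_day: float) -> float:
    """Average images processed per hour."""
    return images_per_day / HOURS_PER_DAY


def per_second(images_per_day: float) -> float:
    """Average images processed per second."""
    return images_per_day / SECONDS_PER_DAY


print(f"{per_hour(IMAGES_PER_DAY):,.0f} images/hour")
print(f"{per_second(IMAGES_PER_DAY):,.0f} images/second")
```

Under that assumption, 200 million images a day works out to roughly 8.3 million images an hour, which is consistent with the "millions of images per hour" figure Turner cites.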
The new offering is already being used by Simply Measured. Aviel Ginzburg, CPO and co-founder of the company, says the image-tagging service is giving it the campaign-tracking tools its customers demand. “With AlchemyVision, we have been able to accurately tag and classify a good portion of images at very high rates with minimal human effort,” Ginzburg says.
AlchemyAPI is effectively renting out this image-tagging machine, and it’s not particularly expensive. Users can get up to 1,000 transactions per day (either AlchemyVision or AlchemyText) for free, while the Small Business package includes 5,000 transactions per day for $250 per month. Customers who are running 200 million transactions through the system will be asked to pay considerably more, of course.
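As a rough illustration of what the Small Business tier costs per call, the sketch below divides the monthly price by the monthly quota. It assumes a 30-day month and that the full daily quota is used every day; neither assumption comes from AlchemyAPI, and the function name is hypothetical.

```python
# Rough per-transaction cost for the Small Business package.
# Assumptions: a 30-day month, and the full daily quota is consumed.

DAYS_PER_MONTH = 30  # assumed, not stated in the pricing


def cost_per_transaction(monthly_price_usd: float,
                         transactions_per_day: int,
                         days: int = DAYS_PER_MONTH) -> float:
    """Monthly price divided by the number of transactions in a month."""
    return monthly_price_usd / (transactions_per_day * days)


small_business = cost_per_transaction(250, 5_000)
print(f"${small_business:.4f} per transaction")
```

On those assumptions the effective price is a fraction of a cent per tagged image or analyzed document. Customers who use less than the full quota pay more per call.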
Turner, who previously designed intrusion detection systems for network security device manufacturers, revels in the possibilities surrounding unstructured data and the hidden potential it holds.
“I helped invent network intrusion detection that involved tearing apart packets looking for statistical anomalies and other signs of malfeasance,” he says. “After that company was acquired 13 years ago, I wanted to solve the unstructured data problem that I thought was so much more prevalent than the security problem.”
The 20-person team working at AlchemyAPI is a diverse group, ranging from machine learning experts to a former CERN physicist. You may not have heard about AlchemyAPI. But the small company has some other big developments up its sleeve that you may want to keep an eye on.