Machine Vision Used to Wrangle Image Explosion
We live in a world besotted with images: selfies, Instagram photos, video clips running on social media platforms that make it easy to instantly post images captured by smartphones and other devices.
The explosion of images is making keyword searches less effective, prompting companies that sell photos and music to come up with new approaches to sorting through the haystack for the relevant image. With that in mind, image and music licensor Shutterstock Inc. (NYSE: SSTK) has launched new search features based on machine learning techniques and, specifically, its proprietary “convolutional” neural network technology.
Convolutional neural networks are a machine-learning construct usually composed of one or more convolutional layers, often followed by one or more fully connected layers as in a standard multilayer neural network. Convolutional nets are made up of neurons with “learnable” weights and biases.
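To make that layered structure concrete, here is a minimal sketch in plain Python of one convolutional layer feeding one fully connected layer. It is an illustration only, not Shutterstock's proprietary network: the function names, the tiny 4x4 image and the hand-picked weights are all assumptions, and in a real network the weights and biases would be learned from data rather than set by hand.

```python
def conv2d(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (stride 1, no padding), then apply ReLU.

    As in most deep-learning libraries, this is technically cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(max(0.0, total))  # ReLU keeps only positive responses
        out.append(row)
    return out


def flatten(feature_map):
    """Turn a 2-D feature map into a flat list for the dense layer."""
    return [value for row in feature_map for value in row]


def dense(features, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [
        sum(f * w for f, w in zip(features, neuron_weights)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


# Hypothetical 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)  # 3x3 map, strongest where the edge sits
scores = dense(flatten(feature_map), [[1.0] * 9], [0.5])  # one output neuron
```

The middle column of `feature_map` lights up because that is where the image changes from dark to bright. The dense layer then combines those responses into a single score.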
Shutterstock said the first application of its neural net technology is a “reverse image search” offered as an alternative to standard keyword searches. Once a photo is uploaded from the company’s collection or from another source, an algorithm scans it and returns images similar in “look and feel” to the original.
The image and video search technique leverages computer vision technology that breaks images down into their constituent elements. Shutterstock said its approach relies on pixel data within images rather than metadata collected via keywords and tagging. That approach, it asserts, provides more granular searches that yield relevant content.
Keyword data, which Shutterstock uses to index images into categories on its site, isn’t “nearly as effective for surfacing the best and most relevant content,” Kevin Lester, Shutterstock’s vice president of engineering, noted in a blog post. The image licensor’s computer vision team applied machine-learning techniques to rebuild that process.
Relying instead on pixel data, they broke down millions of images and video clips “into their principal features, and [the machine-learning approach] recognizes what’s inside each and every image, including shapes, colors and the smallest of details,” Lester added. “This visual and conceptual data is represented numerically.”
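Once each image is reduced to a list of numbers, a “visually similar” search can be framed as finding the stored vectors closest to the query's vector. The sketch below assumes cosine similarity as the closeness measure and uses made-up three-number feature vectors and image names. Shutterstock has not disclosed how it compares images, so this is one common approach rather than the company's method.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, catalog, top_k=2):
    """Rank catalog entries by similarity to `query`; return the top names."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical feature vectors; a real system would use far longer vectors
# produced by the neural network rather than hand-written values.
catalog = {
    "beach_sunset": [0.9, 0.1, 0.0],
    "beach_noon": [0.8, 0.3, 0.1],
    "city_night": [0.0, 0.2, 0.9],
}

query = [1.0, 0.2, 0.0]  # features of the uploaded photo (assumed)
results = most_similar(query, catalog)
```

Here `results` lists the two beach images ahead of the city scene, since their vectors point in nearly the same direction as the query's. At a scale of tens of millions of images, a linear scan like this would normally give way to an approximate nearest-neighbor index.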
The company’s computer vision team built and tested the convolutional neural network over the past year. The image licensor manages a portfolio of more than 70 million “creative assets,” and the new search engine is designed to collect all data on each image, every download and every image search.
The image licensor, based in New York City, has more than 100,000 contributors and adds hundreds of thousands of images to its collection each week. Its video vault includes about 4 million video clips.
Shutterstock launched the reverse image and “visually similar” search capabilities for images in March. It expects to launch a version for video searches soon.
Machine learning and vision are slowly finding new applications as the sheer volume of imagery continues to soar. For example, a Swedish mapping startup recently rolled out a proprietary method of stitching together crowd-sourced photos of street scenes. Once photos are uploaded, an app combines street-level scenes with other photos to create a 3-D map of a route.