October 3, 2016

Yahoo Shares Algorithm for Identifying ‘NSFW’ Images

Yahoo is releasing the deep learning algorithm that it uses to detect “not safe for work” (NSFW) images to the open source community, the Web giant announced last week.

Anywhere from 4% to 30% of the Internet is composed of pornographic content, according to a 2011 article in Forbes. To protect the Web-viewing public from accidentally stumbling across this NSFW content, Website and search engine operators have invested large sums to build proprietary image classification systems that use deep learning algorithms to automatically detect pornographic images and videos.

Thanks to the work of Yahoo, Website and mobile application owners will no longer have to build their own systems or buy expensive pre-made ones. That’s because Yahoo is making its Caffe-based deep neural network model available for anybody to download on GitHub.

“To the best of our knowledge, there is no open source model or algorithm for identifying NSFW images,” Yahoo engineers Jay Mahadeokar and Gerry Pesavento wrote in a blog post last week. “In the spirit of collaboration and with the hope of advancing this endeavor, we are releasing our deep learning model that will allow developers to experiment with a classifier for NSFW detection, and provide feedback to us on ways to improve the classifier.”

Yahoo’s NSFW algorithm scores images from 0 to 1.

According to Mahadeokar and Pesavento, the convolutional neural network (CNN) system is based on the Caffe deep learning library, as well as CaffeOnSpark, which they say is “a powerful open source framework for distributed learning that brings Caffe deep learning to Hadoop and Spark clusters for training models.”

When an image is loaded into Yahoo’s CNN, the system outputs a probability score (from 0 to 1) that can be used to detect and filter NSFW images. Developers can use this score either to filter images or to rank images in search results, the engineers say.
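
As a rough illustration of those two uses, the following Python sketch filters and ranks images by score. It does not call Yahoo’s model: the scores, file names, the 0.8 cutoff, and the helper names `filter_safe` and `rank_by_safety` are all hypothetical, and a real application would pick its own threshold.

```python
from typing import Dict, List


def filter_safe(scores: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Return the images whose NSFW score falls below the threshold.

    The 0.8 default is an arbitrary, illustrative cutoff.
    """
    return [name for name, score in scores.items() if score < threshold]


def rank_by_safety(scores: Dict[str, float]) -> List[str]:
    """Order images from lowest to highest NSFW score."""
    return sorted(scores, key=lambda name: scores[name])


# Hypothetical scores standing in for the classifier's output (0 to 1).
example_scores = {
    "beach.jpg": 0.35,
    "landscape.jpg": 0.02,
    "flagged.jpg": 0.97,
}

print(filter_safe(example_scores))     # images kept by the filter
print(rank_by_safety(example_scores))  # images ordered for search results
```

Filtering gives a hard yes-or-no decision per image, while ranking keeps every image but pushes higher-scoring ones toward the bottom of the results.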

By releasing the model to open source, Yahoo hopes it can be improved upon to help prevent unwanted images from being presented. You can access the algorithm at github.com/yahoo/open_nsfw.

Datanami