October 11, 2013

Kaggle Contest Aims to Separate Cats from Dogs

Isaac Lopez

“Deep Blue beat Kasparov at chess in 1997. Watson beat the brightest trivia minds at Jeopardy in 2011. Can you tell Fido from Mittens in 2013?” This is the message that data scientists are greeted with as part of the latest machine learning competition run by Kaggle. The challenge: building an algorithm that can distinguish cats from dogs.

In this latest machine learning standoff, the data science dungeon masters at Kaggle have gotten their hands on a data set from Microsoft Research that contains over three million photos of dogs and cats from the world’s largest site for locating homes for pets, Petfinder.com.

The images themselves have been gathered by a group of coders for the Asirra project, a human interactive proof that functions as a CAPTCHA (Completely Automated Public Turing Test to tell Computers and Humans Apart). As with other types of CAPTCHA programs, Asirra is used to give websites the ability to filter bots from actual users browsing among their pages.

This latest Kaggle competition is apparently aimed at defeating this particular CAPTCHA by giving an algorithm brains enough to tell the two furry friends apart. The contest, which is billed as merely “playground” fun, is not sponsored by a major technology interest, but is instead run as a competitive diversion for those involved. First place for the competition is not a job interview opportunity at Facebook, but a $76 donation to the ASPCA (or animal charity of the winner’s choosing).

The challenge itself is actually quite considerable. “While random guessing is the easiest form of attack, various forms of image recognition can allow an attacker to make guesses that are better than random,” explains the Kaggle competition admins. “There is enormous diversity in the photo database (a wide variety of backgrounds, angles, poses, lighting, etc.), making accurate automatic classification difficult.”

In a 2008 paper, Philippe Golle of the Palo Alto Research Center reported that the state of the art for this specific type of recognition was a classifier that is 82.7% accurate in telling apart the images of cats and dogs used in Asirra, which is enough to render the proof engine obsolete. Despite this, the competition masters at Kaggle are challenging their community to take this success to the next level.
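Why would 82.7% per image be enough to worry a CAPTCHA? A rough back-of-the-envelope sketch helps. It rests on two assumptions that are not stated in this article: that a single Asirra challenge shows 12 images, all of which must be labeled correctly, and that the classifier's errors on each image are independent. The function name `challenge_pass_rate` is purely illustrative.

```python
def challenge_pass_rate(per_image_accuracy: float, images_per_challenge: int = 12) -> float:
    """Chance of labeling every image in one challenge correctly.

    Assumes (not from the article) that errors are independent across
    images and that a challenge contains 12 images.
    """
    return per_image_accuracy ** images_per_challenge


if __name__ == "__main__":
    # Accuracy figures quoted in the article, plus coin-flip guessing.
    for label, accuracy in [
        ("random guessing", 0.5),
        ("2008 classifier", 0.827),
        ("current top entry", 0.967),
    ]:
        rate = challenge_pass_rate(accuracy)
        print(f"{label}: {rate:.4%} of challenges passed")
```

Under these assumptions, coin-flip guessing passes about one challenge in 4,096, while an 82.7% classifier passes roughly one in ten, which a bot can simply retry until it gets through.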

“We have created this contest to benchmark the latest computer vision and deep learning approaches to this problem. Can you crack the CAPTCHA? Can you improve the state of the art? Can you create lasting peace between cats and dogs? Okay, we’ll settle for the former.”

The competition has gotten off to a strong start since it launched on September 25th. With 30 individuals registered and 78 entries at this article’s publishing time, the top entry seems to have shattered the 82.7% mark with an accuracy score of 96.7% within two competition entries. The next highest score is 85.7%, raising some questions about the methods used by the leading scorer.

While the Kaggle community is generally good-spirited in its competition, the concern of cheating is always present. “This particular competition is not appealing to me because it will be a ‘try to catch a cheater’ competition,” wrote one community member in the Kaggle forums.

The competition is scheduled to run over the next three months, ending on Saturday, February 1, 2014.
