re-Captcha, nanojobs and GWAP

February 20, 2011

Probably the cleverest idea I have ever heard of.

This is not new, but it keeps astonishing people when I tell them about it: did you know that captchas help digitize books? And they do it very well.

A captcha seen on the Facebook registration page

You all know about captchas: images containing words that you are forced to type to prove you are a human and not a robot when performing various actions on the Internet (create an account, write a comment…), in order to fight spam. Thousands of people decode them every day. Every one of them makes a small mental effort to read the words.

And this is where the guys from the re-Captcha project had a brilliant idea: what if these thousands of mental efforts could be used to actually do something useful? Like helping to digitize books?

Today, many organizations are scanning old books and converting them into a digital format. The OCR software that turns the scanned image into digital text is sometimes unable to do its job correctly (however sophisticated the software may be). re-Captcha uses the human brain to decode the words the computer is not sure about:

What you see when you look at a re-Captcha is two words.
Of those two words, one is known by the server. The other is a word the computer knows it did not manage to read properly.
Whether you are a human or not is judged on the first word, and what you type for the second word is used to decode it.
We can then imagine that if a certain number of users give the same reading for a given unknown word, it is most likely the right transcription.
Using this technique, the system is able to digitize texts with higher accuracy than most other OCR systems.
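
To make the steps above concrete, here is a minimal sketch in Python. It is only my guess at the general idea, not the actual re-Captcha implementation: the function names, the sample words and the threshold of three matching answers are all assumptions made for illustration.

```python
from collections import Counter


def check_human(control_answer: str, control_truth: str) -> bool:
    """Judge the visitor only on the word the server already knows."""
    return control_answer.strip().lower() == control_truth.strip().lower()


def record_guess(votes: Counter, guess: str) -> None:
    """Store one visitor's reading of the unknown word."""
    votes[guess.strip().lower()] += 1


def resolved_word(votes: Counter, min_votes: int = 3):
    """Return the most common reading once enough visitors agree, else None.

    min_votes is a hypothetical threshold, not a documented value.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Each submission is (answer for the known word, answer for the unknown word).
submissions = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("mourning", "upon"),  # fails the known word, so this guess is ignored
    ("morning", "apon"),
    ("morning", "upon"),
]

votes = Counter()
for control_answer, unknown_guess in submissions:
    if check_human(control_answer, "morning"):
        record_guess(votes, unknown_guess)

print(resolved_word(votes))  # prints "upon"
```

The point of the design is that a guess for the unknown word only counts when the same visitor got the known word right, so a robot typing random text does not pollute the votes.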

The company behind re-Captcha, its data and its market share were acquired by Google in 2009. (What else would you have imagined?)

To me this system is brilliant: it solves a problem by dividing it into tasks so simple that they can be carried out by people who don’t even notice that they are working. (And what’s even nicer with this one is that it helps fight spam and digitize books, two great causes.)

nano jobs

I don’t know if there is another term, but I call these nano jobs.

Let us take another example of a nano job: in 2006 a professor released a fun tool in which you played with a random partner: an image was displayed, and your goal was to find words describing this image in common with the other, remote player. Of course, you quickly realize that this was only done to help label the image base: today, unlike a human, a machine has difficulty understanding what an image represents (image recognition). The “find common words to improve your score” rule is just an incentive to gather a lot of data. Google did the same to help label its own image base.

Playing the ESP game with a random player, finding common labels for a given image.
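
The matching rule of the ESP game can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the real game's code: the names common_labels, play_round and image_labels, the image id and the "one point per shared word" scoring are all made up for the example.

```python
def normalize(label: str) -> str:
    """Compare labels without regard to case or surrounding spaces."""
    return label.strip().lower()


def common_labels(guesses_a, guesses_b):
    """Return the words both players typed, in player A's order."""
    seen_b = {normalize(g) for g in guesses_b}
    matches = []
    for guess in guesses_a:
        label = normalize(guess)
        if label in seen_b and label not in matches:
            matches.append(label)
    return matches


# Hypothetical store: image id -> set of labels two players agreed on.
image_labels = {}


def play_round(image_id, guesses_a, guesses_b):
    """Keep the agreed labels for the image and return the round's score."""
    matches = common_labels(guesses_a, guesses_b)
    image_labels.setdefault(image_id, set()).update(matches)
    return len(matches)  # assumed scoring: one point per shared word


score = play_round("img-001", ["Dog", "grass", "ball"], ["puppy", "ball", "dog "])
print(score)                    # prints 2
print(image_labels["img-001"])  # the set containing "dog" and "ball"
```

Only words typed independently by both players are kept, which is what makes the labels trustworthy: two strangers who cannot talk to each other are unlikely to agree on a word that does not describe the image.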

This leads us to another important point in nano jobs: game mechanics.

You cannot force people into doing small tasks; they have to do them of their own accord. In the case of re-Captcha, they understand the need to fight spam, so they accept the task. In the case of the ESP game, they want to get the best score, or maybe to have fun with a random web user (this reminds me of Chatroulette).

These games are called games with a purpose (GWAP). Imagine the workforce represented by the millions of people farming like zombies on Farmville. (Unfortunately, Farmville’s business model is more about selling your data and selling stupid virtual stuff than about making you do nano jobs.) So when we hear about Google investing in social game companies, I think nano jobs are part of its motivation (not the only one, of course).

My conclusion

To conclude, I think this decentralized and effortless way of solving problems is extremely powerful. Once again, divide and conquer seems to be the strategy to adopt, even for problems that don’t seem scalable.

Some more examples

gwap.com seems to specialize in Games With A Purpose. You can play to help with tagging images and music, finding synonyms, tracing images, tagging emotions in images, judging image similarity and labeling videos.
GOOG-411: Google opened a voice search phone service to provide search results by phone. It seems that Google’s goal was to gather a lot of voice recordings to improve its speech recognition engine (that is, to provide data for machine learning).
Similarly, Picasa performs face recognition, but it is not perfect and you have to help it by tagging your family and friends in your pictures. The more you help it, the more accurate it becomes, and on a larger scale, the more training data Google gathers.
Google, once again, provides a free Translator Toolkit to help with sentence-by-sentence translation. The tool is free, but I bet that by using it we are feeding Google with translation data.

On a different note, Amazon provides an online service called Amazon Mechanical Turk. It links nano job providers with a wide base of users who do small tasks for money. I have heard that many companies use this platform to get Human Intelligence Tasks performed.