I’m not a huge fan of the Black Eyed Peas (I even think David Guetta killed the group), but while watching the “Just Can’t Get Enough” video, I enjoyed the work on light and noticed, for about one second (at 0:54), a very subtle shot:
This beautiful blur is called a bokeh. It appears when an out-of-focus point of light is captured by the camera. Some photographers like to play with it.
Usually, the light spots form discs on the picture, but here they are hearts. How come?
Well, the shape of a bokeh is linked to the shape of the camera’s aperture, so I bet the photographer used a heart-shaped aperture for this shot.
Of course, this effect could be computer generated: convolving the image with a heart-shaped kernel does the trick. But let’s believe it’s not a fake.
I wanted to try it at home to see if anyone could do it, and it turns out to be pretty straightforward: just carve a heart into a card and put it in front of the lens, then point the camera at some lights and manually set them out of focus. The results are pretty good.
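The convolution trick mentioned above is easy to sketch in code. Here is a minimal Python version (assuming NumPy is available; the function names are mine): a heart-shaped kernel, built from the classic implicit heart curve, is convolved with a dark frame containing a few bright point lights, so each point spreads into a heart.

```python
import numpy as np

def heart_kernel(size=51):
    # Heart-shaped aperture mask from the classic implicit curve
    # (x^2 + y^2 - 1)^3 - x^2 * y^3 < 0, normalized to sum to 1.
    y, x = np.mgrid[1.5:-1.5:size * 1j, -1.5:1.5:size * 1j]
    mask = ((x**2 + y**2 - 1)**3 - x**2 * y**3) < 0
    return mask / mask.sum()

def fake_bokeh(image, kernel):
    # FFT-based 2-D convolution, cropped back to the input size.
    s = (image.shape[0] + kernel.shape[0] - 1,
         image.shape[1] + kernel.shape[1] - 1)
    full = np.fft.irfft2(np.fft.rfft2(image, s) * np.fft.rfft2(kernel, s), s)
    r0, c0 = (kernel.shape[0] - 1) // 2, (kernel.shape[1] - 1) // 2
    return full[r0:r0 + image.shape[0], c0:c0 + image.shape[1]]

# A dark frame with three bright "city lights"
frame = np.zeros((200, 200))
for r, c in [(50, 60), (120, 150), (170, 30)]:
    frame[r, c] = 255.0

bokeh = fake_bokeh(frame, heart_kernel())  # each point becomes a heart
```

Because the kernel is normalized, the total light energy is preserved; each point is simply smeared into the aperture’s shape, which is exactly what a real out-of-focus lens does.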
This is not new, but it keeps astonishing people when I tell them about it: did you know that captchas help digitize books? And they do it very, very well.
You all know about captchas: images containing words that you are forced to type to prove you are a human and not a robot when performing various actions on the Internet (creating an account, writing a comment…), in order to fight spam. Thousands of people decode them every day, and every one of them makes a small mental effort to read the words.
And this is where the people behind the reCAPTCHA project had a brilliant idea: what if these thousands of mental efforts could be used to do something useful? Like helping to digitize books?
Today, many organizations are scanning old books and converting them into digital form. The OCR software that turns the scanned image into digital text sometimes fails to do its job correctly (however sophisticated the software may be). reCAPTCHA uses the human brain to decode the words the computer is not sure about:
What you see when you look at a reCAPTCHA is two words.
One of the two is known by the server. The other is a word the computer knows it failed to read properly.
Whether you are human is judged on the known word, and the answer you enter for the second word is used to decode it.
We can then imagine that if a certain number of users type the same answer for a given unknown word, it is most likely the correct transcription.
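That logic can be sketched in a few lines of Python. This is only my guess at the idea, not reCAPTCHA’s actual implementation: the function names and the agreement threshold are assumptions of mine.

```python
from collections import Counter

def is_human(known_word, typed):
    # The known "control" word decides whether the user is human.
    return typed.strip().lower() == known_word.strip().lower()

def consensus(answers, threshold=3):
    # `answers` holds what different users typed for one unknown word.
    # Once enough of them agree, accept that reading.
    # The threshold value here is made up, not reCAPTCHA's real one.
    counts = Counter(a.strip().lower() for a in answers)
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes >= threshold else None

print(consensus(["morning", "morning", "mourning", "morning"]))  # prints "morning"
```

One occasional misreading (“mourning”) is simply outvoted, which is why the crowd ends up more reliable than any single user.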
Using this technique, the system is able to digitize texts with higher accuracy than most other OCR systems.
The company behind reCAPTCHA, its data and its market share were acquired by Google in 2009 (who else would you have imagined?).
To me this system is brilliant: it solves a problem by dividing it into tasks so simple that they can be performed by people who don’t even notice they are working. (And what’s nicer is that this one helps fight spam and digitize books, two great causes.)
I don’t know if there is another term for it, but I call these nano jobs.
Let’s take another example of a nano job: in 2006, a professor released a fun tool, the ESP game, where you play with another, randomly chosen player: an image is displayed, and your goal is to find words describing it that your remote partner also chooses. Of course, you quickly realize that this is really a way to label the image base: today, unlike a human, a machine has difficulty understanding what an image represents (image recognition). The “find common words to improve your score” mechanic is just an incentive to gather a lot of data. Google did the same to help label its own image base.
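The scoring rule of that game boils down to a set intersection: only the labels both players typed independently are trusted. A tiny sketch (the function name is mine, not from the actual game):

```python
def agreed_labels(player_a, player_b):
    # Labels the two players independently typed for the same image;
    # only words they agree on become tags for that image.
    def normalize(words):
        return {w.strip().lower() for w in words}
    return normalize(player_a) & normalize(player_b)

tags = agreed_labels(["dog", "Beach", "ball"], ["beach", "sea", "dog"])
# tags == {"dog", "beach"}: these two labels get attached to the image
```

Requiring independent agreement is what makes the data trustworthy: a single player typing nonsense scores no points and pollutes nothing.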
This leads us to another important point in nano jobs: game mechanics.
You cannot force people into doing small tasks; they have to do them willingly. In the case of reCAPTCHA, they understand the need to fight spam, so they accept the task. In the case of the ESP game, they want to get the best score, or maybe just to have fun with a random web user (this reminds me of Chatroulette).
These games are called games with a purpose (GWAP). Imagine the workforce that the millions of people farming like zombies on FarmVille represent (unfortunately, FarmVille’s business model is more about selling your data and stupid virtual goods than making you do nano jobs). So when we hear about Google investing in social gaming companies, I think nano jobs are part of the motivation (not the only part, of course).
To conclude, I think this decentralized and effortless way of solving problems is extremely powerful. Once again, divide and conquer seems to be the strategy to adopt, even for problems that don’t seem scalable.
Some more examples
gwap.com specializes in games with a purpose. You can play to help tag images and music, find synonyms, trace images, detect emotions in images, judge image similarity, and label videos.
GOOG-411: Google opened a voice search phone service providing search results over the phone. It seems the goal was for Google to gather a lot of voice recordings to improve its speech recognition engine (i.e. to provide data for machine learning).
In a similar way, Picasa performs face recognition, but it’s not perfect and you have to help it tag your family and friends in your pictures. The more you help it, the more accurate it becomes, and on a larger scale, the more training data Google gathers.
Google, once again, provides a free Translator Toolkit to help with sentence-by-sentence translation. The tool is free, but what you may not realize is that, I bet, we are feeding Google with translation data by using it.
In the same vein, Amazon provides an online service called Amazon Mechanical Turk. It links nano-job providers with a widespread user base doing small tasks for money. I hear many companies use this platform to perform Human Intelligence Tasks.
Let me remind you that anyone can freely find your address and phone number in the phone book, and this has been possible for decades, at least in France.
So please, stop complaining about Facebook invading your privacy when it allowed apps to ask you for your phone number and address. If you want to fix your privacy leaks, start with the basics and remove yourself from phone books. Personally, I only blame Facebook for not making the sharing disclaimer clear enough.
Let’s imagine what movies could look like in the coming years. I’m not going to talk about glasses-free 3D or mega-super-HD, but about other new ideas realistically achievable with today’s technology.
Dawn of the Dead
We often say actors will be replaced by digital characters. I can assure you that today this costs a lot of money, and it will take time before these costs fall below the fees of a not-so-famous actor. Digital characters won’t replace real actors; they will replace dead ones!
We have seen movies with non-human main characters (think of Jar Jar Binks or Gollum). But today, the technology is mature enough to create virtual human main characters who do not fall into the “uncanny valley”.
But what about real actors who cannot act anymore? Or dead movie legends?
That’s why we may see new epic Star Wars episodes featuring the actors of the original trilogy. As an example, we recently saw Arnold Schwarzenegger “acting” in “Terminator Salvation”: he was entirely digital.
This raises some issues: can I decide to use Marilyn Monroe as a main character? Who owns her image today? Is this really ethical? Can the dead actor be credited under their real name?
Dynamic content (i.e. better product placement)
Pixar constantly raises the quality of its movies. I recently noticed in “Up” that some shots were internationalized: in the French version, for example, French words were written on a crate. So I assume Pixar re-rendered those shots for each locale.
Of course, this is something that can only be done with computer imagery. And what if this content weren’t static? What if it could be adjusted dynamically to the audience or the time?
From an artistic point of view that’s great: imagine a newspaper showing the headlines of the day; it improves the spectator’s immersion.
But this kind of innovation is rarely driven by art. Instead, advertisers will be able to do localized product placement and advertising.
I think two current factors will soon lead to streamed, localized content in movies:
screens will soon be fully connected to the Internet, so the content delivered can easily be adjusted by broadcasters
3D rendering operations are now cheap for broadcasters (in terms of the computing power needed).
It’s you, in the movie.
Then we could go deeper and imagine customizing the movie itself: take two or three pictures of your head, fill in a few morphological details, and enjoy a movie where the main character is you.
This is easy with computer-generated movies. But I think it’s also feasible to replace a face in a traditional movie, with technologies involving advanced face detection, realistic face reconstruction, and complex compositing to blend the two. Example (watch until the end, please):
This is not new: content customization can already be seen in video games.
Cinema may blend with video games, and the story may no longer be linear. At some point you might choose the next part of the script: save someone’s life or let him die, for example. (Remember that awesome Tipp-Ex YouTube advertisement?)
This is something I experimented with in 2003 using just Adobe Flash (in this project), and something that is starting to appear online now (using YouTube annotations or more complex systems).
All these ideas lead us to think about the definition of cinema. Is it still the artist’s vision if we can change elements of the movie?
Note: this article was drafted in November 2010; I found out today that George Lucas announced in December 2010 that he plans to use dead actors in a movie. I guess that’s why I really need to use Beansight.
Great news for my startup project Beansight: we have been selected to be part of Le Camping, the first French tech startup accelerator program, run by Silicon Sentier.
Starting January 4th, we will be working every day alongside 11 other startups at the prestigious Palais Brongniart, mentored by great people with many different backgrounds (entrepreneurs and CEOs, tech, marketing, design…).
Only 12 teams out of 164 were selected to be part of the adventure. I think it is a great opportunity for us.
Because we now have a somewhat working platform, our first job will be to polish the rough edges of our user experience and to focus on our core algorithm. I think we can expect an initial release at the end of January.
By the way, if you want to follow more closely my work at Beansight, I invite you to follow my labs.
I’m sure you know about Twitter. If you use it, you read and share links, ideas and statuses… with everyone.
Now imagine the same tool, but restricted to your team. This is StatusNet. (And it’s actually much more than that.)
At Beansight, we’ve been using StatusNet since the beginning. For example: You start working on something? Take three seconds to post it on your StatusNet. You spotted an interesting link? Share it with your team on StatusNet. You’ll be late for the meeting? StatusNet. You’ve got a problem? StatusNet.
Used well, it’s like sharing the same mind: you are aware of what the others are doing or thinking, and you can reply to them in real time. Isn’t this awesome when you work in an agile environment?
Did I mention it costs nothing? You can start using it privately with your team for free at status.net/cloud (iPhone, Android and desktop apps included).
Of course, we keep using e-mails for structured threads.
I know there are some other tools, and they may be better at doing this (Wedoist, Yammer…). So here comes the second part of this post:
To be more precise, StatusNet is an open-source microblogging platform. This means you can deploy it on your own servers without relying on a particular service provider. It is branded under your name, and you own your data. Trust me, this is something large companies are looking for (a startup founder thinks it’s OK to use Google Apps for work; now try explaining that to a big company).
StatusNet can be federated, which means that public nodes can talk to each other, together forming a global, distributed network. Moreover, it uses an open protocol. It works the same way as e-mail, where you can talk to someone who doesn’t necessarily use the same provider as you. We tend to forget how important keeping the pipes open is for innovation.
I think open protocols matter. Can’t you see the problem in the previous image? Haven’t we learned that relying on a single resource is dangerous? Apparently many don’t see the issue, and some even try to build a business on top of a closed API.
You can follow me (@steren) on identi.ca, a very popular public instance of StatusNet.
One last precision: Google Buzz also promotes open protocols. Unfortunately, they are different from StatusNet’s. I hope the two systems will become compatible in the near future. Edit: Evan (StatusNet’s creator) tells us in the comments that you can already follow a Buzz user from StatusNet and that they are working together on interoperability.