Luis von Ahn: Can You Crowdsource Without Even Knowing It? Computer programmer Luis von Ahn wondered how else the small contributions made by millions of people on the Internet could be used for the greater good. He put CAPTCHAs, the online puzzles, to work digitizing books.

Can You Crowdsource Without Even Knowing It?


GUY RAZ: It's the TED Radio Hour from NPR. I'm Guy Raz. And on the show today, the chaos and the power of collaboration. What happens when hundreds or millions of people contribute toward a singular goal? Well, there's a good chance, a very good chance that you've actually been part of a huge online project unknowingly.

But more on that later. First, if you've ever bought a concert ticket online or signed up for Gmail, there's a box that pops up with words that look kind of wavy, and you have to type in what you think those words are.

LUIS VON AHN: And the reason it's there is to make sure that you, the entity filling out the form, are actually a human and not a computer program that was written to submit the form millions of times.

RAZ: It's like the most annoying thing ever.

VON AHN: I'm to blame. I'm sorry.

RAZ: The man to blame is Luis von Ahn, inventor of CAPTCHAs. I couldn't get tickets to Beyonce this summer 'cause I couldn't figure out the CAPTCHAs.

VON AHN: Did you really want to see her?

RAZ: Yeah, and she put on another show - two other shows in Washington; they were still sold out.

VON AHN: (Chuckling) Yeah, sorry.

RAZ: So CAPTCHAs are actually designed to prevent automated computer programs from doing things like buying up all those Beyonce tickets because those programs cannot read those squiggly lines, believe it or not, as well as humans can. And so today, every single day, almost 200 million CAPTCHAs are typed into computers around the world.

VON AHN: When I first heard this - well, first, I was proud. I thought, look at the impact that my work has had. But then I started feeling bad because not only are they annoying, they're also - each time you type one, you waste about 10 seconds of your time. And if you multiply 200 million by 10 seconds, you get that humanity, as a whole, is wasting, like, 500,000 hours every day.
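As a quick check of the arithmetic quoted above, here is a minimal sketch that multiplies the two figures from the conversation (roughly 200 million CAPTCHAs a day, about 10 seconds each). The function name `wasted_hours_per_day` is made up for illustration; the exact result comes out a little above the rounded 500,000 hours mentioned on air.

```python
def wasted_hours_per_day(captchas_per_day: int, seconds_per_captcha: float) -> float:
    """Convert a daily CAPTCHA count and a per-CAPTCHA time into total hours."""
    total_seconds = captchas_per_day * seconds_per_captcha
    return total_seconds / 3600  # 3,600 seconds in an hour


# Figures as quoted in the interview (both are approximations).
hours = wasted_hours_per_day(200_000_000, 10)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```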

RAZ: Man, imagine what you could do with those 500,000 hours.

VON AHN: Yeah, so that's exactly what I was thinking about.

RAZ: Here is Luis's TED Talk.


VON AHN: But I started thinking, is there any way in which we could use this effort for something that is good for humanity. So see, here's the thing, when you're typing a CAPTCHA, during those 10 seconds, your brain is doing something amazing. Your brain is doing something that computers cannot yet do. So can we get you to do useful work for those 10 seconds?

Another way of putting it, is there some humongous problem that we cannot yet get computers to solve, but that somehow we can split into tiny 10-second chunks, such that each time somebody solves a CAPTCHA, they solve a little bit of this problem. And the answer to that is, yes, and this is what we're doing now. So what you may not know is that nowadays, when you're typing a CAPTCHA, not only are you authenticating yourself as a human, but in addition you're actually helping us to digitize books.

And the basic idea is you start with a book. You've seen those things, right? Like a book.


VON AHN: OK. So you start with a book, and then you scan it.

SYNTHESIZED VOICE: It was the best of times, it was the worst of times. It was the age...

VON AHN: The next step in the process is that the computer needs to be able to decipher all of the words in these pictures.

SYNTHESIZED VOICE: It was the season of light. It was the best of turns. It was the best of trends. It was the best of transfers.

VON AHN: That, unfortunately, is not very accurate. For older books, the ink has faded, and so the pictures look as if you have taken a photocopy of a photocopy of a photocopy of some book. So it looks pretty distorted. So computers can't read them very well, but humans can.

So what we're doing now is we're taking all of the words that the computer cannot recognize and we're getting people to read them for us while they type a CAPTCHA on the Internet. So next time you type a CAPTCHA, those words that you're typing are words that are coming directly from books that the computer could not recognize and we're using what you're entering to help digitize the books.

RAZ: And then that word, what, gets sent back to whoever is digitizing that book and it sort of automatically fills in that part of the puzzle?

VON AHN: That's exactly right. It puts it back in there and then that's it. It goes on to the next word.
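The fill-in step described here can be pictured with a small sketch. It assumes, purely for illustration, that the scanned text is a list of words in which `None` marks a word the computer could not read, and that typed answers arrive keyed by word position. The names `fill_in_words`, `scanned` and `answers` are hypothetical and do not describe the real system's internals.

```python
from typing import Dict, List, Optional


def fill_in_words(
    ocr_words: List[Optional[str]], human_answers: Dict[int, str]
) -> List[Optional[str]]:
    """Return a copy of ocr_words with unread words (None) filled from human answers."""
    filled = list(ocr_words)
    for position, typed in human_answers.items():
        # Only fill gaps; words the computer already recognized are left alone.
        if filled[position] is None:
            filled[position] = typed
    return filled


# One unread word in a scanned line, and one answer typed by a person.
scanned = ["It", "was", "the", None, "of", "times"]
answers = {3: "best"}
result = fill_in_words(scanned, answers)
print(" ".join(word if word is not None else "[?]" for word in result))
```

Each answer fills exactly one gap, after which the process moves on to the next unread word.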

RAZ: So, like, more than a billion people have helped digitize books and they have no idea?

VON AHN: Yeah. That's basically what I work on. That's my work. It's taking things that get done by millions of people and try to reuse or recycle the human mental energy towards something else.


VON AHN: The question that motivates my research is the following: If you look at humanity's large-scale achievements - these really big things that humanity has gotten together and done, like, historically, like, for example, building the pyramids of Egypt or the Panama Canal, or putting a man on the moon - there's a curious fact about them and it is that they were all done with about the same number of people. It's weird. They were all done with about 100,000 people. And the reason for that is because, before the Internet, coordinating more than 100,000 people, let alone paying them, was essentially impossible.

So the question that motivates my research is, if we can put a man on the moon with a 100,000, what can we do with a hundred million?

RAZ: The answer? It's Luis's latest project. It started about three years ago. He had just sold his second company to Google.

VON AHN: You know, I retired. I retired for, like, a day.

RAZ: I would.

VON AHN: I was going to watch a lot of TV is what I was going to do, but then I got bored.

RAZ: Which got him thinking about this question.


VON AHN: How can we get 100 million people translating the web into every major language for free? So I would like to translate all of the web, or at least most of the web, into every major language. OK, so that's...

RAZ: Why couldn't you just use, you know, language software to do it?

VON AHN: It turns out they're just not very good. You would never see a book translated by Google Translate, for example. Even when it's accurate, you don't even know whether to trust it or not because it's so inaccurate the rest of the time. You know, computers can't yet do this. This is why language translation is such a big business is because computers can't do it very well.

RAZ: And there are other hurdles.


VON AHN: The other problem that you're going to run into is a lack of motivation. How are we going to motivate people to actually translate the web for free? And this is normally - you have to pay people to do this, so how are we going to motivate them to do it for free? Now, when we were starting to think about this...

RAZ: And so what Luis came up with is a project and website called Duolingo, and here's how it works. You start by learning a language by typing in basic words, like girl...


RAZ: ...Or boy.


RAZ: But now, remember, Luis wanted to harness all that potential human energy into one huge project.

VON AHN: So you learn a language, but while you're learning, you're also helping to translate the web.


VON AHN: OK, so the way this works is whenever you're just a beginner, we give you very, very simple sentences. There is, of course, a lot of very simple sentences on the web.

SYNTHESIZED VOICE: Ella es una niña.

VON AHN: We give you very, very simple sentences, along with what each word means.

SYNTHESIZED VOICE: Ella - she - es - is - una - a - niña - girl.

VON AHN: OK, and as you translate them enough and as you see how other people translate them, you start learning the language. And as you get more and more advanced, we give you more and more complex sentences to translate.

SYNTHESIZED VOICE: (Speaking Spanish)

VON AHN: But at all times, you're learning by doing.

RAZ: OK, so you get better and better at it and then you basically give the user, like, a news article or something and you ask them to translate it.

VON AHN: Yep. That's the basic idea.

RAZ: And then, how do you know it's accurate?

VON AHN: It's basically multiple people translate the same thing and it turns out it's really accurate. We've measured and it's as accurate as a single professional translator.
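The interview does not say how the overlapping translations are combined. One simple way to picture "multiple people translate the same thing" is a majority vote, sketched below as an assumption rather than as Duolingo's actual method. The function `pick_consensus` and its `min_share` threshold are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def pick_consensus(translations: List[str], min_share: float = 0.5) -> Optional[str]:
    """Return the most common translation if more than min_share of people agree on it."""
    # Ignore differences in capitalization and surrounding whitespace.
    normalized = [t.strip().lower() for t in translations]
    if not normalized:
        return None
    best, count = Counter(normalized).most_common(1)[0]
    return best if count / len(normalized) > min_share else None


# Three learners translate "Ella es una niña"; two of them agree.
print(pick_consensus(["She is a girl", "she is a girl", "She is a child"]))
```

When no single translation clears the threshold, the sketch returns `None`, which stands in for "ask more people".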

RAZ: How fast would it take to translate all of Wikipedia into, say, five or six or 10 major languages?

VON AHN: It could be done in really a matter of weeks if we were entirely doing just Wikipedia and had enough people. So, for example, if there were a million people learning English from Spanish, for example, we could translate it in something like 80 hours.

RAZ: All of Wikipedia into Spanish?

VON AHN: Well, if there are a million people going at it, they can do it quickly.

RAZ: Do you think that the key to making this work, to getting millions, tens of millions of people to collaborate, to participate, is you have to give them something back?

VON AHN: Yeah, I think so. It's very hard to mobilize 10 million people, 100 million people. I actually think, even for nice causes, you won't get 100 million people helping to do it if you just have a good cause. I think in most cases you just - you have to give something back.


RAZ: Luis von Ahn. You can find out more about Duolingo and watch Luis's full talk at

RAZ: I've put in la femme, l'homme many many times...


RAZ: I'm pretty sure that I've translated the woman and the man in many French websites.

VON AHN: Yeah, so that actually - you probably weren't doing any useful translation there.

RAZ: Oh, sorry. I'm just trying to, you know, pitch in a little bit.

VON AHN: Oh, sorry.


UNKNOWN VOCALIST: I'll fill your CAPTCHA in; it's not case-sensitive...

Copyright © 2013 NPR. All rights reserved. Visit our website terms of use and permissions pages at for further information.

NPR transcripts are created on a rush deadline by an NPR contractor. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.