Google Replaces CAPTCHA With reCAPTCHA, A More Effective Way To Decide Who Is Human
AUDIE CORNISH, HOST:
If you've ever signed up for anything online, you've probably taken a CAPTCHA test, an automated test to distinguish humans from machines. Click on all the images that contain a stop sign, check the boxes that show cats. But as Jacob Goldstein from our Planet Money podcast reports, the designers of the test have a problem. Machines are getting so smart that they'll soon be better than humans at passing the test.
JACOB GOLDSTEIN, BYLINE: The most popular version of the I-am-not-a-robot test is made by Google. It's called reCAPTCHA. And the engineer who runs Google's reCAPTCHA team is named Aaron Malenfant. One day a few years ago, he was telling his boss about all the things his team was trying to do to stay ahead of the machines. But eventually, Malenfant just had to break it to her: pretty soon, the bots will be able to solve any challenge.
AARON MALENFANT: And I mentioned to her that in the next three to five years, the current challenges are no longer going to be working. We need to move to a new system. And so she turned to me and just asked me, why aren't you doing that?
GOLDSTEIN: So he did. Malenfant and his team started building a new kind of reCAPTCHA where there is no test at all - not even one of those boxes you check that says I am not a robot. Google released this new version last year. Sites are still in the process of switching over.
In the new version, there is just a little notice at the bottom of the page. If you don't see that, you don't even know reCAPTCHA's analyzing you to decide if you're a human. The way it works is reCAPTCHA does the analysis and sends a score back to the website. That score is the probability that you're a human. Then, it's up to the website what to do. Maybe they make you log in again; if you're submitting a comment or a review, maybe they send it to moderation.
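As a rough illustration of that last step, here is a minimal sketch of how a website might act on the score it gets back. It assumes the score is a number between 0 and 1, with higher meaning more likely human. The function name, the action names and the 0.5 cutoff are all hypothetical, chosen only for this example; they are not part of Google's service, and each site would pick its own policy.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical responses a site might choose; not part of reCAPTCHA."""
    ALLOW = "allow"
    MODERATE = "send to moderation"
    REAUTHENTICATE = "ask the user to log in again"


def choose_action(score: float, is_submission: bool, threshold: float = 0.5) -> Action:
    """Decide what the site does with a visitor, given a human-probability score.

    score         -- assumed to be between 0 and 1, higher meaning more likely human
    is_submission -- True if the visitor is posting a comment or a review
    threshold     -- hypothetical cutoff, set by the site rather than by Google
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= threshold:
        # Looks human enough: let the request through untouched.
        return Action.ALLOW
    # Low score: hold comments and reviews for review, otherwise ask for a fresh login.
    return Action.MODERATE if is_submission else Action.REAUTHENTICATE
```

In this sketch, a comment that arrives with a score of 0.2 would be held for moderation, while the same score on an ordinary page visit would prompt another login.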
And now, there's a big question here that Google hasn't really answered publicly. Without the test, how is Google estimating the probability that you're a human? I asked Aaron Malenfant, the Google engineer.
MALENFANT: The reason we don't say too much is that we do have adversaries trying to beat us at all times. We do say publicly that we adapt to a particular site and behavior for that site.
GOLDSTEIN: In other words, for each site that installs this new version of reCAPTCHA, Google's computers analyze specific behavior for that site. It might include how users move the mouse or scroll down the page. And then, every time a user comes to that site, the machines say - is this user doing what a person normally does on this site, or are they acting weird? So the test works differently for every site.
MALENFANT: Just because you can get a good score on one website, if you're an attacker, doesn't mean that you can get a good score on all the websites.
GOLDSTEIN: We also know that in older versions of reCAPTCHA, Google used information about whether a user had visited Google sites and was logged into a Google account. Is Google still using that information in the new reCAPTCHA?
MALENFANT: I would say it matters a lot less.
GOLDSTEIN: A lot less than what?
MALENFANT: Than it used to.
GOLDSTEIN: OK. OK.
MALENFANT: Our goal is that it doesn't matter.
GOLDSTEIN: That's your endgame?
GOLDSTEIN: You're not there yet, but you're getting there.
MALENFANT: Well, I don't know if I even want to...
MALENFANT: I probably already said too much.
GOLDSTEIN: A few days after I talked with Malenfant, I spoke with a Google spokesman. And he was able to tell me that it still matters in reCAPTCHA whether or not a user has an active Google account. But he said reCAPTCHA does not use any information about what the user does on his or her account.
Jacob Goldstein, NPR News.
(SOUNDBITE OF OOFT!'S "MAZING (ORIGINAL MIX)")