CAPTCHA: The story behind those squiggly computer letters

  • Article by: STACEY BURLING , Philadelphia Inquirer
  • Updated: July 24, 2012 - 5:45 PM

Those little puzzles that help distinguish humans from 'bots' indeed are getting harder.


A screen grab from the website. In the ReCAPTCHA program, offered free by Google, users have to decipher two words, one from an old book.

If you use the Web, you have probably encountered an annoying invention called a CAPTCHA.

CAPTCHAs are the squished-up, stretched and squiggled, color-blotched collections of letters that often must be deciphered before sending an e-mail, posting a comment, or buying a ticket.

Is that an "i" or an "l"? you wonder. A zero or the letter "O"? Maybe you see three letters where it seems there should only be two. You tilt your head. You scoot your chair back and squint. You wonder if you need new glasses.

You might also wonder if these things are getting harder -- maybe too hard for people with aging eyes and brains.

The CAPTCHA was created at Carnegie Mellon University in 2000. The name is short for Completely Automated Public Turing test to tell Computers and Humans Apart. Websites need CAPTCHAs to guard against the "bots" of spammers and other computer underworld types.

"Anybody can write a program to sign up for millions of accounts, and the idea was to prevent that," said Luis von Ahn, a Carnegie Mellon professor who was part of the CAPTCHA team. The little puzzles work because computers are not as good as humans at reading distorted text. Google says that people are solving 200 million CAPTCHAs a day.

Over time, though, the bad guys' computers have been getting smarter and, well, people have not. The CAPTCHAs have to get harder for users, because they're easier for the computers.

"It's an arms race between site owners and spammers; users lose," said Jeremy Elson, a researcher at Microsoft Research who has developed a CAPTCHA called Asirra. It uses pictures of dogs and cats.

Von Ahn said there were now "probably hundreds" of different kinds of CAPTCHAs. He worked on one of the biggies, ReCAPTCHA. Google bought that one and now offers it for free. Users have to decipher two words for ReCAPTCHA. One of them, usually the easier one, is lifted from an old book. A computerized scanner has failed to read it properly, and ReCAPTCHA users get a chance to do the job right, thereby helping Google digitize books.

Von Ahn said he thinks some kinds of CAPTCHA have been getting harder. ReCAPTCHA is harder than it was in 2000, but it has been at about the same difficulty level for the past two years. On average, he said, people spend nine seconds solving a ReCAPTCHA, and 92 percent of them get it right. In 2000, the success rate was 97 percent. The letters will be made more distorted when too many spammers start getting in.

Von Ahn said he did not know how many people give up when they see a hard CAPTCHA or ask for new words. He also did not know whether older people had more trouble than young, but there's reason to wonder.

Robert Sergott, a neuro-ophthalmologist at Wills Eye Hospital in Philadelphia, said seniors were more likely to have cataracts, glaucoma and macular degeneration -- eye diseases that can make vision blurry, especially when there is low contrast between letters and their background. Older people read best when there's high contrast and more space between letters, pretty much the opposite of what some CAPTCHAs offer.

"A lot of younger people have visual problems too," Sergott said. "I've had errors doing it. I think everybody has. How are you going to balance security without making this an impossible task for certain individuals?"

Rachel Greenstadt, a computer-science professor at Drexel University who specializes in the intersection between artificial intelligence and security, said there were audio alternatives to the written CAPTCHAs. ReCAPTCHA's audio version uses spoken words and a lot of background noise. The audio puzzles are "even harder to solve, and they're easier to break," she said.

In 2009, Harry Hochheiser, an assistant professor of biomedical informatics at the University of Pittsburgh, did a small study of audio ReCAPTCHAs. It involved five blind people, including one with some residual vision. They got the audio CAPTCHAs right 45 percent of the time, and it took them 65 seconds to complete the task.

He says he's not sure what the solution is, but he wonders whether some websites need so much security. "It's quite possible that there are people out there who are getting discouraged by the difficulty," he said.
