CAPTCHA Gets Useful


An excellent example of something that leverages a tool that already exists and makes it useful — CAPTCHA forms. AP writes from Pittsburgh:

Researchers estimate that about 60 million of those nonsensical jumbles are solved every day around the world, taking an average of about 10 seconds each to decipher and type in.

Instead of wasting time typing in random letters and numbers, Carnegie Mellon researchers have come up with a way for people to type in snippets of books to put their time to good use, confirm they are not machines and help speed up the process of getting searchable texts online.

“Humanity is wasting 150,000 hours every day on these,” said Luis von Ahn, an assistant professor of computer science at Carnegie Mellon. He helped develop the CAPTCHAs about seven years ago. “Is there any way in which we can use this human time for something good for humanity, do 10 seconds of useful work for humanity?”
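A quick back-of-envelope check of those numbers, using only the two figures AP quotes (60 million CAPTCHAs a day, about 10 seconds each). The function name is mine, purely for illustration:

```python
# Figures quoted in the AP piece above.
CAPTCHAS_PER_DAY = 60_000_000
SECONDS_EACH = 10


def hours_spent_per_day(count: int = CAPTCHAS_PER_DAY,
                        seconds_each: float = SECONDS_EACH) -> float:
    """Total human time spent on CAPTCHAs per day, in hours."""
    return count * seconds_each / 3600


print(f"{hours_spent_per_day():,.0f} hours per day")
```

That works out to roughly 167,000 hours a day, the same ballpark as the 150,000 von Ahn cites; presumably his figure rests on slightly different or rounded inputs.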

The project, reCAPTCHA, is using people’s deciphering to go through those books being digitized by the Internet Archive that can’t be converted using ordinary OCR, where the results come out like this:

Captcha2

Those words are sent to CAPTCHAs and the results are then fed back into the scanning engine. Here’s the neat bit, though, as explained on the website:

But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
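The pairing trick described there can be sketched in a few lines of Python. This is only my reading of the description, not reCAPTCHA’s actual code: the names (`UnknownWord`, `submit`, `consensus`) and the thresholds (at least 3 votes, 75 percent agreement) are assumptions made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnknownWord:
    """A scanned word OCR could not read, plus the human guesses for it."""
    image_id: str
    votes: Counter = field(default_factory=Counter)


def normalize(text: str) -> str:
    """Compare answers case-insensitively, ignoring stray whitespace."""
    return text.strip().lower()


def submit(control_answer: str, control_guess: str,
           unknown: UnknownWord, unknown_guess: str) -> bool:
    """Check the known (control) word; only if it is solved correctly
    do we trust, and record, the guess for the unknown word."""
    if normalize(control_guess) != normalize(control_answer):
        return False  # failed the CAPTCHA; discard the other guess
    unknown.votes[normalize(unknown_guess)] += 1
    return True


def consensus(unknown: UnknownWord, min_votes: int = 3,
              min_share: float = 0.75) -> Optional[str]:
    """Return the agreed reading once enough people concur, else None.
    Both thresholds are assumed values, not taken from the project."""
    total = sum(unknown.votes.values())
    if total < min_votes:
        return None
    answer, count = unknown.votes.most_common(1)[0]
    return answer if count / total >= min_share else None
```

Each call to `submit` stands for one person solving one two-word puzzle; `consensus` is the “number of other people” step, and it returns nothing until enough of them agree on the same reading. Under these assumptions, a single wrong answer for the unknown word just becomes one outvoted guess.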

Which I think is kind of neat. The only problem might come if people know this and try to mess with the system by getting one word right and the other wrong. But how would they know which one is which?

Tracking People With A Cellphone

Can services which allow you to track another person’s whereabouts be abused to monitor the movements of loved ones, employees etc without their knowledge?

David Brake of Blog.org cites an article on Korea’s OhmyNews.com site that says yes. As he points out, there are plenty of companies that offer this kind of tracking with built-in safeguards to ensure the person being tracked has given his/her permission. In the UK there’s Verilocation and Where RU, for example.

But the OhmyNews article would seem to confirm that such safeguards are easily bypassed. The article, written by Jennifer Park, an OhmyNews intern about to begin her freshman year at Carnegie Mellon University, points to two cases in Korea she says illustrate the relative ease with which folk can monitor people without their knowledge.

One involves a woman subscribing to a location-based service without her boyfriend’s knowledge, finding and entering the correct PIN code to register for the ‘search friend’ service. She was then able to “trace her boyfriend block by block” to an accuracy of 10 metres. That, Jennifer says, was enough to tell that the woman’s boyfriend was at a specific bar.

Jennifer Park also points to a recent case at Samsung, the Korean conglomerate, where civic groups allege that nine employees of Samsung SDI who were trying to set up a labor union were placed under surveillance by managers who hacked into their cell phones. According to their attorney, Jennifer writes, the hacking was done by finding a cell phone’s identification number and using it to clone the phone. The hacker was then able to subscribe the clone to the tracking service.

According to the Korean daily Chosun Ilbo, prosecutors began investigating the case in July. Korea also prosecuted a group earlier this year which had “illegally copied the phones of the female employees of an entertainment establishment and put them under surveillance after secretly installing location-tracking systems”, the newspaper said.

Homeland Virus Alerts – What Happened?

The big anti-virus vendors often stand accused (rightly) of exaggerating the danger and impact of viruses. That’s not surprising: they make money out of protecting people from viruses. But why would the U.S. government do it?

Here’s a great piece by Mary Landesman of about.com complaining about US-CERT, a newly formed partnership between the U.S. Department of Homeland Security’s National Cyber Security Division and the CERT Coordination Center (CERT/CC) run by Carnegie Mellon University. She first quotes their blurb (“We have taken great care to be accurate, fair, and honest about the security risks you face, and we feel a tremendous professional obligation to bring you the best, most trustworthy advice we can to help you protect your systems”) and then their first alert (TA04-028A), which was sent out twice: “MyDoom.B Rapidly Spreading”.

Er, no. MyDoom.A, the original version, was big. MyDoom.B, in her words, is “barely a blip on the radar”. Here’s the data so far:

  • Sophos: er, one copy.
  • Messagelabs: er, 7 copies.
  • Trend Micro: er, 1 copy.

You get the idea. MyDoom.A was big. MyDoom.B is not. So what went wrong? Well, it’s early days, so perhaps we can put it down to teething troubles. But it’s not that simple. What I find a bit disturbing is that US-CERT, it appears, have not so much corrected their error as pretended it never happened. The original, incorrect alerts can now be found only on other sites (Google search), while US-CERT itself carries only an ‘updated’ version (without the ‘rapidly spreading’ bit). Good that they’ve realised their error, but they don’t seem to be acknowledging it: the revision history for this report refers only to a version on Feb 2 that “Updated hosts file and www.microsoft.com information, changed heading formats”. Nothing about “removing misleading and horribly incorrect information about spread of virus”. From where I’m sitting (and I may be wrong here), this looks like someone has tried to forget the original reports ever existed.

There are, quite obviously, a few problems with this. What happens to all those folk who have acted on the original reports? I can see it posted at more than 300 sites, where presumably people are cowering under their desks, switching off computers, and wearing gas-masks. How are these people going to know the original report was wrong if you pretend it never existed?

It’s all about credibility. Commercial anti-virus firms do a good job of analysing viruses and a slightly less good job of quickly updating your software so you don’t get infected. They also try to give an accurate idea of how far and how fast the virus is spreading. But do we believe them when they put out press releases saying how much damage viruses cause? Not usually, because we know these folk make money based on how big the problem is. The whole point of something like US-CERT is to bring some impartiality to the scene. But that’s not going to work if a) the original reports are horribly wrong and b) the error is compounded by not ’fessing up to it and letting people know what you’ve corrected.

I’ve sought clarification from US-CERT.