Tag Archives: Scientist

Counting the Words

I’ve been looking recently at different ways that newspapers can add value to the news they produce, and one of them is using technology to better mine the information that’s available to bring out themes and nuances that might otherwise be lost. But does it always work?

The most popular page on the WSJ.com website at the moment is Barack Obama’s speech, which has dozens of comments added to it (not all of them illuminating, but that’s another story). What intrigued me was the text analysis box embedded in the text.

Click on that link and you see a sort of tag cloud of words and how frequently they appear in the text of the piece itself. Mouse over a word and a popup tells you how many times Obama used the word. “Black,” for example, appears 38 times; “white” appears only 29. That’s nearly 25% fewer times.
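
For the curious, a counter of this kind takes only a few lines. What follows is a minimal sketch, not the Journal's actual tool: the function names and the sample sentence are my own, and the only figures taken from the post are the 38 and 29.

```python
import re
from collections import Counter


def word_counts(text):
    """Count case-insensitive occurrences of each word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def percent_fewer(larger, smaller):
    """Return how much smaller `smaller` is than `larger`, as a percentage."""
    return 100 * (larger - smaller) / larger


# Hypothetical sample text, not the speech itself.
sample = "Black and white. White and black and black."
counts = word_counts(sample)
print(counts["black"], counts["white"])  # 3 2

# The counts quoted above: 38 versus 29.
print(round(percent_fewer(38, 29), 1))  # 23.7
```

Which rather makes the point: the counting is the easy part.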

Interesting, but useful? My gut reaction is that it cheapens a remarkable speech–remarkable not because of its views, but remarkable because it’s a piece of oratory that could have been uttered 10, 20, 50, maybe even 100 years ago and still be understood.

My point? Analyzing a speech using a simple counter is not only pretty pointless–does the fact he said ‘black’ more times than ‘white’ tell us anything? What about the words he didn’t use?–but it paves the way to speechwriters running their own text analysis over speeches before they’re spoken. “Hey, Bob! We need to put more ‘whites’ in there otherwise people are going to freak out!” “OK how about mentioning you were in White Plains a couple of times last year?”

Maybe this already happens. But oratory is an art form: it doesn’t succumb to analysis, just as efforts to subject Shakespeare to text analysis don’t really tell us very much about Shakespeare.

The Journal is just messing around, of course, experimenting with what it can to see what might work. We’re merely watching a small episode in newspapers trying to be relevant. And it should be applauded for doing so. But I really hope that something more substantial and smart will come along, because this kind of thing not only misses the mark, but is in danger of quickly becoming absurd.

Perhaps more important, it fails to really add value to the data. Without any analysis of the frequency of words, there’s not really much one can say about the exercise except, maybe, “hmmm.” Compare that with a Canadian research project a couple of years back which developed algorithms to measure spin in the 2006 election there. They looked at politicians’ use of particular words: “exception words” such as “however” and “unless,” for example, and the decreased use of personal pronouns (I, we, me, us), which might imply the speaker was distancing him- or herself from what was being said.
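
To give a flavour of what measuring such markers might involve, here is a rough sketch. It is emphatically not the researchers' algorithm, which the post doesn't describe: the two word lists contain only the examples quoted above, and the per-100-words rate and the function name are my own assumptions.

```python
import re

# Only the examples mentioned in the post; a real study would use fuller lists.
EXCEPTION_WORDS = {"however", "unless"}
PERSONAL_PRONOUNS = {"i", "we", "me", "us"}


def marker_rates(text):
    """Return exception-word and personal-pronoun rates per 100 words."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {"exception": 0.0, "pronoun": 0.0}
    exception = sum(w in EXCEPTION_WORDS for w in words)
    pronoun = sum(w in PERSONAL_PRONOUNS for w in words)
    return {
        "exception": 100 * exception / total,
        "pronoun": 100 * pronoun / total,
    }


# Hypothetical ten-word sentence, invented for illustration.
print(marker_rates("I said we would help however we can unless blocked"))
# {'exception': 20.0, 'pronoun': 30.0}
```

Comparing rates like these between speakers, rather than raw counts, is at least a step up from a tag cloud.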

That sounds smart, but was it revealing? The New Scientist, writing in January 2006, said the researchers concluded that the incumbent, Prime Minister Paul Martin, of the Liberal Party, spun “dramatically more than Conservative Party leader, Stephen Harper, and the New Democratic Party leader, Jack Layton.” Harper, needless to say, won the election.

Oh, and in case you’re interested, Shakespeare used the word “black” 174 times in his oeuvre, according to Open Source Shakespeare, and “white” only 148, 15% fewer occurrences. Clearly a story there.

How To Infect An Airport

Could it be possible to use Radio Frequency ID tags, or RFID, to transmit viruses? Some researchers reckon so. Unstrung reports that in a paper presented at the Pervasive Computing and Communications Conference in Pisa, Italy, researchers from Vrije Universiteit in Amsterdam, led by Andrew Tanenbaum, show just how susceptible radio-frequency tags may be to malware. “Up until now, everyone working on RFID technology has tacitly assumed that the mere act of scanning an RFID tag cannot modify backend software, and certainly not in a malicious way,” the paper’s authors write. “Unfortunately, they are wrong.”

According to The New Scientist, the Vrije Universiteit team found that compact malicious code could be written to RFID tags by replacing a tag’s normal identification code with a carefully written message. This could in turn exploit bugs in a computer connected to an RFID reader. This made it possible, the magazine says, to spread a self-replicating computer worm capable of infecting other compatible, and rewritable, RFID tags.
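
The reports quoted here don't say which bugs are involved, so the following is only an illustration of the general idea, assuming the weakness is a backend that pastes tag contents straight into a database query. The table, tag values and function names are all invented. The second function shows the usual remedy: pass the tag contents as a parameter so the database treats them purely as data.

```python
import sqlite3


def lookup_unsafe(conn, tag_data):
    """Look up an item by pasting the tag contents into the SQL text (the bug)."""
    query = "SELECT name FROM items WHERE tag_id = '" + tag_data + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, tag_data):
    """Look up an item using a placeholder, so tag contents stay plain data."""
    return conn.execute(
        "SELECT name FROM items WHERE tag_id = ?", (tag_data,)
    ).fetchall()


# Hypothetical in-memory inventory standing in for a backend database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (tag_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A1", "peanut butter"), ("B2", "jam")],
)

# A tag whose "identification code" is really a fragment of query text.
crafted = "A1' OR '1'='1"

print(len(lookup_unsafe(conn, crafted)))  # 2: the query's meaning has changed
print(len(lookup_safe(conn, crafted)))    # 0: no tag has that literal ID
```

Nothing here replicates anything, of course. It only shows why "scanning a tag can't change the backend" is an assumption rather than a guarantee.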

An RFID tag is small, roughly the size of a grain of rice, the New Scientist says, and contains a tiny chip and radio transmitter capable of sending a unique identification code over a short distance to a receiver and a connected computer. The tags are widely used in supermarkets, warehouses, pet tracking and toll collection. But the technology is still in the early stages of development, which leaves it vulnerable. Until now, however, it was thought the tags’ small internal memory would make them impossible to infect. Not so, say the researchers.

So what would happen, exactly? An RFID virus would find its way into the backend databases used by the RFID software. The paper, Unstrung says, outlines three scenarios: a prankster who replaces an RFID tag on a jar of peanut butter with an infected tag to infect a supermarket chain’s database; a subdermal (i.e., under-the-skin) RFID tag on a pet used to upload a virus into a veterinarian or ASPCA computer system; and, most alarmingly, a radio-frequency bag tag used to infect an airport baggage-handling system. A virus in an airport database could re-infect other bags as they are scanned, which in turn could spread the virus to hub airports as the traveler changes planes.

So how likely is this? Not very, Unstrung quotes Dan Mullen, executive director of AIM Global, a trade association for the barcode and RFID industries, as saying. “If you’re looking at an airport baggage system, for instance, you have to know what sort of tag’s being used, the structure of the data being collected, and what the scanners are set up to gather,” he explains. Red Herring quotes Kevin Ashton, vice president of marketing for ThingMagic, a Cambridge, Massachusetts-based designer of reading devices for RFID systems, as saying the paper was highly theoretical and the theoretical RFID viruses could be damaging only to an “incredibly badly designed system.” Hey, that sounds a bit like a PC.

But he does make a good point: because RFID systems are custom designed, a hacker would have to know a lot about the system to be able to infect it. But that doesn’t mean it can’t be done, and it doesn’t mean it won’t get easier to infect. As RFID becomes more widespread, off-the-shelf solutions are going to become more common. And besides, what will stop a disgruntled worker from infecting a system he is using? Or an attacker obtaining some tags and stealing a reader, say, and then reverse engineering the RFID target?

My instinct would be to take these guys seriously. As with Bluetooth security issues such as Bluesnarfing, the tendency is for the industry itself not to take security seriously until someone smarter than them comes along and shows them why they should do.

News: Microsoft Takes on Google’s Customisable News

Microsoft is taking on Google, at least in news. The New Scientist says Microsoft is testing a news-gathering web site that tailors the stories selected to individual users. Once MSN Newsbot is fully functional, Microsoft says the site will personalise results within 10 minutes of a user starting to browse.
 
Microsoft is not revealing exactly how its site will work. But experts say there are several possible types of algorithm that could be used. One is similar to those Amazon.com uses to recommend additional books a buyer might like. This algorithm analyses the other choices of people who have already bought the first book. A news site would instead group articles according to the reading patterns of previous users.
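
Since Microsoft isn't saying how Newsbot works, the following is only a sketch of the "people who read this also read that" idea the experts describe, assuming nothing more than a list of each reader's articles. The function names and the sample reading histories are invented for illustration.

```python
from collections import Counter
from itertools import permutations


def build_co_read_counts(histories):
    """For each article, count how often every other article shares a reader."""
    co_counts = {}
    for articles in histories:
        # set() so a reader who opened a story twice is only counted once.
        for a, b in permutations(set(articles), 2):
            co_counts.setdefault(a, Counter())[b] += 1
    return co_counts


def recommend(co_counts, article, n=3):
    """Return up to n articles most often read alongside `article`."""
    return [other for other, _ in co_counts.get(article, Counter()).most_common(n)]


# Hypothetical reading histories, one list per previous user.
histories = [
    ["election", "budget", "sports"],
    ["election", "budget"],
    ["election", "budget", "weather"],
    ["sports", "weather"],
]
co_counts = build_co_read_counts(histories)
print(recommend(co_counts, "election", n=1))  # ['budget']
```

A real system would presumably weight recent reading more heavily and cope with far more data, but the grouping principle is the same.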

News: Beware Your GSM Phone

As pointed out by OnlineJournalism.com, the daily news Weblog of the USC Annenberg Online Journalism Review, there’s a problem with your GSM phone. An Israeli scientist and his team, Reuters reports, have found a way to break into mobile phone calls, enabling them to learn the calling party’s identity and even listen to the conversation. The call could be zeroed in on even at the ringing stage, and overheard from that point on.
Yikes. Mind you, I always assumed the spooks were monitoring my phone calls anyway. What a boring life they must lead.

News: Sony Goes It Alone, Again

 Sony, as usual, is developing its own version of something we thought everyone else had agreed on. This time it’s Bluetooth. The New Scientist says that Sony’s Interaction Laboratory in Tokyo is working on “point-and-connect” technology, a camera-based system that lets users instantly transfer data from a laptop or handheld computer to a device in close proximity connected to the same wireless network.
 
 
Gaze-Link uses the laptop’s camera to read a code displayed on a small sticker attached to each device. Software running on the laptop then automatically locates the device on the network. Hmm. I know Bluetooth is not working great right now, but as more and more devices have it embedded, I believe it may end up working out for us. The only advantage I can see for this technology is when one Bluetooth device won’t recognise, or ‘find’ another, even when it is sitting right in front of you.

News: Don’t Laugh, Your Email’s Coming

 
Not sure whether to laugh or cry at this one. Or tiptoe quietly away. Researchers at Australia’s Monash University, the New Scientist reports, are working on software that would automatically log you onto the nearest computer by listening out for your voice, or laugh, or footsteps. Microphones on each computer, Rachel Nowak writes, would pick up a person’s voice, or listen for familiar footsteps coming or going. The software would then recognise them and calculate where they are, using flocks of ‘intelligent agents’ – pieces of computer code that move from computer to computer. “The agents,” she writes, “close in on those computers where the person’s voice is loudest, until they pinpoint the nearest one.”
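
The "closing in" part can be pictured as a simple hill climb. This is only a guess at the idea, not the Monash software: the network layout, the loudness readings and the function name are all made up, and it assumes each computer knows how loudly it hears you.

```python
def nearest_computer(neighbours, loudness, start):
    """Hop to ever-louder neighbouring computers until none is louder."""
    current = start
    while True:
        louder = [n for n in neighbours[current] if loudness[n] > loudness[current]]
        if not louder:
            return current
        # Move to the neighbour that hears the voice best.
        current = max(louder, key=lambda n: loudness[n])


# Hypothetical office: which machines are linked, and how loud each hears you.
neighbours = {
    "lobby": ["hall"],
    "hall": ["lobby", "kitchen"],
    "kitchen": ["hall"],
}
loudness = {"lobby": 0.1, "hall": 0.4, "kitchen": 0.9}

print(nearest_computer(neighbours, loudness, "lobby"))  # kitchen
```

Because each hop must be strictly louder than the last, the agent always stops somewhere, though not necessarily at the right machine if the office has awkward acoustics.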
 
The agents — or sneaky little tattletales, depending on your point of view — would, upon realising that you were heading towards the Mars Bar dispenser, deliver your email to the nearest computer, or, upon hearing your rich baritone laugh by the water cooler, administer a pithy reprimand and remind you that your expenses are horribly overdue. I’m not sure I’m ready for this kind of life. We already have an accounts department.