The End of Boorish Intrusion

(This is a copy of my Loose Wire Service column, produced for newspapers and other print publications.)

By Jeremy Wagstaff

One of the ironies about this new era of communications is that we’re a lot less communicative than we used to be.

Cellphones, laptops, iPhones, netbooks, smartphones, tablets, all put us in touching distance of each other. And yet, perversely, we use them as barriers to keep each other out.

Take the cellphone for example. Previously, not receiving a phone call was not really an option.

The phone would ring from down the hall, echoing through the corridors until dusty lights would go on, and the butler would shuffle his way towards it.

Of course, we were asked by the switchboard operator whether we’d take a call from Romford 230, but unless you were a crotchety old earl, or the person was calling during Gardener’s Question Time, you’d usually accept it.

Nowadays we generally know who it is who’s calling us: It tells us, on the screen of the phone. It’s called Caller ID. This enables us to decide whether or not to receive the call. And that’s where the rot sets in.

Some of us refuse to accept a call from a number we don’t recognize. It could be some weirdo, we think. Some of us will only take a call from a number we don’t recognize. We’re adventurous, or journalists sensing a scoop, or worried it may be grandma calling on the lam from Belize.

Some of us see a number from someone we know, and even then don’t take the call. Maybe we’re busy, or asleep, or watching Gardener’s Question Time.

The phone has changed from being a bit like the postman—a connection with the outside world, and not someone you usually turn away—to being just one of a dozen threads in our social web.

And, as with the other threads, we’ve been forced to develop a way to keep it from throttling us. Whereas offices would once be a constant buzz of ringing phones, now they’re more likely to be quieter places, interrupted only by the notification bells of SMS, Twitter alerts or disconnecting peripherals.

I actually think this is a good thing.

I, for one, have long since rejected the phone as an unwelcome intrusion. I won’t take calls from people who haven’t texted me first to see whether I can talk, and those people who do insist on phoning me are either my mother or someone I don’t really care for.

What has happened is that all these communications devices have erased an era that will in the future seem very odd: I call it the first telecommunications age. It was when telephones were so unique that they dominated our world and forced us to adapt to them. We allowed them to intrude because most of us had no choice.

There was no other way to reach someone else instantaneously. Telegram was the only competitor.

Now we have a choice: We can choose to communicate by text, Twitter, Facebook, Skype, instant message, email. Or not actually communicate directly at all: We can set up meetings via Outlook or Google Calendar, or share information without any preamble via delicious bookmarks or Google Reader.

Our age has decoupled the idea of communicating from the idea of sharing information. This is probably why we have such trouble knowing how to start a conversation in this new medium. When the communication channel between us is so permanent, when we know our friends are online because we can see them online, then communicating with them is not so much beginning a new conversation as picking up a new thread on one long one.

We have all come to understand this. We see each other online, we know everyone we’ll ever need to communicate with is just an @ sign away, so we all appreciate the tacit agreement that we don’t bother each other unless we really need to.

And then it’s with a short text message, or an instant message that pops up in an unobtrusive window.

In this world a ringing phone is a jarring intrusion, because it disrupts our flow, it ignores the social niceties we’ve built up to protect our permanent accessibility. It’s rude, boorish and inconsiderate.

Which was probably what people said about the introduction of the telephone. It’s only now that we realize they were right.

Nonsense Linking, Or the Rise of the Cheap Bot


I’m a big fan of The Guardian, but their auto-linking software needs some tweaking. It’s a classic example of trying to provide that extra value to data on the cheap.

My argument for a while has been that the only lasting way for traditional media to make itself competitive again is not to create more, but to create better.

In one key sense this is about injecting extra value into words: metatagging them, in short, so that other content belonging to the media—or others—adds context.

But this is not easy. Lots of people are trying it, and some are doing interesting stuff with it. But building a library of words that creates automatic links to categories within the one site, as The Guardian is doing, is not it.

Take the example above. It’s in an article written by a woman who has given up sex for a year (neat and easily sellable book idea, or what?). But in the example above, where she’s talking about her lack of love life as a young journalist (tell me about it) she mentions her dating experiences.

The Guardian’s autolinker parses the story at some point and inserts a link for the word ‘dating’ to the paper’s ‘Lifestyle and Dating’ section.

Another example appears lower in the story, where relationships are mentioned, leading inexorably to a link to the section on Relationships:


Now there’s nothing wrong with this story appearing in either of those sections (and it does), but to autolink these words to the section is meaningless. It’s out of context. It lacks context. It’s not contextual. It doesn’t add value.
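To make the complaint concrete, here is a minimal sketch in Python of the kind of context-free keyword linking I’m describing. It is my own guess at the general approach, not The Guardian’s actual software; the keyword table and the section URLs are made up for illustration.

```python
import re

# Hypothetical keyword -> section URL table; not The Guardian's real data.
SECTION_LINKS = {
    "dating": "/lifeandstyle/dating",
    "relationships": "/lifeandstyle/relationships",
}

# One pattern matching any keyword as a whole word, in any letter case.
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, SECTION_LINKS)) + r")\b",
    re.IGNORECASE,
)

def naive_autolink(text):
    """Wrap every keyword in a section link, ignoring all context."""
    def link(match):
        word = match.group(1)
        return f'<a href="{SECTION_LINKS[word.lower()]}">{word}</a>'
    return KEYWORD_RE.sub(link, text)

print(naive_autolink("My dating experiences were few."))
print(naive_autolink("carbon dating of fossils"))
```

The second call is the problem in miniature: a bot like this links “carbon dating” to the dating section just as happily, because it only sees the word and never the sentence around it.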

Indeed, it cheapens all the good linking that is going on in The Guardian, because it reduces the reader’s trust in the value of all those links.

If you as the reader start to see links all over the place to places that don’t add value to what you’re reading, pretty soon you’re going to stop seeing those links.

So, Guardian, drop the autolinking bot and spend time thinking up a better way of adding value to your content. Metadata is too valuable, too important, to leave to cheap bots.

My year without sex, by Hephzibah Anderson | Life and style | The Guardian

Protect Your Privacy With Twiglets


I really hate being asked for lots of private details just to download a product. In short: People shouldn’t have to register to try something out. An email address, yes, if absolutely necessary.

But better not: just let the person decide whether they like it. It’s the online equivalent of a salesperson shadowing you around the shop so closely that if you stop or turn around quickly they bump into you. (One assistant in Marks & Spencer the other day tailed me so closely I could smell his breath, which wasn’t pleasant, and then had the gall to signal to the cashier it was his commission when I did, without his help, choose something to buy.) I nearly put some Marks & Spencer Twiglets up his nose but that branch doesn’t sell them.

Anyway, the latest offender in this regard is Laplink, who ask for way too much personal information just to download trial versions of their products, including email address, full name, address, post code, company name. Then they do that annoying thing at the end of trying to trick you into letting them send you spam with the old Three Tick Boxes Only One of Which You Should Tick if You Don’t Want To End Up In Every Spammer’s List From Here To Kudus Trick:


Rule of thumb there is to tick the third one in the row because it’s always the opposite of the other ones. As if we’re that stupid.

The other rule of thumb is never to put anything accurate in the fields they do require you to fill out. Not even your gender. Childish? Yes, maybe, but not half as childish as their not trusting you enough to decide whether you like the product on your own terms and not fill their spamming lists.

Of course the better rule of thumb is not to have anything to do with companies that employ such intrusiveness and trickery, but we’d never do anything then.


Europe’s Top-heavy Leagues

Spanish Primera Liga (48%)
German Bundesliga (54%)
English Premier League (47%)
French Ligue 1 (47%)
Greek Ethniki Katigoria (6%)
Dutch Eredivisie (25%)
Italian Serie A (24%)

English Championship (29%)
Scottish Premier League (29%)

This doesn’t have a lot to do with technology, but it’s an excuse to play around with sparklines, Edward Tufte’s approach to feeding data into text in the form of small data-rich graphics. And they might tell us a bit about soccer, competitiveness and which country is the powerhouse of Europe. (These ones are done with Bissantz’ excellent Office plugin.)

What started me off here was the comment on the BBC website that English soccer, while strong at the top (Man U, Chelsea, Liverpool, Arsenal), drops alarmingly in quality. Is there really no competition in the English Premier League? The absence of English clubs in the final 4 of the UEFA Cup would seem to indicate it’s true.

But I thought another way of exploring it would be to grab the points gathered by each team in each of the main European leagues, and then plot them as a simple sparkline, each bar indicating the points won by each club in the table. The steepness and evenness of the sparkline gradient should give a pretty clear impression of which leagues are split between great clubs and the mediocre rest.

Visually, Spain is clearly the most competitive league (with the exception of England’s second league, the Championship, which has an impressively smooth gradient.) The German Bundesliga comes second, with the English Premier League third. All the others, frankly, look too top heavy to be regarded as having any depth (Italy doesn’t really count as it’s in such a mess at the moment.)

The figures in brackets show how many points the bottom club has as a percentage of the top club, a figure that’s not particularly useful as, for example in Greece, the bottom club Ionikos seems to have won only two games in 26.
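If you want to try the same exercise without an Office plugin, here is a minimal sketch in Python. The points list is made up for illustration, not real league data, and the function names are my own. It draws a crude text sparkline from Unicode block characters and works out the bottom-as-a-percentage-of-top figure used in the brackets above.

```python
# Illustrative only: block characters from shortest to tallest.
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(points):
    """Render one block character per club, scaled to the highest total."""
    top = max(points)
    if top <= 0:
        return BLOCKS[0] * len(points)
    chars = []
    for p in points:
        # Map 0..top onto the eight block heights.
        index = min(len(BLOCKS) - 1, int(p / top * len(BLOCKS)))
        chars.append(BLOCKS[index])
    return "".join(chars)

def bottom_as_percent_of_top(points):
    """Bottom club's points as a whole-number percentage of the top club's."""
    return round(100 * min(points) / max(points))

# Hypothetical points totals, top club first (made-up numbers).
example_league = [80, 72, 65, 60, 52, 47, 41, 40]
print(sparkline(example_league), f"({bottom_as_percent_of_top(example_league)}%)")
```

A league where the bars fall away gently is, on this reading, a competitive one; a couple of tall bars followed by a cliff is the top-heavy pattern.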

Sudoku’s Secret: Open Source Collaboration

Great piece in the NYT/IHT on the company behind Sudoku and similar games. Their approach — no trademarking, harnessing users to help develop and perfect games — all sounds very Open Source:


Nikoli’s secret, Kaji said, lay in a kind of democratization of puzzle invention. The company itself does not actually create many new puzzles — an American invented an earlier version of Sudoku, for example. Instead, Nikoli provides a forum for testing and perfecting them. About 50,000 readers of its main magazine submit ideas; the most promising are then printed by Nikoli to seek approval and feedback from other readers.

Mapping Your Tiddly Thoughts

I’m a big fan of TiddlyWiki, the personal wiki that runs in one file in your browser, and I’m very impressed by all the plug-ins and tweaks that the program’s users are introducing. I wrote about TiddlyWiki last year in a column (subscription only, sorry) but have also included some notes for the piece here in the blog, including on this page (scroll down).

Anyway, TiddlyWiki is a free form database, not unlike an outliner, but with lots of cool elements that make it much more. (Yes, tags, too.) Think of lots of individual notes that you make in your browser, which you can find via ordinary search or by tags you give to each note; you can also view a list of notes chronologically — i.e. in the order you created them — etc etc.

But if you’re a fan of mindmaps, or PersonalBrain, where your information can also be viewed graphically, you might feel a tad constrained. Not for much longer, if a Java programmer and writer called Dawn Ahukanna has her way. She’s just released a “hypergraph plug-in” which creates what she calls navigation graphs (I’d call them mindmaps but that’s me). As she says, “I’ve had quite a few revelations with it already, using it to map my existing TiddlyWikis.”


It’s an early prototype and not as pretty as it could be, but this kind of thing is in my mind the thin wedge of a revolution largely ignored by the “social” Web 2.0. Tools like TiddlyWiki, though presently a little rough around the edges and geeky, mark a very useful exploration of different interfaces for personal, portable data.

While I think of it, another interesting new TiddlyWiki modification is the MonkeyGTD (Getting Things Done, to the few people who haven’t been sucked in by the David Allen book and self-organizing philosophy), which tweaks the TiddlyWiki interface into little blocks.

How to Make More Use of the Vicar

In last week’s WSJ column (subscription only, I’m afraid) I wrote about how Bayesian Filters — derived from the theories of an 18th century vicar called Thomas Bayes and used to filter out spam — could also be used to sift through other kinds of data. Here’s a preliminary list of some of the uses I came across:

  • Deconstructing Sundance: how a bunch of guys at UnSpam Technologies successfully predicted the winners (or at least who would be among the winners) at this year’s festival using POPFile, the Bayesian filter of choice;
  • ShopZilla, a “leading shopping search engine”, uses POPFile “in collaboration with Kana to filter customer emails into different buckets so we can apply the appropriate quality of service and have the right people to answer to the emails. Fortunately, some of the buckets can receive satisfactory canned responses. The bottom line is that PopFile provides us with a way to send better customer responses while saving time and money.”
  • Indeed, even non-spam email can benefit from Bayes, filtering boring from non-boring email, say, or personal from work. Jon Udell experimented with this kind of thing a few years ago.
  • So can virus and malware filtering. Here’s a post on the work by Martin Overton in keeping out the bad stuff simply using a Bayesian Filter. Here’s Martin’s actual paper (PDF only). (Martin has commented that he actually has two blogs addressing his work in this field, here and here.)
  • John Graham-Cumming, author of POPFile, says he’s been approached by people who would like to use it in regulatory fields, in computational biology, dating websites (“training a filter for learning your preferences for your ideal wife,” as he puts it), and says he’s been considering feeding in articles from WSJ and The Economist in an attempt to find a way to predict weekly stock market prices. “If we do find it out,” he says, “we won’t tell you for a few years.” So he’s probably already doing it.

If you’re new to Bayes, I hope this doesn’t put you off. All you have to do is show it what to do and then leave it alone.  If you haven’t tried POPFile and you’re having spam issues, give it a try. It’s free, easy to install and will probably be the smartest bit of software on your computer.

I suppose the way I see it is that Bayesian filters don’t care about how words look, what language they’re in, or what they mean, or even if they are words. They look at how the words behave. So while the Unspam guys found out that the word “riveting” was much more likely to be used by a reviewer to describe a dud movie than a good one, the Bayesian Filter isn’t going to care that that seems somewhat contradictory. In real life we would have been fooled, because we know “riveting” is a good thing (unless it’s some weird wedgie-style torture involving jeans that I haven’t come across). Bayes doesn’t know that. It just knows that it has an unhealthy habit of cropping up in movies that bomb.

In a word, Bayesian Filters watch what words do, or what the email is using the words to do, rather than looking at the meaning of the words. We should be applying this to speeches of politicians, CEOs, PR types and see what comes out. Is there any way of measuring how successful a politician is going to be based on their early speeches? What about press releases? Any way of predicting the success of the products they tout?
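For the curious, the “show it what to do and then leave it alone” idea can be sketched in a few lines of Python. This is a toy naive Bayes classifier of my own, not POPFile’s actual code; the class name, the bucket names and the training snippets are all invented for illustration. It counts which tokens turn up in which bucket and then scores new text by those counts alone, without knowing what any word means.

```python
import math
from collections import Counter, defaultdict

class TinyBayes:
    """A toy naive Bayes text classifier; not POPFile's actual code."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> token counts
        self.doc_counts = Counter()              # bucket -> training examples

    def train(self, bucket, text):
        self.word_counts[bucket].update(text.lower().split())
        self.doc_counts[bucket] += 1

    def classify(self, text):
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        best_bucket, best_score = None, float("-inf")
        for bucket, counts in self.word_counts.items():
            # Log prior: how common this bucket was in training.
            score = math.log(self.doc_counts[bucket] / total_docs)
            total_words = sum(counts.values())
            for token in text.lower().split():
                # Add-one smoothing so unseen tokens don't zero the score.
                score += math.log((counts[token] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket

# Made-up training snippets, echoing the "riveting" example above.
bayes = TinyBayes()
bayes.train("dud", "a riveting tale of loss")
bayes.train("dud", "riveting and heartfelt")
bayes.train("hit", "funny sharp and surprising")
bayes.train("hit", "a sharp funny debut")
print(bayes.classify("riveting debut"))
```

With this invented training data, “riveting” has only ever appeared in the dud bucket, so it drags a new review that way however flattering the word sounds to a human.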


On News Visualization, Part II

This week’s Loose Wire column in WSJ is about visualizing news. Researching the column I had a chance to interview Craig Mod, the guy behind the excellent Buzztracker. Here’s an edited transcript of our chat:

Craig Mod: We have over 550,000 articles in the DB now, spanning back to Jan 1st 2004. “Buzztracker” went from 750 hits on google the day before the launch to now … 39,000+ which was surprising
Jeremy: when was the launch?
Craig Mod: About 3 weeks ago
Craig Mod: got slashdotted within 12 hours
Jeremy: could you walk me thro how you think people might use it, or derive benefit from it?
Craig Mod: sure. the project started about 2 years ago as a pure art project .. some of the original output was just the dots, with no map .. but the closer you looked, suddenly land masses began to emerge and you started forming associations
Craig Mod: I’ve obviously tried to make it a lot more pragmatic and functional now
Craig Mod: fundamentally it’s supposed to get people thinking about why these connections exist — why is Shanghai and Canada connected (during the SARS outbreaks)?
Craig Mod: How did the virus spread?
Craig Mod: What sorts of checks can you perform to prevent that sort of spreading?
Craig Mod: Is it possible?
Craig Mod: etc etc
Craig Mod: and from there begin to explore how these events are being covered
Jeremy: interesting.. is there a page for the SARS stuff in the archive?
Craig Mod: clicking on the locations obviously gives you a list of the articles they appear in
Craig Mod: unfortunately the SARS stuff happened when I was building the beta 2 years ago .. so it’s not in the current DB
Craig Mod: but the recent demonstrations in China have popped up a lot
Craig Mod: there’s a China-Tokyo-Jakarta triangle that appeared during the summits
Craig Mod: and you can click the “tomorrow / yesterday” buttons and see just how long these stories linger in the collective media conscience
Craig Mod: which is kind of fun

Jeremy: is there a danger the external links die off?
Craig Mod: There is .. and we originally had links to our internal cache but .. obvious copyright infringement issues scared us away from keeping the feature on new articles
Craig Mod: although, we still have all the data, of course
Jeremy: yes, the copyright thing is tricky…
Jeremy: how do you plan to deal with that?
Craig Mod: By not publicly offering the articles
Jeremy: right.
Craig Mod: And by keeping advertising off the site .. keeping it as pure an art project / public service project as possible

Jeremy: tell me a bit about you.
Craig Mod: I’m 24
Craig Mod: Born in Hartford, CT
Craig Mod: graduated from UPenn 2 years ago — degree in Digital Media Design (BSE in Comp. Sci with a very strong Fine arts component)
Craig Mod: Came to Tokyo 4 years ago for a year abroad, came back 1 1/2 years ago to run the Tokyo component of a small publishing company I helped start
Craig Mod: So a total of 2 1/2 years in Tokyo
Craig Mod: 2 years of which was spent at Waseda University in the intensive language program
Jeremy: how’s your japanese now?
Craig Mod: Extremely functional but I still can’t “relax” with a novel (although I just finished Murakami Ryu’s Almost Transparent Blue in Japanese)

Jeremy: so what are your plans for buzz?
Craig Mod: Right now I’m working on re-writing the drawing routines in a more powerful language .. the plan is to produce super-high-resolution prints for gallery display
Craig Mod: but being the only guy working on this + running sales / pr for CMP in Tokyo means it unfortunately takes a while to rewrite components
Jeremy: when you say hi-res prints, you mean of the maps?
Craig Mod: Correct
Craig Mod: There is a lot of information being lost in the low resolution of comp. screens
Craig Mod: especially Buzztracker connections (the thin, light lines get lost)

Jeremy: with thinking cap donned, where do you see this kind of thing going? do you think as people turn more and more to the net for news, these kinds of visual displays will catch on?
Craig Mod: I don’t think traditional news delivery will be subverted anytime soon, but I do think that as digitized information increases (digital photographs, journals, etc) people are going to need clean, efficient methods to engage with the data / find what they want
Craig Mod: Something like buzztracker is an attempt to both clean up the delivery of a tremendous amount of information while also bringing to the surface patterns otherwise invisible .. missing the forest for the trees, etc.
Craig Mod: but what I’m hoping … what I had in mind as I was designing and building the information structure of buzztracker was that things need to be as clear and simple as possible
Craig Mod: this isn’t meant to provide an incredibly exhaustive set of news mining features — it’s meant to be highly accessible by anyone
Craig Mod: I haven’t seen any of the other newsmap interfaces but perhaps unlike Marcos’ work or, hopefully, mine, their information architecture wasn’t as transparent
Jeremy: transparent meaning?
Craig Mod: meaning, they inundated the user with superfluous interface elements, cluttered typography, illogical hierarchies .. I don’t want anyone using buzztracker to be concerned with how they engage the software/site .. the focus should, I hope, be engaging the data, the news
Craig Mod: (although I don’t know if they did that since I never saw any of them 🙂 )

Craig Mod: on the tech side of things, there was a point where I was debating between flash and pure html .. in the end, I think going with html made sense for those exact reasons — quick loading, standards based, etc
Craig Mod: There’s also, I suppose (to a small degree) a sense of bias being eliminated in these sorts of ways of navigating the news ..
Jeremy: very true.
Craig Mod: But almost unavoidable .. but those biases are also interesting ..
Craig Mod: buzztracker being completely rooted in anglophone news sources
Craig Mod: you start to see things like .. Africa doesn’t exist in the mind of English speaking sources .. most all news takes place on a thin line just above the center of the map

Craig Mod: Animations are also coming .. alongside the high-res output ..
Jeremy: how would the animations work? evolution of a story over a period of time?
Craig Mod: you could follow certain keywords — allowing you to follow certain stories .. You could also map the news on an hourly basis — interpolating the rise and fall of events smoothly ..
Craig Mod: the thing with the animations is that, I believe, by watching repeated time lapses you’ll start to see “news rhythms” erupt ..
Craig Mod: which begs the question .. if you map these animations to sound, can you discern other patterns that you were missing visually?

Jeremy: what about some of the criticisms that you’re leaning towards datelines, and so stuff like the tsunami wasn’t represented properly?
Craig Mod: There are some events (like the tsunami) which appear after the day they happened .. one of the best and worst parts of Buzztracker is that it’s fully automated so if something doesn’t appear when it “should” that’s representative of the media in some ways
Craig Mod: The Spain explosions last year are incredibly represented
Craig Mod: I think some — such as false results, or skewed distribution in the wrong ways — could be corrected by simple human intervention .. Looking for, spotting these “errors” in calculation, and adding rules to fix them
Craig Mod: but at the same time, that takes away from a bit of the purity of the automation of Buzztracker .. it’s always about balance I suppose

Thanks, Craig.

How To Hoover Up Addresses

Maybe it’s just the summer heat but I get the feeling that, finally, people are focusing on software tools that really make working on a computer easier. Sure this has been the case for a while, but these companies seem to actually stick around long enough to make some money. So they have to be doing something right.

Take saving addresses, for example. It’s a simple concept: See a guy’s name and address in an email, on a website or in a document you’d like to save, and what do you have to do? Fiddly copying the text, and then, line by line, pasting it into Outlook or whatever. Yuck. It’s faster to go round to the guy’s house with your laptop, knock on his door and ask him to type in the details himself.

A few years back there was a great little company called Cognitive Root which had a program called Syncplicity, which tried to figure out from any text you copied what was the name, the address, the phone number etc, and copy it all into the right fields in your Palm Desktop. I raved about the product back in January 2001, which seems to have been enough to ensure it was consigned to the bin, since I can find no trace of the company or the product on any recent website. Sorry about that, guys.
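To give a feel for what “figuring out” the fields involves, here is a minimal sketch in Python. It is my own crude illustration, not how Syncplicity or any of the products mentioned here actually work; the patterns, the function name and the sample contact are all invented, and real addresses are far messier than this.

```python
import re

# Deliberately simple patterns; real-world contact details are messier.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{6,}\d")

def grab_contact(block):
    """Guess name, email and phone from a pasted signature-style block."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    email = EMAIL_RE.search(block)
    phone = PHONE_RE.search(block)
    return {
        "name": lines[0] if lines else None,  # crude: assume the name comes first
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }

# A made-up contact block of the sort you might copy from an email.
sample = """Jane Example
12 Made-Up Lane, Springfield
Tel: +1 555 010 9999
jane@example.com"""
print(grab_contact(sample))
```

Even this toy version shows why the job is fiddly: the email and phone number have a recognizable shape, but the name and street address have to be guessed from where they sit.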

Still, don’t despair: other companies have since taken up the banner. And they look like they’ll be around for a while. There’s Anagram, which does pretty much the same thing for Palm and Outlook, and, more importantly, has on its website a photo of a left-handed businesswoman not using a mousepad, chewing her glasses and staring wistfully into the middle distance having saved herself oodles of time using the product.

Then there’s AddressGrabber, which does something similar but also works with ACT!, GoldMine and stuff like that. I’ve fiddled with both AddressGrabber and Anagram and for my needs the latter ($20) works fine. But if you’re a serious address grabbing kind of dude, maybe you want to splash out ($70 to $250) for the former. Both work with