Tag Archives: keyword search

Media: Reducing Story Production Waste

In trying to change news to match the new realities of the Interwebs, media professionals are still somewhat stuck in old ways of doing things. One is to fail to address the massive waste in news production–or at least parts of it.

So what potential waste is there? Well, these are the obvious ones:

  • Gathering: Reporters/trips/stories per trip/matching other outlets
  • Editing: The number of people who look at a story before it is published/time a story takes to work through the system

I’m more interested, however, in the amount of waste from material generated. Think of it like this:

Inputs:

  • Story idea
  • Logistics (travel/communications/reporting tools)
  • Interviews, multimedia and other material generated

Outputs:

  • Story
  • Photo
  • ?Video

Wastage:

  • All content not used in story (some may be reused, eg photos, sidebars but rarely)
  • All content used that’s not reused/repurposed.

This seems to me to be extremely wasteful in an industry in so much pain. Any other industry wouldn’t just look to pare back on factors of production but to also minimize the waste generated.

Any journalist will know just how much we’re talking about. Say you interview five people for a story. Even a stock market report is going to involve five interviews of at least five minutes. At about 150 words a minute that’s nearly 4,000 words. The stock market report itself is going to be about 500 words, maybe 600. That leaves more than 3,000 words–say 2,500, allowing for the reporter’s questions and some backchat–gone to waste. For 500 words produced we had to throw out 2,500.
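For anyone who wants to check the back-of-envelope sums, here is a minimal sketch in Python. The figures are the ones in the paragraph above; the variable names are just illustrative, and the result is only as good as those rough assumptions.

```python
# Rough figures from the stock market report example; names are illustrative.
interviews = 5
minutes_per_interview = 5
words_per_minute = 150
story_words = 500

# Everything said across all the interviews.
gathered_words = interviews * minutes_per_interview * words_per_minute

# What never makes it into the story (before discounting the
# reporter's own questions and backchat).
unused_words = gathered_words - story_words

# Unused words for every word that is published.
waste_ratio = unused_words / story_words

print(f"Words gathered: {gathered_words}")
print(f"Words not used: {unused_words}")
print(f"Unused words per published word: {waste_ratio:.1f}")
```

Knock off a few hundred words for questions and backchat and you land somewhere near the 2,500 estimate.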

Yes, I know it’s not a very scientific way of doing things, but you get my point. Most journalists only write down the quotes they need for the story, and many will delete the notes they’ve taken if they’re typing them on the screen in the same document they’re writing the story on. So all that material is wasted.

A good reporter will keep the good stuff, even if it’s not used in the story, and will be able to find it again. But I don’t know of any editorial system that helps them do that–say, by tagging or indexing the material–let alone one that makes it available to other reporters on the same beat.

This is where I think media needs to change most. It needs to assume that all material gathered by journalists, through interviews, research, even browsing, is potentially content. It needs to help journalists organise this material for research but, more importantly, to generate new content from it.

Take this little nugget, for example, in a New York Times story, Nokia Unveils a New Smartphone, but Not a Product of Its Microsoft Deal – NYTimes.com: The reporter writes of the interviewee, Nokia’s new chief executive Stephen Elop: “During the interview, he used the words ‘innovate’ or ‘innovation’ 24 times.”

I really like that. It really captures something that quotes alone don’t. We would call it “interview metadata”–information about the interview that is not actual quotes or color but significant, nonetheless.

Whether the journalist decided to count them early on during the interview, or took such good notes that a keyword search or manual count afterwards was enough, or transcribed the whole thing in his hotel room later, I don’t know. (A quibble: I would have put the length of the interview in that sentence, rather than an earlier one, because it lends the data some context. Or one could include the total number of words in the interview, or compare it with another word, such as “tradition” or something. Even better, create a word cloud out of the whole interview.) (Update: here’s another good NYT use of metadata, this time the frequency of words in graduation speeches: Words Used in 40 Commencement Speeches – Class of 2011 – Interactive Feature – NYTimes.com)
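If a transcript exists, this kind of count takes only a few lines. What follows is a minimal sketch in Python, not a description of how the Times reporter did it: the function names and the sample text are made up for illustration, and the word matching is deliberately crude (it treats any word starting with a given stem as a match).

```python
import re
from collections import Counter


def word_counts(transcript: str) -> Counter:
    """Count lower-cased words in a transcript."""
    return Counter(re.findall(r"[a-z']+", transcript.lower()))


def count_stem(counts: Counter, stem: str) -> int:
    """Total occurrences of every word that starts with `stem`."""
    return sum(n for word, n in counts.items() if word.startswith(stem))


# Made-up sample text, not a real interview.
sample = (
    "We will innovate faster. Innovation is what customers expect, "
    "and innovation is how we respect tradition."
)

counts = word_counts(sample)
innovat_total = count_stem(counts, "innovat")      # innovate, innovation
tradition_total = count_stem(counts, "tradition")  # comparison word
total_words = sum(counts.values())                 # context for the counts

print(f"innovate/innovation: {innovat_total} of {total_words} words")
print(f"tradition: {tradition_total} of {total_words} words")
```

The same `counts` object could feed a word cloud tool, since most of them only need a word-to-frequency mapping.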

The point? Elop is an executive, and he has a message. He wants to convey the message, and so he is using carefully chosen words to not only ensure they’re in any quote that’s used, but also to subliminally convey to the journalist the angle he hopes the journalist will adopt. By taking the interview metadata and presenting it separately, that objective, and strategy, will be well illustrated to the reader.

And, of course, you’ve reduced the story production wastage, or SPW, significantly.

Media can help this process by developing tools and offering services to maximise the usefulness of material gathered during research and interviews, and to reduce the time a journalist spends on marshalling this material.

Suggestions?

  • Transcription services, where journalists can send a recording and get the material back within the hour (or even as the interview is conducted, if the technology is available).
  • Push some of the content production to the journalist: let them experiment with wordclouds and other data visualization tools, not only to create end product but to explore the metadata of what they’ve produced.
  • Explore and provide content research and gathering tools (such as Evernote) to journalists so they don’t have to mess around too much to create stuff for the story they’re working on, drawing on existing material they’ve gathered from previous research and interviews and, hopefully, from that of colleagues.

A lot of my time training journalists these days is in these kinds of tools, and I’m always surprised at how little they are made use of. That needs to change if media is to find a way to make more use of the data it gathers in the process of creating stories.

The Phantom Threats We Face

This is a copy of my weekly Loose Wire Service column.

By Jeremy Wagstaff

We fear what we don’t know, even if it’s a guy in Shenzhen trying to make an honest living developing software that changes the background color of your mobile phone display.

Here’s what happened. I’ll save the lessons for the end of this piece.

A guy who prefers to go by the name Jackeey found a niche for himself developing programs, usually called apps, for the Android cellphone operating system.

They were wallpaper applications—basically changing the background to the display.

That was until an online news site, VentureBeat, reported on July 28 that a security company, Lookout, had told a conference of security geeks that some downloadable applications for phones running the Android operating system would “collect a user’s browsing history, their text messages, the phone’s SIM card number and subscriber identification, voicemail phone number password” and send all this data to a website owned by someone in Shenzhen, China.

Yikes! Someone in China is listening to our conversations! Figuring out what we’re doing on our phone! Sending all this info to Shenzhen! Sound the alarum!

Word did indeed spread quickly. About 800 outlets covered the story, including mainstream publications like the Daily Telegraph and Fortune magazine: “Is your smart phone spying on you?” asked one TV station’s website.

Scary stuff.

Only it isn’t true. Firstly, VentureBeat had the story wrong: The applications in question only transmitted a portion of this data. No browsing history was transmitted, no text messages, no voicemail password.

VentureBeat corrected the story—sort of; the incorrect bits are crossed out, but there’s no big CORRECTION message across the top of the story—but the damage was done. Google suspended Jackeey’s apps. Everyone considered Jackeey evil and confirmed suspicions that a) Android was flakey on security and b) stuff from China was dodgy.

All kind of sad. Especially when you find that actually Jackeey himself is not exactly unreachable. A few keyword searches and his email address appears and, voila! he’s around to answer your questions. Very keen to, in fact, given the blogosphere has just ruined his life.

Here’s what he told me: He needed the user’s phone number and subscriber ID because people complained that when they change their phone they lose all their settings.

That’s it. That’s the only stuff that’s saved.

Needless to say he is somewhat miffed that no one tried to contact him before making the report public; nor had most of the bloggers and journalists who dissed his applications.

“I am just an Android developer,” he said. “I love wallpapers and I use different wallpaper every day. All I want is to make the greatest Android apps.”

Now of course he could be lying through his teeth, but I see no evidence in the Lookout report or anything that has appeared subsequently to suggest the developer has done anything underhand. (The developer has posted some screenshots of his apps’ download pages which show that they do not request permission to access text message content or browsing history.)

In fact, he seemed to be doing a pretty good job: His apps had been downloaded several million times. He declined to give his name, but acknowledged that he was behind both apps provided under the name Jackeey, and under the name iceskysl@1sters.

The story sort of ends happily. After investigating them Google has reinstated the apps to its app store and will issue a statement sometime soon. It told Jackeey in an email that “Our investigation has concluded that there’s no obvious malicious code in your apps, though the implementation accesses data that it doesn’t need to.”

VentureBeat hasn’t written an apology but they have acknowledged that: “The controversy grew in part because we incorrectly reported in our initial post that the app also sent your text messages and browser history to the website.”

For his part Jackeey is redesigning his apps to take into account Google’s suggestions. He points out that to do so will require him to have users set up an account and enter a password, which some users may be reluctant to do. And the Google suggestion is not entirely secure either.

Obviously this is all very unsatisfactory, in several ways.

Firstly, the journalism was a tad sloppy. No attempt was made to contact the developer of the app for comment before publishing—how would you feel if it was your livelihood on the line?—and the correction was no real correction at all.

Secondly, the internet doesn’t have a way to propagate corrections, so all the other websites that happily picked up the story didn’t update theirs to reflect the correction.

Thirdly, Google maybe should have contacted Jackeey before suspending the apps. It would have been kinder, and, given they’ve not found anything suspicious, the right thing to do.

Fourthly, us. We don’t come out of this well. We are somehow more ready to believe a story that includes a) security issues (which we don’t understand well) and b) China, where we’re perhaps used to hearing stories that fit a certain formula. Suspicious?

And lastly, perhaps we should look a little harder at the source of these reports. We seem very quick to attribute suspicious behavior to someone we don’t know much about, in some scary far-off place, but less quick to do so with those we do know, closer to home: Lookout’s main business, after all, is prominently displayed on their homepage: an application to, in its words, “protect yourself from mobile viruses and malware. Stop hackers in their tracks.”

So spare a thought for Jackeey. If you do a keyword search for him, the first hit is the story “’Suspicious’ Android wallpaper app nabs user data”, and links to 863 related articles. Below, a week after the hoo-ha and after Google has sort of put things right, are headlines like: “Jackeey Wallpaper for Android steals your personal info”, “Your Rotten App, Jackeey Wallpaper” and “Jackeey steeling [sic] info on Android devices”.

In other words, anyone who checks out Jackeey’s wares on Google will find they don’t, well, check out.

I got back in touch with Jackeey to see how he’s holding up, a week after the storm broke. “I’m in some pain,” he says, “because mass negative press said that I steal users’ text messages, contacts and even passwords.” People have removed his applications from their phones, and people have been blasting him by email and instant messaging, calling him “thief”, “evil person” and other epithets.

“I am afraid that it will destroy my reputation and affect my livelihood forever,” he says.

I’m not surprised. We rely on folk like Jackeey to make apps for our phones, so we should treat them a little better.

The Google AdWord Vultures

It’s interesting to watch how Google’s ‘Sponsored Links’ capture aspects of the business process, in particular the plundering of customers from a company in trouble. Take CardSystems, for example, facing a class action suit, the loss of its main business and other indignities as a result of the theft of large amounts of credit card data from its vaults. Do a keyword search for ‘cardsystems’ on Google and you get a stream of ads:

[Image: screenshot of the sponsored links for ‘cardsystems’]

Let’s ignore the further indignity of two of the ads not being able to get the company’s name right. Where does Google draw the line on this kind of behaviour? Do they allow companies to place ads willy-nilly suggesting problems with their rivals and offering a solution? ‘Is Microsoft going bust? Switch to Apple while you can’ type thing.

Plaxo Etiquette: Moral High Ground Or Cheap Stunt?

Plaxo, the online contacts exchange that got some good, and bad, press two years back, is trying to brush up its members’ manners with some Plaxo Etiquette:

Each and every new technology has a learning curve as we figure out how to use it, and use it well. Remember when you’d frequently see people talking on their cell phone in a restaurant, or in the movie theater? And how many of those forwarded blonde or lawyer jokes were really funny?

Plaxo is committed to helping you become a better member of the digital world. Below you’ll find a few tips and suggestions on how to make the best use of Plaxo.

Not bad stuff, although some cynics might say it’s a few years too late. After all, one of the problems that its critics cited was the ease with which users could spam everyone in their Outlook address book, not considered a particularly polite thing to do in any community.

I’m not going to be cheap. It’s good that Plaxo is doing this, late or not. I did, however, feel the PR pitch that accompanied the announcement was a bit overly precious:

Plaxo, provider of an Internet service for updating and accessing contact information, is committed to helping its users be better members of the digital world. The company recently introduced Plaxo Etiquette (http://www.plaxo.com/privacy/manners) to guide members in the proper way to use the technology from the get-go. We challenge other providers of prevalent technologies to do the same.

Cynics, once again, might say that Plaxo was part of the address book spamming lapse in etiquette to start with two years ago, so suggesting it’s suddenly ‘committed to helping its users be better members of the digital world’ and that it feels it occupies such moral high ground it can ‘challenge other providers of prevalent technologies to do the same’ might be considered somewhat rich. I wouldn’t say that, of course; nor would I suggest this is a self-serving piece of publicity to raise the profile of a service that hasn’t been heard of, at least in a positive light, very much in recent months. (A keyword search for Plaxo on Google News throws up three references to the dangers associated with Plaxo and phishing, one to Plaxo and privacy and nine neutral references in passing.)

Google Blurb Search

Here’s another whacky trick that Google have quietly introduced, adding to the impression they are fast cementing their role as one-stop portal: Book searching. According to SearchEngineWatch (via the excellent TechDirt), Google Print is an experimental service that “indexes excerpts of popular books, blending the content from these works into regular Google search results”.

These excerpts are usually the blurb, for now. True to its apparent intention to make itself indispensable before it starts collecting cash, Google says book sellers pay nothing for links from these search results, and it is not benefiting if you make a purchase from one of these retailers. It’s likely that Google will eventually do what Amazon does already, namely offer full text searches of books, although these kinds of searches will have to be crippled in some way to prevent users from downloading whole books online.

Can’t remember where I read this, but of course all this has wonderful side-effects for those of us looking for something in a book we already own: So long as Amazon (or later, Google) have the book scanned, it would be quicker to do a keyword search there than to check the index, or leaf through the chapter list. Voila.

Column: the all in one gadget

Loose Wire — All-in-One Gadgets: Compact But No Cure-All: The Sony Ericsson P800 is an Internet-enabled PC, hand-phone, digital organizer and camera rolled into one; But some things are better kept separate

 
By Jeremy Wagstaff
 
from the 10 April 2003 edition of the Far Eastern Economic Review, (c) 2003, Dow Jones & Company, Inc.
If you’re anything like me, you hope the next gadget you buy will solve all the problems with your existing one — phone, palm-held device, lawnmower — only to find that in most cases, you’re forced to settle for something that may be better, but not necessarily in the way you imagined, or hoped. Call it Feature Disconnect.

Take my new hand-phone, for example. I needed something that didn’t keep switching off mid-call, where the keys didn’t stick, and which had some extra features such as a decent calendar, contacts list and whatnot. After much deliberation I settled for the Nokia 7650, a beast that combines camera, digital assistant and phone.

The Nokia 7650

Two weeks on, I like half the features and am somewhat disappointed over the other half, but in most cases the things I like about it are not the reasons I bought it. I’ve had to abandon synchronizing my data with Microsoft Outlook because the Nokia slows to a crawl with all my contacts aboard, while the short messaging (or SMS) feature, while comprehensive in terms of storing and displaying messages, is actually more fiddly than its predecessor. On the other hand, I’m addicted to taking pictures of people and linking the picture to their contact details, so on the rare occasions they call, their visage appears on the screen. Completely pointless, I know, and certainly not why I bought the thing, but it makes me happy.

I suspect similar problems with Sony Ericsson’s P800 (about $650). As I’m sure you know, Sony Ericsson is a trial marriage of Japanese electronics-giant Sony and Ericsson, the Swedish hand-phone manufacturer. They’ve been dabbling for a while in handsets and with their most recent model appear to have hit something near the jackpot. It looks a lot like a normal phone, but flip open the keypad and you get a screen the size of Hungary, an interface to die for and an almost fully fledged digital organizer. It’s a marvel of engineering, delightful to hold and look at, but sadly it’s still vulnerable to Feature Disconnect.

The Sony Ericsson P800

It’s like this. The P800 is out to replace your hand-phone and your personal digital assistant. It has handwriting recognition and will synchronize with Outlook and Lotus Notes; you can write and read e-mail and surf the Internet on it. Flip the keypad back into place and you have a normal phone that’s no larger than most existing hand-phones. Oh, and it takes pictures. For many folk it’s what they’ve been waiting for: a convergent device that means they can leave their Palm or PocketPC at home, as well as the digital camera. Lighter pockets all round. Out of the 100-or-so user reviews I read, only a handful said bad things about the P800.

My experience was different: While the handwriting recognition (scrawling letters on the screen which are then interpreted by the phone into digital text) is no better or worse than its peers, it’s one thing to tap away in your spare time and another to try to enter notes or phone numbers while you’re on the road taking a call from the boss. Errors creep in and frustration mounts. The software aboard the P800 is a departure — it’s neither Palm- nor Microsoft-related, instead drawing on the Symbian platform — and is nicely designed, but has its quirks. There are some treats — tap on a phone number and a menu appears, allowing you to phone, SMS or add the number to your contacts directly.

But there are also some oddities — I could not find, using a keyword search, any of the folk I had added to the contacts directory, and was horrified to discover that the phone does not support the “predictive text” SMS function used by everyone and his dog (predictive text anticipates what word you’re trying to tap on the keypad, allowing you to press keypads once to form words instead of several times). To not include this is, in my view, like selling a car without a steering wheel. My verdict: The P800 is a very impressive device but it’s too limited to replace my Palm — making it just a very expensive phone, albeit a full-featured one.

The problem as I see it is this: As all these gadgets get better, we demand more out of them. Then we want all those features in one device. Seeing the P800 — the closest anyone’s come to an all-in-one gadget — I can’t help wondering whether we’d be better off keeping some things separate. With a keyboard and Bluetooth, today’s Palm or PocketPC can, under certain conditions, do a very good job of mimicking a laptop, something that wasn’t really intended when they first appeared in the mid 1990s. Hand-phones now are messaging devices — transmitting not just voice, but messages, pictures and whatnot, storing music and taking photos — something that certainly wasn’t envisaged with the launch of their brick-sized ancestors in the early 1980s. All these features, in my view, make it less likely — and indeed, less preferable — to have an all-in-one device. So long as they communicate well with one another, I think manufacturers should focus on combinations of devices, allowing us users to mix and match according to our whim, however quirky. That way we might get what we want and not lose the features we like every time we upgrade.

Now keep still while I take a picture of you in case you call.

Loose Wire — Click Here

Loose Wire — Click Here to Read Summary

By Jeremy Wagstaff
from the 21 February 2002 edition of the Far Eastern Economic Review, (c) 2003, Dow Jones & Company, Inc.

If you work for a corporation, institution or any set-up which considers a vision statement to be worthy of its resources, chances are you’ll be required to file regular reports on your comings, goings and sitting-still-and-doing-nothing sessions. And the chances are that no one will ever read these documents top to bottom. In fact, chances are that no one will read them at all. Heck, you probably don’t even read them. But they have to be done, or someone will notice and fire you.

But where does all this stuff go? In the old days we’d say with confidence, “landfill,” but in the digital age, no such luck. It all gets stored on some hard disk somewhere, no easier to find than its hard-copy forebears. Luckily, no one shows a pressing urge to want to find it, but what happens if they do? The sad truth is that all these zillions of e-mails, Word documents, Acrobat files, PowerPoint presentations and spreadsheets we produce don’t build us a supply of wisdom; they just get lost. In the lingo of the information game, it’s called unstructured data and unlike its rich cousin, structured data, which gets sifted by sophisticated programs wearing tin hats called data miners, it sits idle and largely inaccessible, unnoticed.

But there are signs that software developers are taking a closer look at this forgotten corner of the information superhighway and figuring out ways of imposing order on this unruly mass.


Logik, from Coredge Software Inc. (www.coredge.com), will take a document, or a whole directory, or hard drive, and sift — or parse — the contents, extracting the most important phrases, or themes as Logik calls them. Logik also generates a summary of the document. It does all this remarkably well, giving you a sense of the document in question along with a list of themes, from names and concepts to phrases like “vision statement” — all in less time than it takes to say: “What exactly is a vision statement and why do we need one?”

This process is great for handling large numbers of documents that you might need to retrieve at some point, but may not have the time to read all the way through. A keyword search for a phrase or theme will throw up a list of files that include that phrase. And if you select one of those documents you get a summary. Logik will also translate documents between major European languages and Japanese. I was impressed by the intuitive, uncluttered feel of this software.

But while automatic summarizing is a great concept which has come a long way in recent years, it’s by no means the main function users want to see in programs that organize their documents for them. To me the most important part of the process is a simple one: Can I find the document I’m looking for quickly, and can I view it immediately? While users can view the original document in Logik, it opens in a new window, making it less seamless than the rest of the program’s functions.

Document Search

For this kind of feature — finding quickly and viewing — you need Enfish Corp’s (www.enfish.com) Find, which indexes your hard drive and lets you find anything from a single word to a complex Boolean string quickly. Another program that offers a similar feature is 80-20 Software’s Retriever (www.80-20.com/products/retriever/) though at present it doesn’t let you preview the whole document (future versions will).
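To give a feel for what “indexing” and a “Boolean string” mean in practice, here is a minimal sketch in Python. It is not how Enfish Find or Retriever work internally (I have no idea what goes on under their hoods); the function names and sample documents are made up, and a real tool would read files from disk rather than a hand-typed dictionary.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased word to the names of documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search_all(index: dict[str, set[str]], *terms: str) -> set[str]:
    """Boolean AND: documents that contain every term."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()


def search_any(index: dict[str, set[str]], *terms: str) -> set[str]:
    """Boolean OR: documents that contain at least one term."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


# Hypothetical documents standing in for files on a hard drive.
docs = {
    "report.txt": "Quarterly vision statement and sales report",
    "memo.txt": "Sales memo for the regional team",
    "notes.txt": "Interview notes on the vision statement",
}

index = build_index(docs)
print(sorted(search_all(index, "vision", "statement")))
print(sorted(search_all(index, "sales", "vision")))
print(sorted(search_any(index, "memo", "interview")))
```

The index is built once, so each search is a quick lookup rather than a trawl through every file, which is roughly why indexed search feels instant.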

For software that does straight summarizing, check out Copernic Technologies’ Summarizer (www.copernic.com/products/summarizer), which does a great job of abridging anything on the fly, whether it’s a Web page, a Microsoft document or next door’s cat.

These programs make digging up any document you mislaid, or keeping track of colleagues’ documents, a whole lot easier. None of them comes cheap, however: Retriever is $50, Summarizer is $60 and Enfish Find is $70, while the standard version of Logik sells for $150. But to me that’s a good thing: These companies are aiming at a more discerning market with deeper pockets, in fact at exactly the sort of guys who spend their days writing reports that their bosses will never read.