The Big Boys’ Mea Culpas

I find it interesting that companies can get things so wrong. News Corp today sold off Myspace for a fraction of what it paid for it, effectively admitting it didn’t get social media.

Microsoft famously came late to the Internet, and has been late to more or less every party since. It has now come out with Office 365, an awful name for a product that is basically an admission that Google Docs is good enough for most people, and that Microsoft Office is largely toast (an incorrect assumption, I reckon; I still can’t do without it).

Then we have Google. Google has made a surprising number of missteps: Buzz, and Wave (which it was as wrong to dump abruptly as it was to hype, in my view). Now, with the launch of Google+, it is also acknowledging that it got the Web wrong: instead of seeing it as a network, it saw it as a library. This from AllThingsD’s Liz Gannes, who asked Vic Gundotra why he and Bradley Horowitz had spent so much of the launch self-flagellating over Google’s lateness to the social media dance:

Google Opens Up About Social Ambitions on Google+ Launch Day – Liz Gannes – Social – AllThingsD: “Gundotra: It’s just sincere. I don’t think it’s anything more than that. We do have a mission that we’ve been working on for a long time: organizing the world’s information and making it universally accessible and available. And when you look at the web today it’s obvious it’s not just about pages, it’s about people. It’s not just about information, it’s about what individuals are doing. So I think we have to do that in a coherent way. We think there’s just tremendous room to do great stuff.”

Well put: Google really didn’t get the web. And probably still doesn’t; one might argue that the algorithms it uses to rank pages have to be constantly updated because they don’t really reflect the dynamic nature of most web pages these days. I’m not sure exactly what I mean by that, so I’ll leave it for now.

Finally, what of Apple? Where have they gone wrong? MobileMe is a pretty small misstep. Quibbles with OS X are relatively small: I get the sense that a lot of the things wrong with the OS aren’t there because Apple keeps tweaking things (the usual complaint from Windows users) but because of a stubbornness about not changing things: a weak file explorer (Finder), an inability to resize windows except from one corner, a confusing division of function between dock icons, menu bar icons, menu bar menus, in-window menus and so on.

But apart from those gripes with the Mac OS, you gotta hand it to Apple. No big mea culpas, at least in the past decade.

Media: Reducing Story Production Waste

In trying to change news to match the new realities of the Interwebs, media professionals are still somewhat stuck in old ways of doing things. One is failing to address the massive waste in news production, or at least parts of it.

So what potential waste is there? Well, these are the obvious ones:

  • Gathering: Reporters/trips/stories per trip/matching other outlets
  • Editing: The number of people who look at a story before it is published/time a story takes to work through the system

I’m more interested, however, in the amount of waste from material generated. Think of it like this:

Inputs:

  • Story idea
  • Logistics (travel/communications/reporting tools)
  • Interviews, multimedia and other material generated

Outputs:

  • Story
  • Photo
  • Video (maybe)

Wastage:

  • All content not used in the story (some of it, e.g. photos or sidebars, may be reused, but rarely)
  • All content used that’s not reused/repurposed.

This seems to me extremely wasteful for an industry in so much pain. Any other industry would look not just to pare back its factors of production but also to minimize the waste it generates.

Any journalist will know just how much material we’re talking about. Say you interview five people for a story. Even a stock market report is going to involve five interviews of at least five minutes each. At about 150 words a minute that’s nearly 4,000 words. The stock market report itself is going to be about 500 words, maybe 600. Allowing for the reporter’s questions and some backchat, that still leaves some 3,000 words of usable material: for every 500 words published, we threw out 2,500.
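
A minimal sketch of that back-of-envelope arithmetic; every number in it is an illustrative assumption, not a measurement:

```python
# Back-of-envelope story production wastage (SPW) calculation.
interviews = 5            # people interviewed for one story
minutes_each = 5          # minimum length of each interview
words_per_minute = 150    # rough speaking rate

transcribed = interviews * minutes_each * words_per_minute  # ~3,750 words
usable = int(transcribed * 0.8)  # assume ~20% is questions and backchat
published = 500                  # a typical stock market report

wasted = usable - published
print(f"Transcribed {transcribed}, usable {usable}, published {published}")
print(f"Wasted: {wasted} words, i.e. {wasted / published:.0f} words discarded per word published")
```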

Yes, I know it’s not a very scientific way of doing things, but you get my point. Most journalists only write down the quotes they need for the story, and many will delete the notes they’ve taken if they’re typing them in the same document they’re writing the story in. So all that material is wasted.

A good reporter will keep the good stuff, even if it’s not used in the story, and will be able to find it again. But I don’t know of any editorial system that helps them do that–say, by tagging or indexing the material–let alone makes it available to other reporters on the same beat.

This is where I think media needs to change most. It needs to assume that all material gathered by journalists–through interviews, research, even browsing–is potentially content. And it needs to help journalists organise this material, not just for research but, more importantly, as raw material for new content.
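
What might that look like? A minimal sketch, with entirely hypothetical names and fields, of interview notes tagged and indexed so a beat can share them:

```python
# A toy "editorial notes" store: interview material tagged so it can be
# found again, by the reporter or by colleagues on the same beat.
import sqlite3

conn = sqlite3.connect("notes.db")
conn.execute("""CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    reporter TEXT, source TEXT, beat TEXT,
    quote TEXT, tags TEXT, taken_on TEXT)""")

def save_note(reporter, source, beat, quote, tags, taken_on):
    conn.execute(
        "INSERT INTO notes (reporter, source, beat, quote, tags, taken_on) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (reporter, source, beat, quote, ",".join(tags), taken_on))
    conn.commit()

def find_notes(tag):
    # Unused quotes stop being waste once anyone on the beat can pull
    # them back out by tag.
    cur = conn.execute("SELECT source, quote FROM notes WHERE tags LIKE ?",
                       ("%" + tag + "%",))
    return cur.fetchall()

save_note("JW", "analyst at BankX", "markets",
          "Volumes are the thinnest I have seen in a decade.",
          ["stocks", "volumes"], "2011-06-30")
print(find_notes("volumes"))
```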

Take this little nugget, for example, in a New York Times story, Nokia Unveils a New Smartphone, but Not a Product of Its Microsoft Deal – NYTimes.com. The reporter writes of the interviewee, Nokia’s new chief executive Stephen Elop: “During the interview, he used the words ‘innovate’ or ‘innovation’ 24 times.”

I really like that. It captures something that quotes alone don’t. We would call it “interview metadata”: information about the interview that is not actual quotes or color, but is significant nonetheless.

Whether the journalist decided to count them early on during the interview, or took such good notes that a keyword search or manual count afterwards was enough, or transcribed the whole thing in his hotel room later, I don’t know. (A quibble: I would have put the length of the interview in that sentence, rather than an earlier one, because it lends the data some context. Or one could include the total number of words in the interview, or compare the count with that of another word, such as “tradition”. Better still, create a word cloud of the whole interview.) (Update: here’s another good NYT use of metadata, this time the frequency of words in graduation speeches: Words Used in 40 Commencement Speeches – Class of 2011 – Interactive Feature – NYTimes.com)
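
Extracting that kind of interview metadata from a transcript is a few lines of work. A minimal sketch; the transcript filename and keyword list are assumptions for illustration:

```python
# Count how often an interviewee leans on particular words.
import re
from collections import Counter

with open("elop_interview.txt") as f:
    words = re.findall(r"[a-z']+", f.read().lower())

counts = Counter(words)
for keyword in ("innovate", "innovation", "tradition"):
    print(f"{keyword}: {counts[keyword]} times out of {len(words)} words")
```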

The point? Elop is an executive, and he has a message. He wants to convey that message, so he uses carefully chosen words, not only to ensure they appear in any quote that’s used but also to subliminally suggest to the journalist the angle he hopes the journalist will adopt. Presenting the interview metadata separately illustrates that objective, and that strategy, to the reader.

And, of course, you’ve reduced the story production wastage, or SPW, significantly.

Media can help this process by developing tools and offering services to maximise the usefulness of material gathered during research and interviews, and to reduce the time a journalist spends on marshalling this material.

Suggestions?

  • Transcription services, where journalists can send a recording and get the material back within the hour (or even as the interview is conducted, if the technology is available).
  • Push some of the content production to the journalist: let them experiment with word clouds and other data visualization tools, not only to create end product but to explore the metadata of what they’ve produced (see the sketch after this list).
  • Explore and provide content research and gathering tools (such as Evernote) so journalists can, without too much messing around, create new material drawing on what they’ve already gathered: for the story they’re working on, from previous research and interviews, and, hopefully, from that of colleagues.
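
On the word cloud point, the open-source wordcloud package (pip install wordcloud) makes this close to trivial; a minimal sketch, reusing the hypothetical transcript from above:

```python
# Turn an interview transcript into a word cloud image.
from wordcloud import WordCloud

with open("elop_interview.txt") as f:
    text = f.read()

cloud = WordCloud(width=800, height=400,
                  background_color="white").generate(text)
cloud.to_file("interview_cloud.png")
```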

A lot of my time training journalists these days is spent on these kinds of tools, and I’m always surprised at how little use is made of them. That needs to change if media is to find a way to make more use of the data it gathers in the process of creating stories.

Getting Paid for Doing Bad Things

I have recently received half a dozen offers to place links to reputable companies’ websites in my blogs.

Think of it as product placement for the Internet. It’s been around a while, but I just figured out how it’s done, and it made me realise that the early dreams of a blogging utopia on the web are pretty much dead.

Here’s how this kind of product placement works. If I can persuade you to link to my product page in your blog, then my product will appear more popular and rise up Google’s search results accordingly. Simple.
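
The mechanics rest on PageRank-style link analysis: the more pages link to yours, the more authoritative yours looks. A toy illustration using networkx’s PageRank implementation, with made-up sites and links (and with the caveat that Google’s real ranking uses far more signals than this):

```python
# How paid inbound links nudge a page up a PageRank-style score.
import networkx as nx

web = nx.DiGraph()
web.add_edges_from([
    ("blog_a", "news_site"), ("blog_b", "news_site"),
    ("blog_a", "product_page"),
])
before = nx.pagerank(web)["product_page"]

# The company pays two more bloggers to embed links in their copy.
web.add_edges_from([("blog_b", "product_page"), ("blog_c", "product_page")])
after = nx.pagerank(web)["product_page"]

print(f"product_page PageRank before: {before:.3f}, after: {after:.3f}")
```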

An ad wouldn’t work: Google would see it was an ad and discount it. So one increasingly popular approach is for you to pay me to include a link in my blog. I mean, right in it: not as a banner, or a ‘sponsored by’, but in a sentence, embedded, as it were, inside my copy.

I had some trouble getting my head around this, so I’ll walk you through it. I add a sentence to my blog post, then turn one of the words in it into a link to the company’s website. For my trouble I get $150. The company, if it gets enough people like me to do this, will see its website rise up through the Google ranks.

This is what the Internet, and blogs, have become. A somewhat seedy enterprise where companies–and we’re talking reputable companies here–hire ad companies to hunt out people like me with blogs that are sufficiently popular, and vaguely related to their line of business, to insert a sentence and a link.

If you’re not sure what’s wrong with this, I’ll tell you.

First off, it’s dodgy. If Google finds out about it, it will not only discount the link in its calculations but ban the website–my blog, in other words–from its index. Google doesn’t like mischief like this because it corrupts its search results.

That’s why a) the blog needs to look vaguely related and b) it can’t just be any old sentence that includes the link. Google’s computers are sharp enough to spot nonsense.

That’s why kosher links are so valuable, and why there’s a business in trying to persuade bloggers like me to break Google’s rules. If I get banned, my dreams of a profitable web business are gone; the company and the ad firm lose nothing.

Second, it’s demeaning. It works on the assumption that all blog content is basically hack work and the people who write it are for sale. I think that’s why I loathe it so much. It clearly works, though: when I got back to one company that approached me, I was told the client’s request book had already been filled.

With every mercenary link sold, they devalue the web. The only thing that makes my content valuable is that it’s authentic. It’s me. If I say I like something, I’m answerable for that. Not that people drop by to berate me much, but the principle is exactly the same as the journalistic one: your byline is your bond, not a checkbook.

Gay Lesbian Syrian Blogger? Or a Bearded American from Edinburgh?

Here’s a cautionary tale about how hard it is to verify whether someone is who they say they are:

Syrian lesbian blogger is revealed conclusively to be a married man

Tom MacMaster’s wife has confirmed in an email to the Guardian that he is the real identity behind the Gay Girl in Damascus blog

[Photo: Tom MacMaster, the American based in Scotland revealed as the ‘Syrian lesbian blogger’. Public domain]

The mysterious identity of a young Arab lesbian blogger who was apparently kidnapped last week in Syria has been revealed conclusively to be a hoax. The blogs were written not by a gay girl in Damascus, but by a middle-aged American man based in Scotland.

The Guardian, frankly, has not covered itself in glory on this issue. The story itself makes no mention of the fact that the paper was itself duped. It was, after all, bloggers who did the detective work that uncovered the hoax, not the paper. There’s this mea culpa, buried deep in a secondary story, but it doesn’t apologise for misleading readers for more than a month:

The Guardian did not remove all the pictures until 6pm on Wednesday 8 June, 27 hours after Jelena Lecic first called the Guardian. It took too long for this to happen, for which we should apologise (see today’s Corrections and clarifications). The mitigating factors are that we first acted within four hours but compounded the error by putting up another wrong picture, albeit one that had been up on our website for a month, was unchallenged and was thought to have come directly from “Amina”. We know for a fact that the two pictures are of Jelena Lecic, but we didn’t know much else until this evening. But we do know that when using social media – as we will continue to do as part of our journalism – the Guardian will have to redouble its efforts in establishing not just methods of verification, but of signalling to the reader the level of verification we think we can reasonably claim.

And even now The Guardian hasn’t corrected itself everywhere: this piece is still up, uncorrected, illustrating some other bad journalistic habits by not sourcing the story or flagging anything as unconfirmed:


The only suggestion that something is amiss is this at the end:

• This article was amended on 7 June 2011 and again on 8 June 2011 after complaints that photographs accompanying articles relating to Amina Araf showed someone other than the abducted blogger. The photographs have been removed pending investigation into the origins of the photographs and other matters relating to the blog.

Bottom line: journalists have got to be smarter. Smarter about the old things, such as dual sourcing, being sceptical about everything (a lesbian blogger in Damascus posting pictures of herself and using her real name? Even the author of the Guardian pieces was using a pseudonym, itself a no-no) and doing some basic legwork to authenticate the person. And smarter about the new stuff: using the same tools the bloggers themselves used to uncover the real person behind it. (Those people could be forgiven for not having done this earlier: they, after all, are a community, and accepted ‘her’ as one would in such a community.)

So what are those ‘new’ tools?

  • Basic search. Do we know everything about this person? What kind of online footprint did they have before this all happened?
  • Check the photos’ origins. Not always easy, but worth doing: file names, captions, image dates, and any metadata hidden in the image file (see the sketch after this list).
  • Check the IP addresses of emails and other communications.
  • Check the website/blog registration. Where? By whom?
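
On the hidden-data point: a minimal sketch of pulling EXIF metadata out of an image with the Pillow library (pip install Pillow); the filename is a stand-in:

```python
# Dump whatever EXIF metadata an image carries: camera make and model,
# timestamps and sometimes GPS data, all useful for checking where and
# when a photo really came from.
from PIL import Image
from PIL.ExifTags import TAGS

img = Image.open("suspect_photo.jpg")
exif = img.getexif()

if not exif:
    print("No EXIF data found")
for tag_id, value in exif.items():
    print(f"{TAGS.get(tag_id, tag_id)}: {value}")
```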

Journalists need to learn these new tools. And we need to learn them quickly.

We also need to find better ways to correct things when we get them wrong and, frankly, to say sorry. Here are some other outlets that fell for it and had yet, at the time of writing, to either apologise or correct their stories:

WaPo: Elizabeth Flock, “‘Gay girl in Damascus’ Syrian blogger allegedly kidnapped,” June 7, 2011

CNN: “Will gays be ‘sacrificial lambs’ in Arab Spring?”

AP (via Timesonline.com): Syrian-American gay blogger missing in Damascus

NYT (since corrected, sort of, but the comments are intriguing. Readers are gullible too, although they might reasonably feel aggrieved that the NYT didn’t do its job in checking the facts): After Report of Disappearance, Questions About Syrian-American Blogger – NYTimes.com

More links:

Open door: The authentication of anonymous bloggers – Comment is free – The Guardian

Gay Girl in Damascus blog extracts: am I crazy? Maybe – World news – The Guardian

Syrian blogger Amina Abdallah kidnapped by armed men (an example of The Guardian being duped)

Wikipedia: Amina Abdallah Araf al Omari – Wikipedia, the free encyclopedia

Podcast: The IMF’s Bad Dream

(Not tech-related, this, so please skip if the IMF and Indonesia don’t float your boat.) This is the BBC World Service Business Daily version of my piece on the IMF’s role in the Asian financial crisis of 1997/8. (The Business Daily podcast is here.)

[Audio clip]

To listen to Business Daily on the radio, tune into BBC World Service at the following times, or click here.

Australasia: Mon-Fri 0141*, 0741
East Asia: Mon-Fri 0041, 1441
South Asia: Tue-Fri 0141*, Mon-Fri 0741
East Africa: Mon-Fri 1941
West Africa: Mon-Fri 1541*
Middle East: Mon-Fri 0141*, 1141*
Europe: Mon-Fri 0741, 2132
Americas: Tue-Fri 0141*, Mon-Fri 0741, 1041, 2132

Thanks to the BBC for allowing me to reproduce it as a podcast.

Libya’s Stuxnet?

A group of security professionals who have good credentials and strong links to the U.S. government have outlined a Stuxnet-type attack on Libyan infrastructure, according to a document released this week. But is the group outlining risks to regional stability, or is it advocating a cyber attack on Muammar Gadhafi?

The document, Project Cyber Dawn (PDF), was released on May 28 2011 by CSFI – the Cyber Security Forum Initiative, which describes itself as

[a] non-profit organization headquartered in Omaha, NE and in Washington DC with a mission “to provide Cyber Warfare awareness, guidance, and security solutions through collaboration, education, volunteer work, and training to assist the US Government, US Military, Commercial Interests, and International Partners.”

CSFI now numbers about 7,500 members and runs an active LinkedIn forum.

To be clear, the document does not advocate anything. It merely highlights vulnerabilities, and details scenarios. It concludes, for example:

CSFI recommends the United States of America, its allies and international partners take the necessary steps toward helping normalizing Libya’s cyber domain as a way to minimize possible future social and economic disruptions taking place through the Internet.

But before that it does say:

A cyber-attack would be among the easiest and most direct means to initially inject into the systems if unable to gain physical engineering attacks against the facility. Numerous client-side attack vectors exist that support payloads capable of compromising SCADA application platforms.

Elsewhere it says:

The area most vulnerable to a cyber-attack, which could impact not only the Libyan’s prime source of income, but also the primary source of energy to the country, would be a focused attack on their petroleum refining facilities. Without refined products, it is difficult to fuel the trucks, tanks and planes needed to wage any effective war campaign.

The document itself is definitely worth a read; it doesn’t just focus on the cyberweapon side of things. Complicating matters is the fact that one of the contributors to the report, a company called Unveillance, was hacked by a group called LulzSec around the time the report was being finished. It’s not clear whether this affected the release of the report.

Emails stolen from Unveillance and posted online by LulzSec indicate that two versions of the report were planned: one public one, linked to above, and one that would “go to staffers in the White House.” In another email a correspondent mentions an imminent briefing for Department of Defense officials on the report.

The only difference between the two reports that I can find is that the names of some SCADA equipment in Libya have been blacked out in the public version. The reports were being finalized when the hack took place–apparently in the second half of May.

Other commentators have suggested that we seem to have a group of security researchers and companies linked to the U.S. government apparently advocating what the U.S. government, in its own report International Strategy for Cyberspace, released May 17, would define as an act of cyberwar.

I guess I’m surprised by something else: that we have come, within a few short months, from thinking of Stuxnet as an outlier–a sobering and somewhat shocking wake-up call to the power of the Internet as a vector for taking out supposedly resilient and well-defended machinery–to having a public document airily discussing the same thing, only this time against non-nuclear infrastructure.

(The irony probably won’t escape some people that, according to a report in the New York Times in January, it was surrendered Libyan equipment that was used to test the effectiveness of Stuxnet before it was launched. I’ve yet to be convinced that that was true, but it seems to be conventional wisdom these days.)

Frankly, I think we have to be really careful how we go about discussing these kinds of things. Yes, everything is at arm’s length: just because bodies such as CSFI have photos of generals on their web page, and their members talk about their reports going to the White House, doesn’t mean that their advice is snapped up.

But we’re at an odd point in the evolution of cyberwar, and I don’t think we have really come to terms with what we can do, what others can do, and the ramifications of both. Taking out Libyan infrastructure with a Stuxnet 2.0 may sound good, but it’s a road we need to think carefully about before heading down.

The Gmail Phish: Why Publicize, and Why Now?

This Google Gmail phishing case has gotten quite a bit of attention, so I thought I’d throw in my two cents’ worth. (These are notes I collated for a segment I did for Al Jazeera earlier today. I didn’t do a particularly good job of getting these points across, and some of the material came in after it was done.)

Google says the attack appears to originate from Jinan, but doesn’t offer evidence to support that. I think it would be good if it did. Jinan is the capital of Shandong Province, but it’s also the seat of a military region, one of at least six where the PLA has a technical reconnaissance bureau. These bureaus are responsible for, among other things, the exploitation of foreign networks, which might include this kind of thing. The city is also where the Lanxiang Vocational School is based, which was linked to the December 2009 attacks on Google’s back-end systems; those attacks also targeted human rights activists. Lanxiang has denied any involvement in the 2009 attacks.

I’d be very surprised if this kind of thing wasn’t going on all the time. And I’m very surprised that senior government officials from the U.S., Korea and elsewhere are supposedly using something like Gmail. There are more secure ways to communicate out there. I think it’s worth pointing out that this particular attack was first identified by Mila Parkour, a researcher, back in February. Screenshots on her blog suggest that at least three U.S. government entities were targeted.

I asked her what she thought of the release of the news now, four months later. Does this mean, I asked, that it took Google a while to figure it out?

As for any other vendor, investigations take time especially if they do not wish to alert the actors and make sure they shut down all the suspicious accounts.

And why, I asked, are they making it public now?

I think it is great they took time to unravel and find more victims and try to trace it. Looks like they exhausted all the leads and found out as much as they could to address it before going public. It has been three months and considering that hundreds of victims [are] involved, it is not too long.

This is not the first time that Google and other email accounts have been hacked in this way, and it’s probably not the last. It’s part of a much bigger battle going on. Well, two: one pits China–almost certainly behind the attack, or at least the ultimate beneficiary of any data stolen–against regional and other rivals; the other is Google’s practice of making these things public. For Google it’s a chance to point out the kind of pressures it and other companies are under in China. Google in January 2010 said it and other companies had been targeted by a sophisticated attack, originating from China, that gained unauthorized access to parts of its network.

Google says it went public because it wants to keep its users safe. This from Myriam Boublil, Head of Communications & Public Affairs at Google Southeast Asia:

“We think users should be aware of the disturbing campaign we’ve uncovered to collect user passwords and monitor user email.  Our focus now is on protecting our users and making sure everyone knows how to stay safe online”

This attack is not particularly sophisticated, but it involves what is called spear phishing, which does involve quite extensive social engineering and reveals that the object of the attacker’s interest is not random but very, very specific. If you judge the perpetrator of a crime by their victims, you don’t have to be a rocket scientist to figure out who is the ultimate recipient of any intelligence gathered.
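
For what it’s worth, one of the oldest tricks in this kind of phishing is a link whose visible text shows a trusted address while the underlying href points somewhere else. A minimal sketch of that one check; the HTML snippet is made up, and real detection is of course much harder:

```python
# Flag links whose visible URL text doesn't match where they point.
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.href, self.text = None, ""

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")

    def handle_data(self, data):
        if self.href is not None:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self.href:
            shown = urlparse(self.text.strip()).netloc  # set only if the text looks like a URL
            actual = urlparse(self.href).netloc
            if shown and shown != actual:
                print(f"Suspicious: shows {shown!r} but points to {actual!r}")
            self.href, self.text = None, ""

LinkChecker().feed(
    '<a href="http://evil.example.net/login">http://mail.google.com</a>')
```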