Using Google to Predict the Future

The Economist’s search for quirky indicators, in the piece Alternative indicators: Behind the bald figures, turns up an elegantly simple proposal to measure economic confidence: searches in the U.S. on Google for “gold price”.

“But the hottest tip came from Edward Ritchie, an investment analyst in London. He tracks Google searches for the “gold price” as an indicator of economic confidence. This does not follow the gold price itself. For example, during most of 2008 when the world’s financial system was melting down, the gold price tumbled yet the number of searches soared. The number of gold-price searches shoots up when American consumer confidence dives and subsides when households perk up again (see chart). That makes it a handy device for spotting turning-points in economic confidence, with the added advantage that the data are available earlier than for conventional survey-based figures. Worryingly, the number of searches has recently vaulted above its 2008 peak, signalling the possibility of a double dip.”

Here’s the graph:

I’m a big fan of using Google search to measure, track and predict things. A few of my previous posts on the matter. And no, I’ve not made any money so far out of this crystal ball.

How To Use Google To Get Round Super Injunctions

Technorati’s Decline, Death of Blogging?

Googling the Tsunami

Google’s Suicide Watch: where I googled the word “suicide”

Has Quora Peaked?

Fail, Seinfeld and Tina Fey: A Zeitgeist

The Financial Crisis in Charts

Hoodiephobia, Or We Don’t Lie to Google

And this one from 2006: Mapping Trends With Google

Visualizing England’s Woes

I hope I’m proved wrong in this case, but this is a visualization that does what any great visualisation should: it lets you find your own story. In my case I’m convinced that England’s football woes lie in the fact that not only do foreigners squeeze the natural wellspring of talent in the domestic game, but that those English players that do thrive have so little experience of any other leagues—save a few games a season in European competitions—that they’re shorn of any real breadth to their play.

Here’s a chart that illustrates these two facts brilliantly. The first illustrates how many other countries’ squads have players in the English game (I don’t need to explain, but the squads competing are at the top and the country leagues are at the bottom):

image

And here’s the other way: what leagues the English team play in.

image

Yes, one. (Three in 2006, two in 2002.)

The graphic, by the way, comes from Brazil’s Estadão, and their data goes back to 1994. I don’t speak a word of Portuguese but it’s intuitive and telling. Good stuff. Now let’s hope I’m proved wrong and the English team somehow scrape through.

Hoodiephobia, Or We Don’t Lie to Google

Boris Johnson the knight

Does what we search for online reflect our fears?

There’s a growing obsession in the UK, it would seem, with ‘hoodies’—young people who wear sports clothing with hoods who maraud in gangs. Michael Caine has just starred in a movie about them (well, a revenge fantasy about them.) This Guardian piece explores the movie-making potential of this phenomenon.

Recently a female documentary film maker was saved from a group of iron bar-wielding “feral girls” by the bike-riding mayor of London (I’ve always wanted to write the headline for the story).

So is this “growing fear” reflected online?

Well, yes, it is.

Here’s what a graph of British people searching for ‘hoodies’ looks like:

image

As you can see, it’s been a growing interest, more than doubling in the past five years.

But it’s also showing a weird seasonal element. Interest drops off in the summer months, and then rises towards the end of the year. Every year for the past five years, searches have peaked in either December or November. The lowest point each year is June or July.

I don’t know why that is. One guess would be that in the summer attacks tail off. It would be interesting to see if there’s any correlation there with the actual figures on attacks. (Update: Commenters have rightly pointed out that the seasonal interest probably has more to do with online shoppers. Thanks, and sorry for not thinking of this.)
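For anyone who'd rather check the seasonal pattern than eyeball it, here's a minimal sketch of how one might pull the peak and trough month out of each year. The function name and the numbers are my own invention for illustration; they are not the real search figures, and it assumes you've already got monthly index values from somewhere.

```python
from collections import defaultdict


def peak_and_trough_months(monthly):
    """Map each year to (month with the highest value, month with the lowest).

    `monthly` is a dict of {(year, month): index_value}.
    """
    by_year = defaultdict(dict)
    for (year, month), value in monthly.items():
        by_year[year][month] = value

    result = {}
    for year, months in sorted(by_year.items()):
        peak = max(months, key=months.get)
        trough = min(months, key=months.get)
        result[year] = (peak, trough)
    return result


# Made-up monthly values (January to December), purely for illustration.
made_up = {
    2007: [40, 35, 30, 28, 25, 20, 22, 27, 38, 50, 62, 60],
    2008: [45, 40, 36, 33, 30, 26, 24, 31, 44, 58, 70, 75],
}

sample = {}
for year, values in made_up.items():
    for month, value in enumerate(values, start=1):
        sample[(year, month)] = value

seasonal = peak_and_trough_months(sample)
print(seasonal)
```

With real data, a result where every year peaks in month 11 or 12 and bottoms out in month 6 or 7 would match the pattern in the chart.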

The Guardian piece quotes research by the group Women in Journalism back in March as finding that, among other things, 79% of adults are more wary of teenage boys than they were a year ago, and that the most commonly used descriptions of such boys in the UK press were ‘yobs’ and ‘thugs’ followed by ‘sick’, ‘feral’, ‘hoodies’ and ‘louts’ (PDF version of the report is here.)

Online, however, the trend is clearer: ‘Hoodie’ (light blue) is the preferred search term, and has been since late 2006, replacing the ‘thug’ and ‘scum’ of the mid 2000s:

image

I don’t know whether this is meaningful, but another word used to describe this perceived underclass of British youth is ‘chav’, a term of obscure origin. Compare searches for the words ‘chav’ and ‘hoodie’ and you see this:

image

Clearly the word ‘chav’ (in red) was most popular—or one that people were hearing but not familiar with, and so needed to look it up—in late 2004. It has been in decline since then and has indeed been overtaken by ‘hoodie’ (in blue):

image

I don’t know whether this is meaningful or not. Wikipedia cites ‘chav’ as common parlance by 2004 (unfortunately Google’s data does not go further back than that, but the rise in 2004 is clear.)

I tend to believe that Google searches are as revealing as anything else about what people are interested in, or worried about—indeed more so than surveys, because people don’t lie to Google.

The Heatline of a Story

Google, apparently prodded by the ground covered by Twitter news, has introduced a feature on its Google News search results that indicates what one might call the ‘heat’ of a story (how many sources are covering it over time):

image

As with Google Search Trends, the stories below the chart are linked to the graph via letters (although one can’t click on the letters.)

The chart appears to the right of any news search:

image

I think it’s clever, and a good way of merging two different Google services (and a third: the images in the bottom right hand corner.)

A note at the bottom explains the placement of stories on the graph:

The selection and placement of stories on this page were determined automatically by a computer program.

The time or date displayed (including in the Timeline of Articles feature) reflects when an article was added to or updated in Google News.

The example above, concerning phone tapping in the UK, indicates that things have quietened down a bit, although that could have more to do with it being a weekend than anything else.

I would imagine this kind of thing would be useful, too, for news organisations to let readers navigate big stories. The sheer number of stories on one particular issue makes it hard for users to find the most relevant ones, or to be able to see where that story sits in their coverage timeline.

Google and History

image

I had gotten excited about Google’s timeline search before, but hadn’t seen this: Google is mining not just text for the dates of more recent stuff, but everything, stretching back into the mists of time, culled from Google Books:

The result is an odd but interesting automatically generated history of whatever you’re looking for.

In this case, I was looking for “cleft stick”. This is what appeared:

image

And the first few were all about how women found guilty of disrespect, swearing, reveling or other forms of subversion had their tongues put into a cleft stick, that is, a stick with the end split and the tongue inserted:

image

The sources are varied, revealing a fascinating brutality and harassment of women which went on for years:

In 1636, Elisabeth Aplegate was proclaimed guilty of the crime of swearing and reveling, and was required to stand in public with her tongue in a cleft stick.

1638 – The calmness with which even cultivated men then viewed the public whipping of women appears from the record by Governor Winthrop of the punishment of Mrs. Oliver in 1638. She was a woman of good character, but differed violently with the magistrates as to religious matters, for which she was reproved, and finally sentenced to have her tongue put in a cleft stick, and then to be whipped.

This is clearly where the term “caught in a cleft stick” comes from. But not, probably, exactly what we mean when we say it.

Google’s Suicide Watch

image

I don’t really know what to make of this, but I occasionally trawl Google Search Trends/Insights to see what people are looking for, and whether they’re changing much over the past few years.

This seems to me to be as good an indicator of things as anything else.

I did it back in 2005 with Web 2.0, the tsunami, the economic crisis and Seinfeld and Tina Fey.

But how about this one: the rise and fall of the search for “commit suicide painlessly”: things had been pretty flat since 2004 and then suddenly, over a period of three or four months from October 2008 to March 2009, the index goes from about 18 to 100:

image

It’s not good to read too much into Google Insights for Search, but I reckon there’s some interesting stuff in here. For one thing, the spike is a real one. That’s no blip.

(I should point out that these figures are relative. What Google does is to take the highest point, meaning the largest volume of searches for that term since they started saving data in 2004, call that 100, and then work out the volume at every other point in relation to it.)
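To make the "relative" bit concrete, here's a minimal sketch of that kind of scaling. It only mirrors the description above; Google's actual calculation may well involve more steps. The function name and the raw volumes are hypothetical, invented for illustration.

```python
def to_relative_index(volumes):
    """Scale raw search volumes so the largest value becomes 100."""
    peak = max(volumes)
    if peak == 0:
        # No searches recorded at all: every point stays at zero.
        return [0 for _ in volumes]
    return [round(100 * v / peak) for v in volumes]


# Hypothetical raw volumes, invented purely for illustration.
raw_volumes = [90, 95, 180, 500, 350, 200]
index = to_relative_index(raw_volumes)
print(index)  # [18, 19, 36, 100, 70, 40]
```

The point to take away is that an index of 100 says nothing about how many people searched, only that this was the busiest moment for that term in the period covered.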

Secondly, by mid April things on a global scale return, more or less, to where they had been in August 2008, before the crisis hit:

image

But if you look at individual countries, the picture is more complex:

In the U.S., the search term started from a relatively low base (actually it shows up as zero, meaning not enough data), rose to 100, and then fell back by April to around 20. Only in the past few weeks does it seem to have returned to where it was to start with:

image

Look at the UK, by comparison, and we’re not there yet: from zero it rose (a week or so earlier, apparently) to 100 by January, and then dropped, but only to around 40. It’s now around 35:

image

In other words, if one could take this data literally, the British are still very depressed and are still likely to be exploring ways of committing suicide. That’s pretty scary.

By the way, if you take these figures and compare them with the official UK statistics [PDF], they don’t tell you a lot. Brits have been killing themselves less since the late 1990s (though without figures from 2008 until now):

image

This pretty much dovetails with the Google results, 2004-9

image

PS I should point out that I used the term above because, having searched for “how to commit suicide” on the Google Trends page, I noticed that “commit suicide painlessly” was a popular search, rising 190%. Confusingly, “how to commit suicide” has, as a search, been trending downward since 2004:

image

PPS Google’s nonprofit arm does use its data for this kind of thing, at least in the area of flu. It now carries data on Australia, New Zealand, Mexico and the U.S.:

image

The Financial Crisis in Charts

Thought I’d offer a brief history of the financial crisis as seen through Google Insights, which measures the popularity of a search term over time.

image

Interest in the word subprime spiked a couple of times in 2007 (above) before we figured out it was all about toxic debts (below):

image

and credit crunches:

image

Then we realised suddenly we had to learn a bit more about Freddie Mac and Fannie Mae:

image

and even basic terms like liquidity:

image

Useful information. And it wasn’t just an economics lesson. We had to gen up on countries that we had recently given little attention to, like Iceland:

image

Although it’s worth keeping it all in perspective. Search for the word meltdown, a commonly used term to capture the excitement of the past few weeks, and you get this. Clearly rising interest, but that spike in 2005? It’s linked to Ice Age: The Meltdown, which grossed $70 million at the box office in its debut week:

image

Hollywood still trumps global financial disaster, I guess.

Fail, Seinfeld and Tina Fey: A Zeitgeist

I use Google Insights quite a bit—I find it a very useful way to measure interest in topics. Here’s one I keyed in just for the hell of it. Red is the word success and blue is the word fail. The chart covers from 2004 to today:

image

What seems to have happened is a surge of interest in the word fail relative to the word success.

To the point where, in the past week or two, it’s become a more popular word to include in search terms than the word success, for the first time in four years.

Just to magnify that last bit:

image

What does this mean? Probably not very much. But I found it intriguing. Are we now more interested in failure than success, or is it just this ridiculous new fascination with the word FAIL?

I think these Google searches reveal a lot more than we’re really giving them credit for. If nothing else, I believe they offer a pretty good idea of a celebrity’s career trajectory.

Take these clowns, for example. Here’s the gradually declining interest in Bill Gates (red) and Seinfeld (blue), revived, briefly, by the Microsoft ads:

image

(The blips in 2006 and 2007 for Seinfeld, by the way, are ‘Kramer’s’ racial slurs and Seinfeld’s aptly titled Bee Movie.)

Here are the two comediennes, Sarah Palin and Tina Fey, their careers apparently forever intertwined. Palin is of course red:

image

A close-up reveals that Palin might be on the decline, whereas Tina is on the up:

image

Because all these things are relative, put Seinfeld and Tina Fey (red) in the same room and you get an idea of how big a shot she has become this year:

image

Just to stress that last spike:

image

Seinfeld was right when he said he was a has-been. Still a funny guy though.

And I can’t resist taking a look at how Techcrunch and Scoble (blue) face up:

image

Ouch. Seems Scoble started losing ground in 2006. But hey, who knows? With this new dotcom crunch, maybe he’ll have the last laugh. Gotta admire someone who’s held his own for 4+ years.

Talking of not leaving the party after it’s over, how does Vista shape up against XP? The chart is surprisingly revealing. Vista (red) enjoys a spike in early 2007 on its launch, but never seems to be able to shake off the XP shadow:

image

That’s one FAIL, I reckon.

Who says graphs are boring?

The Freshness, and T-Shirt Worthiness, of News

(cross-posted from a Loose Wire sister site, ConvergedMedia.net)

image

CNN.com has a good way of informing readers of the ‘freshness’ of news by adding notes in red to indicate when the story was added or updated. (In the example above it also adds a ‘developing story’ label.)

This kind of thing is helpful in that the site can still order stories by their importance, but also flag those that are being updated:

image

(It also adds a rather cute touch to its whacky stories, allowing readers to order a T-shirt with the headline on it:)

image

Click on the T-shirt logo and you’re taken to a page where you can order the shirt:

image

Breaking Out of Those Silos

image

If you’re looking for the future of news, a pretty good example of it is at UK startup silobreaker, which isn’t a farm demolition service but a pretty cool news aggregation and visualization site. In other words, it lets you look at news in different ways. And it’s caught the attention of Microsoft, which today announced it had selected the company for its Startup Accelerator program.

The website itself looks pretty normal on first glance–news on the left, three columns of stuff. But look closer. Four boxes on the right offer different sorts of information: a trends chart showing “media attention” (presumably the number of mentions in the news) of different Windows products:

image

Another shows the relationships between Rio Tinto, other companies, topics and cities:

image

And my favorite, a map showing all the places where things are happening in the news. Move your mouse over them and details will pop up in a small box:

image

Drop-down lists of topics along the top of the website allow you to select your area, and it’s a satisfying range to choose from. Open the terrorism page, for example, and you get a bunch of stories on terrorism, as well as a map of hotspots (already zoomed in on the Middle East and Central/South Asia), and a trend map showing how media interest in terrorism in Afghanistan has risen markedly in recent weeks against that of Iraq and the U.S. Who knows how accurate this stuff is, and where it comes from, but it’s still an interesting way to slice and dice the data:

image

Not everything works quite as it’s supposed to but there’s still lots of quality in here, and it puts pretty much every other news site to shame. And it’s not even as if these elements are particularly new; I’ve long sung the praises of newsmaps and mindmaps as a way for online newspapers to get with the program, and it’s frankly been disappointing that so few have tried these things out.