An elegantly simple proposal for measuring economic confidence turns up in The Economist’s search for quirky indicators: U.S. searches on Google for “gold price”. From the piece Alternative indicators: Behind the bald figures:
“But the hottest tip came from Edward Ritchie, an investment analyst in London. He tracks Google searches for the “gold price” as an indicator of economic confidence. This does not follow the gold price itself. For example, during most of 2008 when the world’s financial system was melting down, the gold price tumbled yet the number of searches soared. The number of gold-price searches shoots up when American consumer confidence dives and subsides when households perk up again (see chart). That makes it a handy device for spotting turning-points in economic confidence, with the added advantage that the data are available earlier than for conventional survey-based figures. Worryingly, the number of searches has recently vaulted above its 2008 peak, signalling the possibility of a double dip.”
Here’s the graph:
I’m a big fan of using Google search to measure, track and predict things. And no, I’ve not made any money so far out of this crystal ball. Here are a few of my previous posts on the matter:
How To Use Google To Get Round Super Injunctions
Technorati’s Decline, Death of Blogging?
Googling the Tsunami
Google’s Suicide Watch: where I googled the word “suicide”
Has Quora Peaked?
Fail, Seinfeld and Tina Fey: A Zeitgeist
The Financial Crisis in Charts
Hoodiephobia, Or We Don’t Lie to Google
And this one from 2006: Mapping Trends With Google
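For anyone who wants to test the gold-price idea themselves, the claim boils down to an inverse correlation between two series: search volume and consumer confidence. Here’s a minimal sketch in Python, assuming you have already got both series as equal-length lists of numbers. The function name pearson is my own, and none of this comes from Mr Ritchie’s or The Economist’s actual method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series.

    xs and ys are assumed to be aligned, e.g. one value per month for
    search volume and one per month for a consumer confidence index.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)
```

A result close to -1 would back up the claim that searches rise as confidence falls; a result near zero would not. It says nothing about which series moves first, which is the part that would actually make the indicator useful.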
There’s a new search engine out there, according to the Guardian, and it sort of tries to figure out what you’re looking for. Which is good. Google searches are great so long as they’re simple. But is Powerset up to snuff?
Here are some searches I did (betraying my interests):
Pretty good stuff. And how about me?
Even less obvious matches seem to work:
Also right on the money. Nixon got second place when I asked “who was the first U.S. president to resign?”, which is good enough:
Other searches, though (“how many copies of Office 2007 has Microsoft sold?” and “how far is it from London to Sydney”), weren’t any good at all.
Of course, Powerset is so far only parsing Wikipedia articles (“only”: there are 2.3 million of those in the English language). And ask Google the same questions and you’re also likely to get the answers high up: first in the case of Nixon, the Taser inventor and Suharto’s resignation, though nowhere on my own alleged career (fittingly). Sydney/London throws up a WikiAnswers page, and I’ve given up hope of finding out how many copies of Office 2007 have been sold.
Still, it’s early days for something like this. There’s no question that a better search engine will one day come along, perhaps belonging to Google, perhaps not. Will it need to parse every sentence for meaning? Who knows?
This week on WSJ.com (subscription only, I’m afraid) I wrote about web spam: the growing penetration of faux websites that ride up the search engine rankings and muddy the Internet for all of us. I based it around the recent case of subdomain spam, well documented by blogs like Monetize. Briefly, websites controlled by one Moldovan hit the high rankings on several major search engines using techniques that are imaginative, but not exactly beyond the intelligence of savvy search engine builders. It’s not as intrusive as spam in your inbox but it’s trashing the web and undermining the usefulness of search engines.
But it’s not just ordinary search results that get spammed. It’s news. A search for “ringtones” on Google News, for example, throws up “free mono ringtones” as the top item:
(“Ringtone” throws up similar results.) Amazingly, not only is it the top story, but all six “related” stories you can see as green links below the four are from the same domain, advertising a range of goods that can hardly be lumped together with ringtones, including sildenafil and tenuate. (Searches for those words on Google News also have the same domain top ranked, at least at the time of writing. Here and here. In fact the results for tenuate do not throw up a single news story; all eight matches are web spam.)
The sites in question are all subdomains of www.vibe.com, an online magazine which is indexed by Google News for its pieces on musicians. The pages that hit the top rank of results for ringtone and ringtones, however, are community messageboard pages, and clearly marked as such. That makes me wonder either how the web spammer is fooling the Google bots into indexing pages which are clearly not news by any definition, or why Google’s bots aren’t doing the job they’re supposed to be doing.
Yahoo! News’ search doesn’t do much better: its first hit is a web spam site under the domain www.ladysilvia.net, which doesn’t even pretend to be a news site:
(MSN’s news search comes out well, without any spam in sight, as does A9, which is basically the same engine.) But why are these sites getting indexed and included in news searches? I can only assume ringtones are such big business that it’s worth the web spammers doing their damnedest to push their results up not only ordinary search rankings but news rankings too. Still, I would have thought Google and Yahoo! would be on top of this. Apparently not.
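One crude way to spot this kind of thing in a page of results is to count how many hits share the same parent domain. Here’s a rough sketch in Python, assuming you already have the result URLs as a list. The function names are mine, and taking the last two labels of the hostname is a simplification that gets suffixes like .co.uk wrong; I have no idea how the search engines themselves check for this.

```python
from collections import Counter
from urllib.parse import urlsplit


def parent_domain(url, labels=2):
    """Return the last `labels` parts of the URL's hostname.

    Naive on purpose: 'a.b.example.com' becomes 'example.com'.
    URLs without a scheme have no parsed hostname and yield ''.
    """
    host = urlsplit(url).hostname or ""
    parts = host.lower().split(".")
    return ".".join(parts[-labels:])


def domain_share(urls):
    """Map each parent domain to its share of the given result URLs."""
    counts = Counter(parent_domain(u) for u in urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}
```

If one parent domain accounts for most of a results page, as with the related stories above, that is at least a hint that subdomains are being used to crowd out everything else.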
As part of Google’s localization push, the search engine will figure from your IP address what country you’re in and configure itself accordingly into a local site, with a local country suffix, and, sometimes, local language options. Fair enough, except for travellers and people who are quite happy with things as they are.
Running Google in Hong Kong, for example, throws up pages on a www.google.com.hk page that, without asking first, look like this:
With literally no English language entries on there, or any way to change the language back from Chinese, on the results page or the preferences page. In fact, I cannot figure out how to change it back at all. So I’m moving over to Clusty, at least for now. Help, anyone?
Will all libraries eventually be digital?
Seems a pretty obvious question (answer: yes) but the process is surprisingly slow. I do research online and use databases like Questia but there’s still a hell of a lot that hasn’t been made available. And a lot of what is scanned has not been scanned well, unless the original material contained a lot of misspelled names.
Anyway, here’s a glimpse of what may be happening soon. From the excellent OnlineJournalism.com Newsletter (the daily news Weblog of the USC Annenberg Online Journalism Review) comes a link to a report from CyberJournalist.net, which in turn “keyed in on an anonymous tip buried deep inside a Sunday New York Times feature” on Google and Microsoft: “Apparently Google plans to digitize every post-1923 [[correction: should be pre-1923; makes more sense. Thanks Jim]] text within the Stanford University Library, creating an enormous copyright-free resource available solely to Google users. The ambitious operation is codenamed Project Ocean, according to The Times’ unnamed source.”
Wow. That’s about 18 libraries, ranging from the Art and Architecture Library to the Linear Accelerator Center Library (although that link doesn’t work, which doesn’t augur particularly well…)
This comes on top of Google Print’s blurb search and Amazon’s Inside the Book search (both are shameless links to postings on this very site).
Here’s another wacky trick that Google have quietly introduced, adding to the impression they are fast cementing their role as one-stop portal: book searching. According to SearchEngineWatch (via the excellent TechDirt), Google Print is an experimental service that “indexes excerpts of popular books, blending the content from these works into regular Google search results”.
These excerpts are usually the blurb, for now. True to its apparent intention to make itself indispensable before it starts collecting cash, Google says book sellers pay nothing for links from these search results, and it is not benefiting if you make a purchase from one of these retailers. It’s likely that Google will eventually do what Amazon does already, namely offer full text searches of books, although these kinds of searches will have to be crippled in some way to prevent users from downloading whole books online.
Can’t remember where I read this, but of course all this has wonderful side-effects for those of us looking for something in a book we already own: so long as Amazon (or later, Google) have the book scanned, it would be quicker to do a keyword search there than to check the index, or leaf through the chapter list. Voila.