Software That Plays Tag

This week’s WSJ.com column (subscription only, I’m afraid) is about Jiglu, a sort of automatic tagging service you can see in action somewhere on this blog:

If you’re a writer, you hope your words will be etched in stone for eternity. If you’re a blogger, you’re happy if someone stumbles on your writings a few days after you posted them. Blogs, partly because they often consist mainly of commentary on things that have just happened, and partly because of the way they are structured (most recent postings first, making it easy to ignore everything you wrote before), are a transient medium. Rarely is a blog post treated as permanent. We write, then we forget.

The problem, I conclude, is that amidst all the writing, and despite the power of tagging

Blog posts, left to themselves, tend to have a short shelf life.

Briton Nigel Cannings thinks he has the solution to this: automatic tagging. He sees value in all those old blog posts of mine (he may be the only one) and reckons all that old content out there is a repository of wisdom that just needs to be sorted out better. Tagging it ourselves, he thinks, just isn’t enough because we don’t always see what we’ve written in a broader context. “Manual tagging is the first step” to sorting and storing blogs and other online content better, he says, “but it still relies upon people understanding themselves, whatever they’ve already written about, and how their content fits in with other people’s content.”

More at Loose Wire – WSJ.com.


How To Remember Stuff

I long suspected this was the case, and now we’ve proof: Try too hard to remember something and you can almost feel yourself forgetting it. Stop trying to remember and it will come back. Of course, this could be extended to other mental activity: Your brain can only cope with so much stuff, so better to let it float and do what it wants to do. If it’s a good brain and has plenty to feed on, it should give you what you want in its own sweet time. Hey, a slacker’s manifesto.

clipped from scienceblogs.com

One explanation for this fascinating failure of memory is retrieval-induced forgetting, in which the retrieval of closely related concepts and words actually competes with the word or concept you intended to retrieve (discussed previously). The intended item becomes available only after the residual activity among the incorrectly retrieved items has decayed.

Copernic’s New Search

Copernic have officially launched version 2 of their Desktop Search software. It’s been around in beta for a while and it’s excellent, though I’d still like to see more economical use of screen space. Not all of us are working on high-res big screens. The press release isn’t out yet but should be here when it is. Pre-release page is here.


Delicious Additions

Here’s a wonderful new addition to the del.icio.us process of adding a tag to a web page: the posting page predicts what you are typing and offers suggestions drawn from your existing tags.

Not just that: below the fields are some recommended tags you have already assigned to other pages which match ‘popular tags’ assigned by others to the page being tagged (there’s probably a better way of putting it, but it’s still early here):

Delicious

Just for good measure the posting page lists all your existing tags, the recommended tags highlighted in red.
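For the curious, here’s a rough sketch in Python of how the two behaviours described above might work. This is my guess, not del.icio.us’s actual code: the function names `suggest_tags` and `recommended_tags` and the sample tags are all made up for illustration. The first function matches what you’ve typed so far against your existing tags; the second picks out the tags you already use that also appear among a page’s popular tags.

```python
def suggest_tags(prefix, my_tags, limit=5):
    """Return the user's existing tags that start with the typed prefix."""
    prefix = prefix.lower()
    if not prefix:
        return []  # nothing typed yet, so nothing to suggest
    matches = sorted(t for t in my_tags if t.lower().startswith(prefix))
    return matches[:limit]


def recommended_tags(my_tags, popular_tags):
    """Return the page's popular tags that the user has already used elsewhere."""
    mine = {t.lower() for t in my_tags}
    return [t for t in popular_tags if t.lower() in mine]


# Hypothetical sample data, purely for illustration
my_tags = ["search", "software", "tagging", "tools", "blogs"]
popular = ["tagging", "folksonomy", "tools", "web2.0"]

print(suggest_tags("s", my_tags))          # ['search', 'software']
print(recommended_tags(my_tags, popular))  # ['tagging', 'tools']
```

The point of the second function is the overlap: by nudging you towards tags that both you and everyone else already use, the page quietly pulls the vocabulary together.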

Very nice. But not only nice: At a stroke this tackles a couple of perceived problems with del.icio.us: the supposed anarchy of people tagging the same pages with different tags, and the problem of you yourself ending up with too many different, but similar, tags for things that should probably have had the same tags. (There’s bound to be another side to this discussion where people would argue this ‘anarchy’ is part of the greatness of the tagging taxonomy, but I’m not going there this morning.)

Hats off to Joshua and the team. Amazing how simple and intuitive they have managed to keep del.icio.us, where it’s easy to do everything but very hard (at least for me) to explain.

dtSearch: Not Dead. Not Yet.

Despite my love of indexers (and I’m in Seventh Heaven now that all the big boys are throwing out desktop search engines like it was a Bay City Rollers’ reunion) I still stick with dtSearch for most of my searching. It’s expensive, it’s tough, it’s ugly, but it gets the job done. And now they’ve added a feature which might not get you too excited, but for me is key: better viewers (or file parsers, if you want to get technical) for Microsoft documents.

Version 6.5 of dtSearch Desktop (free to those who are 6.x users) means you can see Word documents or Excel spreadsheets or PowerPoint presentations in their original glory. Now folks are going to say, well I can do that with X1 or more or less any of the other indexers that include built-in viewers, but I’d like to correct you: You can’t. Well you can if you don’t have big files, but over a certain size, you will get an error. And I have big Word files, all tabled up, and they nearly always fail to appear. In dtSearch they did come up, but not with any decent formatting. Now they do. (Other features listed here.)

dtSearch, long the mainstay of a once sparse field, is not going away quietly. Good for them.

Another Indexing Program…

Further in my pursuit of the perfect search and indexing software, Sean Franzen points me to Vancouver-based Wisetech Software and their Archivarius 3000 which, he says, “recognizes more file formats than DiskMeta, allows you to index data on network drives and locate your indexes on network drives. The price is very competitive also. Development has been very active for the past six months.”

It looks interesting and worth checking out. On initial glance it lacks the thing I love most about X1, Enfish and the others: a built-in preview pane that lets you view the whole file, not just the context of the found string. Archivarius costs between $20 and $45, depending on whether you’re a student, an individual or a commercial entity.

Ukraine Weighs In On The Search Stakes

Another addition to my index of indexing programs: diskMETA, from <META> Inc. “the largest search engine provider in Ukraine and a leader in Cyrillic multilingual search engine morphology technologies”.

A press release issued today says diskMETA is one of the fastest desktop search engines, and is available both as freeware and shareware. The program “is intended for extra large data volumes, UP TO 100 GIGABYTES. It can create up to 100 indexes, index up to ONE MILLION various files. The search time is never more than ONE SECOND”. It works on all Windows platforms (98 or higher).

The file search works with Office document formats (DOC, XLS, RTF, TXT), HTML pages, CHM, PDF files, ZIP and RAR archives. There are three versions: Lite (free), Personal ($50) and Pro ($100), which supports morphological English searches and intranet-wide searches.

The search technology used in diskMETA, apparently, “has a long and glorious history. It is used for a decade in the nationwide biggest and most popular web search engine www.meta.ua, in a series of search tools for web-sites and CD-rooms installed in most governmental and financial national institutions” in the Ukraine.

My tupennies’ worth? It’s fast, intuitive and unfussy. You can also view the raw text in a special preview window, but it doesn’t support preview in the same way that X1, dtSearch or the new Copernic Desktop Search do. That said, it’s great to see a new player on the block, especially one so enthusiastic.

This week’s column – Hard-Disk Hunters

This week’s Loose Wire column is about hard disk indexers, a topic familiar to those of you reading this blog. 

CONSIDER THIS: Your hard drive probably contains more info than you could ever imagine. Say you’ve got a modest hard drive of 20 gigabytes. That’s the equivalent of about 20 copies of the Encyclopedia Britannica. Or 20,000 floppy disks. That’s a lot of stuff, and, chances are, you have little or no idea what’s actually on there or, if you do, how to find it. Be ignorant no more: Help is at hand.

Now, I know we’ve been here before. One of my bugbears has been the lack of a decent program to find files on your computer. By this I don’t mean looking for anything particularly obscure, just your last letter home, or the e-mail you got from the accounts department demanding your expense report from covering the Burma Campaign. Simple stuff, and it’s always annoyed me that Internet search engines do this so much better on the world wide Web than they do on our own Word files or e-mails. (Mac fans will chime in at this point and say they’ve always had this feature; Windows fans will say XP has its own search-and-index function. But, with respect to both groups, I’d say neither is particularly useful and, in the case of XP’s, practical. It’s clunky, hard to figure out, and slows your computer down to a snail’s pace.) But now sharp new programs promise to do something about this, and they are aimed directly at the casual user who just wants to find stuff, without a lot of fuss.

In the column I mention most of the indexers listed here. Full text at the Far Eastern Economic Review (subscription required, trial available) or at WSJ.com (subscription required). Old columns at feer.com here.

The New Search Wars

Search is getting big again. Will it work this time around?

Programs that search your hard drive have been around for a while, but few of them seem to last. There was Magellan, askSam (OK, still around, sort of), AltaVista’s Desktop Search, dtSearch (still going strong) and Enfish (still around, barely breathing). That was in the 1990s. But it’s only recently we’ve seen folk get really excited about the space again: There’s X1, Tukaroo (bought out pre-launch by Ask Jeeves), HotBot Search, and now something called blinkx (thanks, Marjolein, for pointing it out).

Blinkx was officially launched last month as “a free new search tool that thinks and links for you, eliminates the need for keywords or complex search methods, easily finding the information you seek whether it is on the Web, in the news or buried deep within files on your PC.” In other words, pretty much what the other guys do. I haven’t looked too closely at it, but the main idea, as co-founder Kathy Rittweger puts it, is easy search without the logistics: “By eliminating the mechanics of search, such as keywords or sorting through dozens of unqualified results, we drive users more quickly to their goal: finding something, even if they didn’t know it was there!”

That’s good, and I would have said before that that was the way to go, but nowadays I’m not so sure. I think that as disk space grows and people’s hard drives become more complex, different users need different grades of configurability. With most of these new search engines pitching to the ‘lite user’ there’s a danger the more serious document hunter gets left behind. It’s actually a simple calculation: Are you aiming at the casual user who is happy to stumble across a few documents they didn’t know they still had, or are you aiming at the user that needs to find all the documents relevant to their search?

Anyway, it’s good to see folk finally seeing this space for what it is: Horribly underserviced, full of missed opportunities and millions of folk lost on their own hard drives. With Google, Microsoft and others about to enter the fray, here’s hoping that we get something really good out of it.