Tag Archives: leader

links for 2008-09-11

  • Avego.com is where travelers cooperate to make the whole transport system more efficient, saving us all money, wasted time and reducing pollution.

    A 5-seat car traveling with only a driver is inherently inefficient, and yet 85% of the time, that’s how cars travel in much of the world. With our iPhone GPS technology, web services and your participation, we can fill up those empty seats.

  • Did I get enough exercise today? How many calories did I burn? Am I getting good quality sleep? How many steps and miles did I walk today? The Fitbit Tracker helps you answer these questions.

  • Swype was developed by founders Cliff Kushler and Randy Marsden, along with a very talented team of software programmers and linguists.

    Cliff is the co-inventor of T9, the standard predictive text-entry solution used on over 2.4 billion mobile phones worldwide. He is the named inventor on multiple patents related to alternative text entry.

    Randy is the developer of the onscreen keyboard included in Windows, with an installed base of over a half a billion units. He is a recognized leader in the field of assistive technology and alternative computer input.

    Together, their experience is unmatched in developing onscreen keyboard-based text input solutions for mobile touch-screen devices.

  • ShiftSpace (pronounced: §) is an open source browser plugin for collaboratively annotating, editing and shifting the web.

  • # Create and track invoices you issue to clients.
    # Determine what you’re owed, by whom, and when it’s due.
    # Keep track of timesheets for yourself and your employees.
    # Notify your clients of new invoices.
    # Create interesting reports and analyze payment history.
    # Save time & collect your money.

Counting the Words

I’ve been looking recently at different ways that newspapers can add value to the news they produce, and one of them is using technology to better mine the information that’s available to bring out themes and nuances that might otherwise be lost. But does it always work?

The most popular page on the WSJ.com website at the moment is Barack Obama’s speech, which has dozens of comments added to it (not all of them illuminating, but that’s another story). What intrigued me was the text analysis box in the text.

Click on that link and you see a sort of tag cloud of words and how frequently they appear in the text of the piece itself. Mouse over a word and a popup tells you how many times Obama used the word. “Black,” for example, appears 38 times; “white” appears only 29. That’s nearly 25% fewer times.
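For the curious, the counting behind a tag cloud like this is not much more than the following. This is a minimal sketch, not the Journal's actual tool: the function names `word_counts` and `percent_fewer` and the sample sentence are made up for illustration, and only the 38 and 29 come from the text analysis box.

```python
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Count case-insensitive word occurrences in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def percent_fewer(larger: int, smaller: int) -> float:
    """Return by what percentage `smaller` falls short of `larger`."""
    return (larger - smaller) / larger * 100

# Hypothetical sample text, just to show the counter working.
sample = "Black and white. Black, white and black again."
counts = word_counts(sample)
print(counts["black"], counts["white"])

# The two counts quoted in the post: 38 uses of "black", 29 of "white".
print(round(percent_fewer(38, 29), 1))  # 23.7, i.e. "nearly 25%"
```

Which rather makes the point: the counter knows how often a word appears and nothing at all about why.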

Interesting, but useful? My gut reaction is that it cheapens a remarkable speech–remarkable not because of its views, but remarkable because it’s a piece of oratory that could have been uttered 10, 20, 50, maybe even 100 years ago and still be understood.

My point? Analyzing a speech using a simple counter is not only pretty pointless–does the fact he said ‘black’ more times than ‘white’ tell us anything? What about the words he didn’t use?–but it paves the way to speechwriters running their own text analysis over speeches before they’re spoken. “Hey, Bob! We need to put more ‘whites’ in there otherwise people are going to freak out!” “OK how about mentioning you were in White Plains a couple of times last year?”

Maybe this already happens. But oratory is an art form: it doesn’t succumb to analysis, just as efforts to subject Shakespeare to text analysis don’t really tell us very much about Shakespeare.

The Journal is just messing around, of course, experimenting with what it can to see what might work. We’re merely watching a small episode in newspapers trying to be relevant. And it should be applauded for doing so. But I really hope that something more substantial and smart will come along, because this kind of thing not only misses the mark, but is in danger of quickly becoming absurd.

Perhaps more important, it fails to really add value to the data. Without any analysis of the frequency of words, there’s not really much one can say about the exercise except, maybe, “hmmm.” Compare that with a Canadian research project a couple of years back which developed algorithms to measure spin in the 2006 election there. They looked at politicians’ use of particular words: “exception words” (however, unless), for example, and the decreased use of personal pronouns (I, we, me, us), which might imply the speaker was distancing him- or herself from what was being said.
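To give a feel for what counting those markers might involve, here is a rough sketch. It is emphatically not the researchers' algorithm, which I haven't seen: the word lists are only the examples mentioned above, and the function name `marker_rates`, the per-100-words measure and the sample sentence are my own assumptions.

```python
import re

# Only the examples named in the post; a real study would use fuller lists.
EXCEPTION_WORDS = {"however", "unless"}
FIRST_PERSON = {"i", "we", "me", "us"}

def marker_rates(text: str) -> dict:
    """Return exception-word and first-person-pronoun rates per 100 words."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty input
    exceptions = sum(w in EXCEPTION_WORDS for w in words)
    first_person = sum(w in FIRST_PERSON for w in words)
    return {
        "exception_per_100": 100 * exceptions / total,
        "first_person_per_100": 100 * first_person / total,
    }

# Hypothetical sentence, not taken from any real speech.
speech = "We will cut taxes. However, unless growth returns, cuts must wait."
rates = marker_rates(speech)
print(rates)
```

The difference from a plain word count is that the rates mean something only when compared across speakers, which is the step the tag cloud leaves out.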

That sounds smart, but was it revealing? The New Scientist, writing in January 2006, said the results concluded that the incumbent, Prime Minister Paul Martin, of the Liberal Party, spun “dramatically more than Conservative Party leader, Stephen Harper, and the New Democratic Party leader, Jack Layton.” Harper, needless to say, won the election.

Oh, and in case you’re interested, Shakespeare used the word “black” 174 times in his oeuvre, according to Open Source Shakespeare, and “white” only 148, 15% fewer occurrences. Clearly a story there.

Press 4 To Give Us All Your Money

I guess it had to happen: phishers are not only trying to snag you by setting up fake banking websites; now they’re trying to snag you by setting up fake switchboards too.

Tim McElligott writes in Telephony Online that scammers “posing as a financial institution and using a VoIP phone number e-mailed people asking them to dial the number and enter the personal information needed to gain access to their finances.” Simply put, the phishers in this case aren’t directing you to a fake website where you enter your password and other data sufficient for them to empty your account; they’re directing you to an automated phone service, where you’d give the same details.

The information comes from Cloudmark (“the proven leader in messaging security solutions for service providers, enterprises and consumers”), which claims in a press release that it has seen two separate such attacks this week:

In these attacks, the target receives an email, ostensibly from their bank, telling them there is an issue with their account and to dial a number to resolve the problem. Callers are then connected over VoIP to a PBX (private branch exchange) running an IVR [an automated voice menu] system that sounds exactly like their own bank’s phone tree, directing them to specific extensions. In a VoIP phishing attack, the phone system identifies itself to the target as the financial institution and prompts them to enter account number and PIN.

As Telephony Online points out, setting up this kind of phone network is easy. “Acquiring a VoIP phone number is about as hard as acquiring an IP address or a domain name,” it quotes Adam O’Donnell, senior research scientist at Cloudmark, as saying. “Phishers figured out how to quickly and fraudulently get that information a long time ago.” Take an old PC with a voice modem card, add a little PBX software, and you’ve got a company phone tree that can sound exactly like your bank’s, O’Donnell says.

This all makes sense. Indeed, we should have seen it coming. It’ll be interesting to see how banks cope with this. Until now their argument has been that, if in doubt, a customer should phone them. That is no longer as watertight an option. They could argue that customers should not respond to any email they receive, but that’s not always practical either: banks and other financial institutions need to communicate with customers.

One solution to this is the signature: Postbank last month launched a service where all its emails to customers come with an electronic signature. The only problem with this is that most email clients don’t support the service — only Microsoft Outlook. This is a bit like giving customers a lock that only works on certain kinds of door.

Perhaps banks are just going to have to pick up the phone. If customers are now under threat from automated phone trees maybe the solution is not more technology, but less? A cost the phishers are unlikely to be able to bear would be an actual voice on the other end of the line that sounded familiar and authentic. The only question then would be for the customer to establish the authenticity of the banking assistant.

Cupid’s (Possibly) Poison Arrow

Could Valentine’s Day be a phishing day? Internet Security Systems, Inc. reckons so, saying in a press release (no URL available yet) that the number of dating sites across the world has increased by 17 per cent within the last twelve months. ISS reckons this rise “is partly attributed to the increase in malevolent websites used by developers of malicious code as an opportune moment for phishing, spam and hacker attacks on unsuspecting victims.”

Having said that, there doesn’t seem to be a lot of strong evidence presented to back this claim up. “Organised criminal units have in the past timed their attacks to coincide with popular celebration occasions in order to achieve maximum success in compromising the integrity of computer systems,” the press release quotes Gunter Ollman, Director of X-Force at Internet Security Systems. “It is anticipated that Valentine’s Day is a day that is similarly marked on the criminals’ calendar for targeted attacks.” Makes sense, but isn’t this a tad alarmist? Should we ignore every Valentine card we get (assuming we get any)?

ISS offers the usual suggestions about defending yourself from these poisoned Cupid arrows, as well as pointing out that it can provide its own solution, via a “Proventia Web Filter which blocks unwanted web content, optimises Internet access for employees and prevents any kind of non work related Internet use.” Yes, of course. Ye olde “press release as pitch posing as public service ad” trick.

Given that Internet Security Systems, Inc. has been, according to its own blurb, “an established world leader in security since 1994”, I guess I’d expect to see a bit more hard data to back up this kind of scaremongering. It’s not that I don’t believe that scumbags will use Valentine’s Day as a social engineering tool to pry open your gullibility, but I’m not sure security companies should just throw out warnings like this without more carefully calibrated data to justify them. Where is all the data about previous years’ attacks along these lines? Where are the examples to illustrate the problem, and the sophistication of the bad guys? What kind of data are they after? We deserve to be told if we’re going to bin potentially our only chance at happiness.

Firefox Moves To Mass Market?

NetApplications, a ‘leader in Web-based applications that measure, monitor and market Web sites for the Small to Medium Enterprise (SME)’, says (no permalink available) that Firefox “continues to sway users away from Microsoft’s Internet Explorer”.

Firefox reached 8% during the month of May up from 7.38% in April. Firefox’s gain is Microsoft’s loss whose base dipped to 87.23% in May down .77% from April of 2005. Safari also gained a modest tenth of a percentage posting 1.91% in May 2005. Most other browsers experienced little change during the same time period.

NetApplications says IE is losing “an average of .5 to 1% loss of users each month.”  Notes Dan Shapero, Chief Operating Officer of NetApplications: “FireFox is gaining traction with early adopters and its popularity and adoption rate are starting to tap into mass-market acceptance as buzz continues to build.”

May 2005 Browser/Market Share:

  • Microsoft Internet Explorer – 87.23%
  • Firefox – 8.06%
  • Netscape – 1.64%
  • Safari – 1.91%
  • Mozilla – 0.58%
  • Opera – 0.51%
  • Other – 0.07%

The data was collected from over 40,000 Hitslink.com-monitored global Web sites.
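A quick sanity check on those numbers, as a rough sketch. The variable names are mine, the 7.38% April figure for Firefox comes from the release, and reading “down .77%” as percentage points is my assumption, so treat the implied April figure for IE accordingly.

```python
from decimal import Decimal

# May 2005 shares exactly as listed above.
may_2005 = {
    "Microsoft Internet Explorer": Decimal("87.23"),
    "Firefox": Decimal("8.06"),
    "Netscape": Decimal("1.64"),
    "Safari": Decimal("1.91"),
    "Mozilla": Decimal("0.58"),
    "Opera": Decimal("0.51"),
    "Other": Decimal("0.07"),
}

# Do the shares account for the whole market?
total = sum(may_2005.values())
print(total)

# Firefox's month-on-month gain, in percentage points.
firefox_gain = may_2005["Firefox"] - Decimal("7.38")
print(firefox_gain)

# IE's implied April share, assuming the .77 drop is in percentage points.
ie_april = may_2005["Microsoft Internet Explorer"] + Decimal("0.77")
print(ie_april)
```

The list does add up to 100%, and Firefox's gain works out to 0.68 points, a little less than the drop reported for IE.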

Do Viruses Really Cost This Much?

Mi2g, the British-based security consultancy that seems to court controversy and a fair amount of ridicule, has issued a press release (it doesn’t seem to be up yet) that is likely to prompt similar reactions: “USD 166 billion malware damage in 2004”, the headline reads:

The total economic damage from malware – viruses, worms and trojans – in 2004 is estimated to lie between USD 169 billion and USD 204 billion, making 2004 the worst year on record by a wide margin according to the mi2g Intelligence Unit, the world leader in digital risk. 2003 did not log even half of the malware economic damage figures attributable to 2004. With an installed base of around 600 million Windows based computers worldwide, this works out roughly as average damage per installed machine of between USD 281 and USD 340.
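The per-machine figure is simple division, and it is easy enough to check. A minimal sketch, using only the numbers quoted in the release; the function name `damage_per_machine` is mine.

```python
# Installed base of Windows machines, as stated in the press release.
INSTALLED_BASE = 600_000_000

def damage_per_machine(total_usd: float, machines: int = INSTALLED_BASE) -> float:
    """Spread a total damage estimate evenly across the installed base."""
    return total_usd / machines

low = damage_per_machine(169e9)   # lower bound of the estimate
high = damage_per_machine(204e9)  # upper bound of the estimate
print(f"USD {low:.2f} to USD {high:.2f} per machine")
```

So the USD 281 and USD 340 in the release follow from its own range (the lower figure rounded down), whatever one makes of the range itself.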

Certainly viruses and worms are damaging computers, business and nerves but I’m not sure it stretches to $300 billion. That is the same as (from a quick search of recent news articles):

So I guess it’s not impossible. But it seems to be a bit over the top. Mi2g says it calculates damages “on the basis of helpdesk support costs, overtime payments, contingency outsourcing, loss of business, bandwidth clogging, productivity erosion, management time reallocation, cost of recovery and software upgrades. When available, Intellectual Property Rights (IPR) violations as well as customer and supplier liability costs have also been included in the estimates.” You could pretty much throw any old figures in there.

I would agree with them, however, when they point to the recent “proliferation of Bagle malware variants worldwide” as a sign that, like last year, “there could be a choppy cyber-sea ahead, made all the more complex by new and more dangerous malware families that are yet to emerge.” It may not be costing quite the equivalent of a major war, eradicating global poverty or how much Americans spend on sneakers and baseball games, but a virus sure can muck up your day.

Ukraine Weighs In On The Search Stakes

Another addition to my index of indexing programs: diskMETA, from <META> Inc. “the largest search engine provider in Ukraine and a leader in Cyrillic multilingual search engine morphology technologies”.

A press release issued today says diskMETA is one of the fastest desktop search engines, and is available both as freeware and shareware. The program “is intended for extra large data volumes, UP TO 100 GIGABYTES. It can create up to 100 indexes, index up to ONE MILLION various files. The search time is never more than ONE SECOND”. It works on all Windows platforms (98 or higher).

The file search works with Office document formats (DOC, XLS, RTF, TXT), HTML pages, CHM, PDF files, ZIP and RAR archives. There are three versions: Lite (free), Personal ($50) and Pro, which supports morphological English searches and intranet-wide searches ($100).

The search technology used in diskMETA, apparently, “has a long and glorious history. It is used for a decade in the nationwide biggest and most popular web search engine www.meta.ua, in a series of search tools for web-sites and CD-rooms installed in most governmental and financial national institutions” in the Ukraine.

My tupennies’ worth? It’s fast, intuitive and unfussy. You can also view the raw text in a special preview window, but it doesn’t support preview in the same way that X1, dtSearch or the new Copernic Desktop Search do. That said, it’s great to see a new player on the block, especially one so enthusiastic.

McAfee Comes Late To Rev. Bayes’ Party

McAfee seems to have come somewhat late to the spam party: Network Associates, Inc., ‘the leader in intrusion prevention solutions’, today announced that it has incorporated “powerful new Bayesian filtering into the latest McAfee SpamAssassin engine”. What, only now?

Bayesian filtering is a pretty powerful weapon in the war against spam. I use POPFile and K9 and would recommend either, not least because they’re free. But why has it taken so long for McAfee to get around to including it in their SpamAssassin product?

To be fair, the McAfee Bayesian filter is “fully automated in its learning abilities, whereas other competitive solutions require manual training by users or systems administrators”. That is an improvement, but I wonder how well it works.
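For anyone wondering what “training” a Bayesian filter means, here is a toy version. It is a minimal naive Bayes sketch with add-one smoothing, not how McAfee, POPFile or K9 actually do it: the class name `TinyBayes` and the four training messages are invented for illustration, and it assumes at least one message has been trained for each label.

```python
import math
import re
from collections import Counter

class TinyBayes:
    """Toy naive Bayes spam filter with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def train(self, text, label):
        """Record the words of one message under 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokens(text))
        self.msg_counts[label] += 1

    def score(self, text, label):
        """Log of P(label) times the product of P(word | label)."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_msgs = sum(self.msg_counts.values())
        logp = math.log(self.msg_counts[label] / total_msgs)
        for word in self.tokens(text):
            seen = self.word_counts[label][word]
            logp += math.log((seen + 1) / (total_words + len(vocab)))
        return logp

    def classify(self, text):
        """Pick whichever label gives the message the higher score."""
        return max(("spam", "ham"), key=lambda label: self.score(text, label))

# Hypothetical training mail: this is the "manual training" step.
mail_filter = TinyBayes()
mail_filter.train("cheap viagra buy now", "spam")
mail_filter.train("buy cheap pills now", "spam")
mail_filter.train("meeting notes for monday", "ham")
mail_filter.train("lunch on monday", "ham")

print(mail_filter.classify("buy cheap viagra"))  # spam
print(mail_filter.classify("monday meeting"))    # ham
```

The calls to `train` are the part a user normally does by hand, one message at a time, and the part McAfee says it has automated.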

SpamKiller/Assassin also includes some other features, including Integrity Analysis, which applies algorithms to determine if the email is spam, Heuristic Detection, Content Filtering, Black and White Lists and DNS-Blocklist Support.

News: More Spam Tricks

I don’t feel like I’ve passed on anything about spam for at least half an hour so here goes. ActiveState, “the leader in enterprise email management software”, has released an ActiveState Field Guide to Spam, which details advanced tricks used by spammers to hide their messages from spam filters.

Regular readers of this blog (or folk who spend their weekends inspecting spam) will be familiar with most of these tricks, but it’s an education nonetheless. However, I am beginning to think that however clever spammers are, there’s a point beyond which it’s just not worth the effort for them. That’s when we all get Bayesian filters running and tune them. The only spam I worry about these days is press releases like this one from ActiveState. I swear it’s taken me longer to find the right link to their website than it would take to clean out the one or two bits of spam that get past my spamblocker (POPFile, in case you haven’t been paying attention). Or am I missing something?

News: Spam Stats Galore

If there’s one thing we’re not short of, it’s spam stats. Here are two more, fresh from the PR newswire:
 
Clearswift, “the world leader in managing and securing electronic communications” (I’ll be honest, I hadn’t heard of them until today), has this week launched a Spam Index, in which it has found that “in contrast to recent reports that have suggested pornographic spam constitutes 60-80 percent of spam, Clearswift’s Spam Index shows that pornographic spam is found only 22 percent of the time. Instead, the largest proportion of spam – 23 percent – was distributed by companies selling direct goods.”

Also, a study released the same day by The Radicati Group Inc., “a leading independent market research firm”, found that email traffic has grown 80% over the past year, most of which it blames on spam, which it said represents 24% of total corporate email traffic.
 
Email size, it says, is also on the rise.  Larger and more frequent use of attachments are the primary culprits for this trend. The full press release is only available in Acrobat PDF format.
 
My tuppence: Radicati’s figure for total spam proportion seems way too low. And while I’d agree with Clearswift that porn does not dominate spam — I’m not sure where they got their figures, but their website press release headline blames a “sensationalizing media” for it — there seems to be a reason to be somewhat suspicious of their motives for telling us all this. Telling is a paragraph on their website press release that offers a spin on things:
 
Although it only takes one pornographic email to cause offence and land an organization in litigation for harassment, the level of unsolicited email that falls into the “healthcare” and “direct goods” categories suggests the problem of filtering spam is more complex than simply blocking profane and pornographic emails. Deciding whether or not an email is spam ultimately comes down to whether or not it is the result of a well executed and highly targeted email marketing campaign. The ability to deploy flexible spam filtering solutions that can take into account personal preferences will be vital in the fight against spam.
 
To be frank I’m not sure what this means. I think it means: not all spam is spam, some of it is a “well executed and highly targeted email marketing campaign”, and good spam filtering solutions deployed by corporates shouldn’t block all of it because some people might want this stuff in their inbox. I would have thought a company would want to keep out any junk that’s not specifically requested by an employee, especially if it’s for anti-ageing cream or Viagra. Odd, very odd. Can anyone explain this?