Entropy! Entropy! They've all got it entropy!

Read this post entitled Entropy Gets No Respect over at The Archdruid Report. It sums up the way it is and why we don't talk about it. We need to face truths such as these collectively, more so now than ever before. A great post and well worth your time and consideration.

"The hard reality is that the minority of us who happened to have been born in a few powerful countries squandered half a billion years of stored photosynthesis to give ourselves a brief period of spectacular economic abundance, and by doing so, foreclosed the chance that anybody else would enjoy that same abundance in the future. Fossil fuels are not renewable resources in any time frame accessible to our species. Every barrel and ton and cubic foot of fossil fuel we use now is subtracted from the total available to our descendants; despite an orgy of handwaving, no other resource can provide anything approaching the glut of cheap abundant energy on which our lifestyles of relative privilege depend."

I'm sorry Dave. I'm afraid I can't do that...

In the post Dashboards, scorecards and sentiment I wrote about why I don't think computers can accurately assess the emotional meaning of a sentence. This article from The New York Times entitled Mining the Web for Feelings, Not Facts touches on how performing exactly this function is a "growing business". What's interesting to me is how often I hear it repeated that these algorithms are "70 to 80 percent accurate", often with the addendum that people only agree on the meaning of something 70 to 80 percent of the time. You are being invited to commit a logical fallacy, inasmuch as the implicit suggestion is that the algorithm is about as accurate as human-assessed sentiment. This isn't the case. The article does touch on this, highlighting the following:

"A quick search on Tweetfeel, for example, reveals that 77 percent of recent tweeters liked the movie 'Julie & Julia'. But the same search on Twitrratr reveals a few misfires. The site assigned a negative score to a tweet reading 'julie and julia was truly delightful!!' That same message ended with 'we all felt very hungry afterwards' — and the system took the word 'hungry' to indicate a negative sentiment."

In my experience, when the computer gets it wrong it gets it wrong in a way a human wouldn't. These monitoring companies are effectively saying that 20 to 30 percent of their data cannot be relied upon. Were the data to be assessed by people, the areas of disagreement would actually be highly useful, as there are probably interesting reasons why the disagreement was occurring, specifically to do with context that cannot be assessed by looking at one sentence.
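To make the "hungry" misfire concrete, here is a minimal sketch of a word-list scorer. It is an assumption for illustration only, not a description of how Twitrratr or any real product works; the word lists and the function name naive_sentiment are invented. It shows how counting words without context can push a cheerful remark into the wrong bucket.

```python
import re

# Hypothetical word lists, invented for this illustration only.
POSITIVE = {"delightful", "liked", "great", "lovely"}
NEGATIVE = {"hungry", "awful", "boring", "worst"}


def naive_sentiment(text):
    """Label text by counting listed words; context is ignored entirely."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# The two fragments quoted in the article, scored separately.
first = naive_sentiment("julie and julia was truly delightful!!")
second = naive_sentiment("we all felt very hungry afterwards")
print(first, second)
```

A person reads the second fragment as praise for a film about food. The word list cannot, which is the kind of error a human assessor would not make.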

Although this figure of 70 to 80 percent accuracy gets thrown around, I have yet to see a monitoring company that supplies a dataset that has been assessed by both its algorithm and a team of humans to prove that this measure of 'accuracy' is one that can be relied on. A set of results like this would also allow us to see where the computer and the human assessors disagree, which, given it's people's opinions we are actually trying to quantify, is something worth testing.
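The kind of check described above could be sketched roughly as follows, assuming you had the same mentions labelled once by an algorithm and once by people. The labels here are made up for illustration and the function name compare_labels is hypothetical. Alongside the raw agreement rate it computes Cohen's kappa, a standard statistic that discounts the agreement you would expect by chance, and it returns the positions where the two sets of labels differ so those mentions can be read in context.

```python
from collections import Counter


def compare_labels(algorithm, human):
    """Return (agreement rate, Cohen's kappa, indices where labels differ)."""
    if not algorithm or len(algorithm) != len(human):
        raise ValueError("need two non-empty label lists of equal length")
    n = len(algorithm)
    disagreements = [
        i for i, (a, h) in enumerate(zip(algorithm, human)) if a != h
    ]
    observed = (n - len(disagreements)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    a_counts, h_counts = Counter(algorithm), Counter(human)
    labels = set(a_counts) | set(h_counts)
    expected = sum(a_counts[l] * h_counts[l] for l in labels) / (n * n)
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return observed, kappa, disagreements


# Made-up labels for five mentions; not real data.
algorithm_labels = ["positive", "negative", "neutral", "positive", "negative"]
human_labels = ["positive", "positive", "neutral", "positive", "negative"]

rate, kappa, where = compare_labels(algorithm_labels, human_labels)
print(f"agreement={rate:.2f} kappa={kappa:.2f} disagreements at {where}")
```

The list of disagreements is arguably the more useful output: a single accuracy percentage hides exactly the mentions that are worth a second look.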

Most sentiment analysis systems place opinion in one of three buckets, either positive, neutral or negative. This sounds superficially plausible, but if you've ever looked at hundreds of mentions around a keyword or topic you quickly realise that this doesn't really fit with how people express opinions or have conversations. Combined with the lack of accuracy, the lack of nuance in these assessments reduces the value of these tools.

Many dashboards I've seen expect you to take the figures they provide as a given. If you go behind the percentages into the data you start to realise that you do not have a system on which to make reliable judgements, which is after all where the supposed value of these tools lies.

Scout Labs have a good post entitled How does sentiment work? And how accurate is it, anyway? that is worth reading, as they try to address these issues, which few other companies offering these services have even touched on. They mention the use of Mechanical Turk as a way of being able to assess sentiment using humans and point to a good paper on the problems with this way of doing things. My feeling is that for most of these companies the issue is that they are offering a volume service and that the only way to realistically process the vast amounts of data generated is to use a computer, which for now is an imperfect way of trying to provide what would be very useful information.

Making the news relevant...

One post I read just recently which seemed apropos the fact we are in silly season (how is it possible 'news' itself can be 'slow'?) was entitled: The 3 key parts of news stories you usually don’t get. It summed up the reasons why I feel nearly all mainstream news is a shambling failure whose extinction can't come soon enough. This is apposite given Associated Press not getting it in a very public way, leading on from equally questionable ideas coming from other old media quarters earlier this year.

Hiatus...

I've had a busy August which was dominated by getting married. This was a tremendously exciting and beautiful day; blogging has thus been far from my thoughts.

After the wedding we went to Cornwall for a few days, a part of the UK I hadn't visited before. We visited the typical tourist attractions, which I'm really pleased we did as both the Eden Project and the Lost Gardens of Heligan are amazing. Eden had shades of Logan's Run about it; I was expecting to be summoned to Carousel at any moment.

Cornwall is geared up for tourism. There are B&Bs everywhere, though booking ahead is essential during the summer months. We stayed at The Avalon guest house in Tintagel and The Chapel guest house near St Austell, both of which were lovely and which I'd recommend without hesitation.

I would, however, mention that on our first night, shattered from a seven-hour drive, we ate at a place called The Olive Garden, which is just next door to The Avalon in Tintagel. Avoid. All of the worst food I've been served in this country has been at so-called Italian restaurants, perhaps because people who can't actually cook think it is an easy option.

Cornwall was lovely, but all the stories about getting there are true... it takes a day in itself.