
Why Sentiment Analysis Sucks for Social Media Monitoring

March 31st, 2010

First off, thanks to Seth Grimes for getting so engaged in the discussion about this important topic. Before moving on, a few relevant references.

The below article is partially in response to: Is Sentiment Analysis an 80% Solution?

The original post that initiated the conversation: why sentiment analysis sucks for social media monitoring (attempt 1)

…which in turn was a response to a discussion that was ongoing at the time: Don’t Get Sentimental About Tools When Measuring Attitude.

What’s Sentiment Analysis Good For (in social media monitoring)?

The fundamental flaw in the number-based positive/negative approach to sentiment analysis is not in the maths, technology or practicality. It is in the fact that it starts from an assumption that people are something they’re not.

Every person’s life tends to happen at the same basic levels. We’re all a person with an idea of this fixed being, which we call me. Then we go about our lives experiencing things; these we call our first kiss or “ouch, I hurt my knee”. Sometimes we feel the need to express these experiences. That is what I’m doing right here, expressing myself.


Each of these levels (person, experience, expression) is a diluted version of the previous one. As a person we feel fixed and we feel ourselves, then within that we have an experience. The way we experience events is entirely dependent on our person. For example, when someone dents your car, it is entirely up to you how you react in that situation. If you’re indifferent about it, then there is no significant experience. You just take his details and get it fixed. Or you get angry and talk for days about how someone dented your car.

When you take your experiences and put them into words, they’re further diluted from the actual substance, the richness of human experience. The idea of being able to take human experience and fit it on a scale of 0-100 in terms of positive or negative is ridiculous.

When experiences are verbalized, a natural distortion happens. In a way, the experience itself is corrupted by the attempt to limit its richness to words. What sentiment analysis is trying to do is to say that it can capture the essence of the expression (and the experience and person behind it) and record it as a single numeric value.

As a consumer I may be someone who gets pissed off and expressive about bad experiences, but I’ll be the first to praise you when you redeem yourself. Or I could be someone who never says anything, good or bad. How is this accounted for in the current situation and direction of text analytics? Brands are not looking for instances, but relationships.

While I understand the usefulness of text analytics for answering yes/no questions in a closed domain with good preparation and proper customization, this is a very limited approach. I’m always more interested to know why people preferred that someone guided them personally instead of just giving directions, or how the ones who didn’t get personal guidance felt when they just got directions. The current approach to sentiment analysis at best offers limited answers to such questions.

The bottom line is that you can’t classify people, experiences or expressions on a scale of positive or negative. We are not that type of creature. There is no such thing as a situation that is totally positive or totally negative. Our relationships with brands are no different from the way we interact with life at large. Those relationships hold all the complexities and richness of our personalities, experiences and expressions.

The Human Factor

The fact that people don’t see things similarly in terms of positive or negative is no surprise at all. Classical philosophers knew this thousands of years ago; it is one of the underlying concepts in virtually every religion, philosophy or other system.

We can be affected by so many different things: weather, economics, relationships, time of day, medication. Attributes such as these are widely used in econometrics to model the actual situations in which commerce happens.

To further complicate things, there is the whole dimension of our relationship with ourselves, the way in which we understand and don’t understand our own personas, experiences and expressions.

If we instead take the other approach, in which I show 10 different people pictures of 10 angry people and 10 happy people, or I show 10 passionate people and 10 passive people, the situation becomes much more human. We’re that kind of beings: we get angry and happy, then we’re sad. That is the level at which we relate, with each other, with brands and with the world around us.

I’m a big fan of automation and have always believed that we should strive to automate everything we believe a machine can do better than us. The rest we leave for ourselves to do. The way net sentiment is utilized in social media monitoring is something I think should be left completely alone. At the level of net sentiment scoring, it is worth the time of neither human nor machine.

There is a better solution for both man and the machine in this situation. The fact that something was started 15 years ago in a certain way doesn’t necessarily mean it’s the best way. Our job is to make sure that we’re all open to whatever ways may be out there.

We all eventually want the same thing, so defending one’s convictions becomes a slippery slope. In Zen there is a saying: “In the beginner’s mind there exist many possibilities, in the expert’s mind there exist only a few”. After doing one thing for a really long time, I find this to be the most valuable guideline.

So instead of using our time defending the ivory towers of the text analytics industry and where it’s at now, let’s figure out where we can take it together!

In A True Spirit of Debate

Below are my responses to some of the arguments made in the post Is Sentiment Analysis an 80% Solution?

Test data about people agreeing on things with 80% accuracy has little to do with how and why a single system (social media monitoring technology) has a 20% error margin. It’s like comparing pears to bananas. The way these language systems work is that there is a set of rules as the base for everything, and there is plenty of secret sauce in all of this.

Nor does the example about InfoGlutton seem relevant. When it comes to language-based systems, success is all about teaching the system to work in the given environment (defining the rules). When you have a domain-specific system (restaurants) with a limited number of entities (below 100k), continuously optimizing the system is an option. But when you work in an open generic domain (the internet) and you have a virtually unlimited number of entities which produce an indefinite amount of unique content, tweaking the system becomes very problematic. Think of the difference between learning the 300 most common words in Spanish and internalizing all the great philosophies in their original languages.

All this being said, often when you start looking at things from two extremes, you’ll eventually find the golden middle way the most suitable. My hope is that we can do that by working together on directions that make the most sense for everyone.

Thanks so much for the chance to have this discussion, Seth, and thanks everyone for taking the time to read this through.

What Is Automated Sentiment Analysis Good For?

March 6th, 2010

If you’re coming from the post Is Sentiment Analysis an 80% Technology? you may want to continue directly to my post written partially in response to Seth:

Why Sentiment Analysis Sucks for Social Media Monitoring?

Or continue to read the original post that initiated the debate.

———

To save you the trouble of reading further (and a few minutes of your time), the answer is “not much”.

Leading providers such as Sysomos and Radian6 estimate their automated sentiment analysis and scoring system to be 80% accurate. That sounds quite good, right?

It doesn’t at all.

A 20% difference is statistically huge and comes with an array of problems. Think about a situation in which you’re comparing something with 59% positive sentiment to something that has 65% positive sentiment. That’s less than a 10% difference. In other words, whenever you’re looking at current data without a long historical trend as a reference (for example the Oscars), these types of numbers are completely useless. Almost exclusively, the numbers sentiment vendors provide have differences within the 20% range.

For example, if [A] has positive sentiment of 50% and [B] has positive sentiment of 60%, with 80% accuracy this could mean that [A] is anything between 40% and 60% and [B] is anything between 48% and 72%. The only thing this tells us is that statistically speaking [B] might be seen as more positive, but then again, it says the same thing about [A]!
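The arithmetic above can be checked with a few lines of Python. This is only a sketch of the reading used in this post, where "80% accuracy" is treated as a band of plus or minus 20% of the reported score; it is not a proper statistical confidence interval, and the function names `sentiment_range` and `ranges_overlap` are made up for illustration.

```python
def sentiment_range(score_pct, accuracy_pct):
    """Return the (low, high) band for a reported sentiment score.

    Assumption (this post's reading, not a vendor definition):
    an accuracy of 80% means the true score may lie anywhere within
    plus or minus 20% of the reported score, relative to that score.
    """
    error_pct = 100 - accuracy_pct
    low = score_pct * (100 - error_pct) / 100
    high = score_pct * (100 + error_pct) / 100
    return (low, high)


def ranges_overlap(first, second):
    """True when two (low, high) bands share at least one value."""
    return first[0] <= second[1] and second[0] <= first[1]


a = sentiment_range(50, 80)  # [A] reported at 50% positive
b = sentiment_range(60, 80)  # [B] reported at 60% positive

print("A:", a)
print("B:", b)
print("Bands overlap:", ranges_overlap(a, b))
```

Under this reading the bands for [A] and [B] overlap, so the reported 10-point gap cannot tell us which one is actually more positive. The same holds for the 59% versus 65% comparison mentioned earlier.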

From the graphic below you can see how the benchmark value with 59% positive sentiment can change the whole graph with just 20% variation. So it could be anything from completely positive to quite negative.

Possible scenarios with 20% variation

Statistics are wonderful for BS.

When you come from the analytics industry and start developing tools for the analytics industry, some things are clear from the get-go. The fact that customers have been taught to ask for sentiment scoring (thank you very much, early snake oil peddlers) doesn’t mean the vendors should invest R&D in it. When the ticker feature became available to websites in 96-97, everyone wanted it. Hmmm, come to think of it now, maybe that’s why we struggled with revenues back in the day; after all, we refused tickers.

To be a pioneer in an industry is a tremendous responsibility. It’s like teaching a child. If you teach that red is green and green is blue, then that is what the child will learn. If other adults reinforce this message, it becomes the common reality. If the vendors teach the market that sentiment is the way to go, then that is what the customers will expect, thus forcing future vendors into a situation where being competitive means doing sentiment better. In other words, wasting R&D resources on trying to fix something that is broken (sentiment analysis) instead of looking at what the customer is really looking for. The customer is always looking for the same thing: to make more money. So this model is quite simple.

Brands exist in order to make money; that is the harsh reality we live in. But it’s also a very workable reality from the vendor perspective. Sentiment is nice to know, but up until today I haven’t heard of a single commercial application for it. Commercial application in this case means an application that serves the purpose of earning more money for someone other than the vendor of the application.

The question remains:

“How does knowing the net sentiment score help me to drive more commerce?”