[Corpora-List] EmoText - Software for opinion mining and lexical affect sensing

Fri Dec 16 23:07:33 UTC 2011

I share most commenters' observations, although not necessarily all of the criticism. Even if the demo does not live up to its own marketing claims, I wouldn't take that as a reason to bash it further, but rather as a reason not to take it too seriously in the first place. In my opinion, no user who tries the system (which is right there for anybody to judge for themselves) should be misled by any amount of marketing.

I agree with Justin's and Amanda's linguistic analysis, but I'd contest what Amanda gives as an example of realistic data. Probably for clarity, she seems to have translated the original tweet into a linguistically correct pair of juxtaposed sentences. In its raw form, it must have looked more like

"sh*t!!!!! i left my iphone on t bus - im f***** wihtout iiiiiit!!!!! :("

(I am hypothesizing so I may have got the transliterations wrong).

Of course, good luck building anything resembling a syntactic tree from what could only be called a string of characters.

The more general point behind my joke is that commercial sentiment analysis rarely concerns itself with achieving full, deep semantic understanding (fascinating as this may be from a theoretical standpoint). In many situations, reasonable business cases can be built on detecting *potentially* negative utterances, which is a far less daunting challenge and an application for which there seems to be a market. Many corporations find it satisfactory to spot crises before they happen, rather than being told a posteriori, with utmost confidence and detail, what particular level of hatred they have inspired in their customers. Even if that implies some number of false positives, PR staff are mainly concerned with not missing true positives, which they'll get by maximizing recall. In principle, precision only has to be high enough to filter out clearly irrelevant expressions (normally the majority), which is generally achievable assuming lexical resources have been built in a balanced, domain-aware way.
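To make the recall-versus-precision trade-off concrete, here is a minimal Python sketch of a lexicon-based flagger for *potentially* negative utterances. It is only an illustration: the cue list `NEGATIVE_CUES` and the function names are made up for this example and do not describe EmoText or any commercial system.

```python
import re

# Hypothetical mini-lexicon of negative cues (an assumption for this sketch).
# A real resource would be far larger and built for a specific domain.
NEGATIVE_CUES = {"bummer", "lost", "sad", "rubbish", "hate", "broken"}


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def is_potentially_negative(text):
    """Flag the text if any token is a known negative cue.

    Deliberately permissive: one cue is enough, which favours recall
    over precision.
    """
    return any(token in NEGATIVE_CUES for token in tokenize(text))


def precision_recall(predictions, gold):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predictions, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

With this setup, Amanda's tweet is flagged because of "bummer" and "lost", even though the sentiment toward the product is positive. That is exactly the kind of false positive a recall-oriented monitoring workflow accepts.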

On the other hand, systems able to deal correctly with Amanda's tweet are becoming increasingly common, at least based on anecdotal evidence and my own personal experience (I could be wrong, of course). So, in her example, if one is monitoring e.g. "iphone", that noun does not seem to be an argument of any head (or the head of any modifier) likely to have been assigned a particular sentiment value, which should rule the tweet out as an instance of sentiment regarding the iPhone.
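The attachment test described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the dependency edges are written by hand rather than produced by a parser, the `SENTIMENT` lexicon is hypothetical, and coreference (the "it" in "lost without it") is not resolved.

```python
# Hypothetical sentiment lexicon mapping lowercase tokens to polarity values.
SENTIMENT = {"bummer": -1, "lost": -1, "love": 1, "great": 1}


def target_sentiment(target, edges):
    """Return sentiment values directly attached to the monitored target.

    edges is a list of (head, dependent) pairs of lowercase tokens.
    A value is collected when the target is a dependent of a
    sentiment-bearing head, or the head of a sentiment-bearing modifier.
    """
    values = []
    for head, dependent in edges:
        if dependent == target and head in SENTIMENT:
            values.append(SENTIMENT[head])
        elif head == target and dependent in SENTIMENT:
            values.append(SENTIMENT[dependent])
    return values


# Hand-written edges (an assumption) for
# "Bummer, I left my iPhone on the bus - I'm lost without it".
TWEET_EDGES = [
    ("left", "i"), ("left", "iphone"), ("iphone", "my"),
    ("left", "bus"), ("lost", "i"), ("lost", "without"), ("without", "it"),
]

# Hand-written edges (an assumption) for "I love my iPhone".
LOVE_EDGES = [("love", "i"), ("love", "iphone"), ("iphone", "my")]

print(target_sentiment("iphone", TWEET_EDGES))  # no sentiment-bearing link
print(target_sentiment("iphone", LOVE_EDGES))   # linked to "love"
```

In the tweet, "iphone" hangs off "left", which carries no sentiment value, so nothing is attributed to the product. In the second sentence it is the object of "love", so a positive value is returned.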

Overall, however, I completely agree it's an open area of research with great challenges.

Jordi

On Dec 16, 2011, at 9:21 PM, Amanda Schiffrin wrote:

> I think Justin has hit the nail on the head here.  I worked on an attempt to develop a sentiment detection module for a text analytics software system in my previous job, and I soon realised that once you start working with real data, both statistical and grammatical ('semantic') approaches will fail.  You need a more complex model of information in order to be able to understand that a tweet such as "Bummer, I left my iPhone on the bus - I'm lost without it :-(", despite containing only indicators of negative sentiment at the lexical level, still expresses high positive sentiment toward the *product*.  Being able to distinguish this kind of sentiment is one of the main drivers of commercial sentiment detection, and I'd say we're still a very long way away from anything like that level of sophistication.
> 
> Mandy Schiffrin
> 
> 
> On 16 December 2011 20:24, Justin Washtell <lec3jrw at leeds.ac.uk> wrote:
> "I would be very sad if this movie did not win a prize."                                        high_neg
> "I'm very happy that the other reviewers have seen this movie for what it is: rubbish."         high_pos
> 
> Rather than (unfairly) singling out this system, I think these examples serve to highlight that this is a very difficult (if not impossibly ill-defined) problem. One cannot just assess the polarity of a statement - one needs to know something about what the object of interest is. In the above cases we are probably interested in [the writer's opinion of] the movie... but that fact is of course *pragmatic* information.
> 
> I'm out of my depth now, so I'll say no more :-) No doubt much has been written on these issues.
> 
> Justin Washtell
> University of Leeds
> 
> ________________________________________
> From: corpora-bounces at uib.no [corpora-bounces at uib.no] On Behalf Of Angus Grieve-Smith [grvsmth at panix.com]
> Sent: 16 December 2011 17:25
> To: corpora at uib.no
> Subject: Re: [Corpora-List] EmoText - Software for opinion mining and lexical affect sensing
> 
> On 12/16/2011 9:01 AM, Alexander Osherenko wrote:
> > You didn't test the approach for complex sentences. I always used the
> > example "I am very sad if ..."
> 
>     I don't want to nitpick, but that's not a very nativelike example
> for a test sentence.  I've only heard English speakers use "I am very
> sad if ..." in habitual or generic contexts, and even then "I get very
> sad when ..." is much more common.  "I would be very sad if ..." is also
> used.  Maybe check your test sentences against the CoCA or something?
> 
> --
>                                -Angus B. Grieve-Smith
>                                grvsmth at panix.com
> 
> 
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
