[Corpora-List] EmoText - Software for opinion mining and lexical affect sensing
Alexander Osherenko
osherenko at gmx.de
Sat Dec 17 10:14:47 UTC 2011
I didn't invent this world. I only observe it. A sentiment analysis
system need not be perfect, but it must be comprehensible. The main point of
my contribution was to announce a real-life sentiment analysis system
that can be applied to real scenarios. The system relies on
scientific findings from my PhD thesis; it works, and the results are not that
bad. They could be better, and in the thesis I also describe how they can be
improved. It's up to you: you can ignore them and lose time.
To my knowledge, the statistical engine summarizes many findings on
statistical processing that I read about while preparing my thesis
and adds much more. The same applies to the semantic engine. With some
imagination the issues of vague interpretation can be explained or
neglected. I also described a hybrid approach and a fusion approach that
combine both engines and can be considered in future.

AO
2011/12/17 Jordi Carrera Ventura <excellens at gmail.com>
> I share most commenters' observations, although not necessarily some of
> the criticism. Even if the demo does not live up to its own marketing
> claims, I wouldn't take that to be a reason to further bash it but rather a
> reason not to take it too seriously in the first place. In my opinion, no
> user who tries the system (which is right there for anybody to judge for
> themselves), should be misled by any amount of marketing.
>
> I agree with Justin and Amanda's linguistic analysis, but I'd contest what
> Amanda gives as an example of realistic data. Probably for clarity, she
> seems to have translated into a linguistically correct pair of juxtaposed
> sentences an original tweet which, as such, must have looked like
>
> "sh*t!!!!! i left my iphone on t bus - im f***** wihtout iiiiiit!!!!! :("
>
> (I am hypothesizing so I may have got the transliterations wrong).
>
> Of course, good luck building anything resembling a syntactic tree from
> what could only be called a string of characters.
>
> The more general point behind my joke is that rarely does commercial
> sentiment analysis concern itself with achieving full, deep semantic
> understanding (fascinating as this may be from a theoretical standpoint).
> In many situations, reasonable business cases can be built on the basis of
> detecting *potentially* negative utterances, which is a far less daunting
> challenge and an application for which there seems to be a market. Many
> corporations find it satisfactory to spot crises before they happen rather
> than being told a posteriori with utmost confidence and detail what
> particular level of hatred they have inspired in their customers. Even if
> that implies some number of false positives, PR staff are mainly concerned
> with true positives, which they'll get by maximizing recall. In principle,
> precision only has to be high enough to filter clearly irrelevant
> expressions (normally the majority), which is generally true assuming
> lexical resources have been built in a balanced, domain-aware way.
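
A minimal sketch of the recall-first trade-off described in the paragraph above. The scores, labels, and thresholds are invented for illustration, and `precision_recall` is a hypothetical helper, not part of EmoText or any system mentioned in this thread. It only shows that lowering the flagging threshold can raise recall at the cost of some false positives.

```python
# Hypothetical classifier scores: estimated probability that an utterance is negative.
# Labels: 1 = truly negative, 0 = not negative. All values are made up.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]


def precision_recall(scores, labels, threshold):
    """Flag every utterance scoring >= threshold, then compute precision and recall."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# A strict threshold misses one truly negative utterance.
strict = precision_recall(scores, labels, 0.7)
# A lenient threshold catches all of them but lets one false positive through.
lenient = precision_recall(scores, labels, 0.35)

print("strict :", strict)
print("lenient:", lenient)
```

On this toy data the lenient threshold flags every truly negative utterance and admits one false alarm, which is the kind of trade a PR team monitoring for crises might accept.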
>
> On the other hand, systems able to correctly deal with Amanda's tweet are
> becoming increasingly common, at least based on anecdotal evidence and my
> own personal experience (I could be wrong, of course). So, in her example,
> if monitoring e.g. "iphone", that noun does not seem to be an argument of any
> head (or a head of any modifier) likely to have been assigned a particular
> sentiment value, which should rule it out as an instance of sentiment
> regarding the iPhone.
>
> Overall, however, I completely agree it's an open area of research with
> great challenges.
>
>
> Jordi
>
> On Dec 16, 2011, at 9:21 PM, Amanda Schiffrin wrote:
>
> I think Justin has hit the nail on the head here. I worked on an attempt
> to develop a sentiment detection module for a text analytics software
> system in my previous job, and I soon realised that once you start working
> with real data, both statistical and grammatical ('semantic') approaches
> will fail. You need a more complex model of information in order to be
> able to understand that a tweet such as "Bummer, I left my iPhone on the
> bus - I'm lost without it :-(", despite containing only indicators of
> negative sentiment at the lexical level, still expresses high positive
> sentiment toward the *product*. Being able to distinguish this kind of
> sentiment is one of the main drivers of commercial sentiment detection, and
> I'd say we're still a very long way away from anything like that level of
> sophistication.
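
A minimal sketch of why purely lexical scoring goes wrong on the tweet quoted above. The lexicon entries and the `lexical_polarity` function are hypothetical and are not taken from EmoText or any real resource. The sketch simply sums word-level polarities, which is one naive reading of a "lexical level" approach.

```python
import re

# Toy polarity lexicon with invented entries (assumption for illustration only).
LEXICON = {"bummer": -1, "lost": -1, ":-(": -1, "happy": 1}


def lexical_polarity(text):
    """Sum the polarity of every lexicon entry found in the text."""
    # Match the sad emoticon first, then plain lowercase words.
    tokens = re.findall(r":-\(|[a-z']+", text.lower())
    return sum(LEXICON.get(tok, 0) for tok in tokens)


tweet = "Bummer, I left my iPhone on the bus - I'm lost without it :-("
score = lexical_polarity(tweet)
print(score)
```

The score comes out negative because every matched cue is negative, yet nothing in the sum says which entity the sentiment is about. Recognising that the writer values the product needs information beyond the word level.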
>
> Mandy Schiffrin
>
>
> On 16 December 2011 20:24, Justin Washtell <lec3jrw at leeds.ac.uk> wrote:
>
>> "I would be very sad if this movie did not win a prize."
>> high_neg
>> "I'm very happy that the other reviewers have seen this movie for what it
>> is: rubbish."
>> high_pos
>>
>> Rather than (unfairly) singling out this system, I think these examples
>> serve to highlight that this is a very difficult (if not impossibly
>> ill-defined) problem. One cannot just assess the polarity of a statement -
>> one needs to know something about what the object of interest is. In the
>> above cases we are probably interested in [the writer's opinion of] the
>> movie... but that fact is of course *pragmatic* information.
>>
>> I'm out of my depth now, so I'll say no more :-) No doubt much has been
>> written on these issues.
>>
>> Justin Washtell
>> University of Leeds
>>
>> ________________________________________
>> From: corpora-bounces at uib.no [corpora-bounces at uib.no] On Behalf Of Angus
>> Grieve-Smith [grvsmth at panix.com]
>> Sent: 16 December 2011 17:25
>> To: corpora at uib.no
>> Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
>> lexical affect sensing
>>
>> On 12/16/2011 9:01 AM, Alexander Osherenko wrote:
>> > You didn't test the approach for complex sentences. I always used the
>> > example "I am very sad if ..."
>>
>> I don't want to nitpick, but that's not a very nativelike example
>> for a test sentence. I've only heard English speakers use "I am very
>> sad if ..." in habitual or generic contexts, and even then "I get very
>> sad when ..." is much more common. "I would be very sad if ..." is also
>> used. Maybe check your test sentences against COCA or something?
>>
>> --
>> -Angus B. Grieve-Smith
>> grvsmth at panix.com
>>
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora