[Corpora-List] EmoText - Software for opinion mining and lexical affect sensing

Michal Ptaszynski ptaszynski at media.eng.hokudai.ac.jp
Fri Dec 16 17:40:27 UTC 2011


Hi Alexander,

First of all, congratulations on having the courage to present the demo.
There aren't many demos of this kind around, and I think we should
appreciate it.
However, I agree with Justin that you should put the demo on a more
neutral webpage. If I were your potential client, the first thing I would
do is enter some rubbish (flavored with some vulgarities, maybe) to
find the first example that lets me say "A-ha! It was crap after all!"
The faster one can find bad examples, and the more of them there are, the
faster they will exit your website.

> I don't think that the work in semantic affect sensing is some day so  
> good that you can say "there is nothing to do anymore".

I think this applies to any field, but I also think we will all agree
that the demo results are still far from good. We can assume
you use more resources in the full version, but if you do, you could give
some examples on the webpage of sentences that are properly processed by
the full version but fail in the demo.

Also, a few words on the nomenclature. Why do you call this "semantic"?
Does, in your opinion, using a lexicon already make it semantic? Or did
you just want to make a clear distinction from the "statistical"?

Next nomenclature-related thing is the phrase "affect sensing". It is  
perhaps a candidate for a separate discussion, but let me make the point.
It is a phrase used first by Hugo Liu, perhaps because he did not have a  
better word for it. It seems most researchers that undertook similar  
research after him just wanted to stick with the same name and not many  
people try to clear up the nomenclature.
In the process of "sensing" what you need is a "sensor". Now, when you try  
to talk about "lexical affect sensing" the ground becomes a bit slippery,  
because it is not clear what would be the sensor in the case of language.  
Here is an example list of sensors.
http://en.wikipedia.org/wiki/List_of_sensors
All of them share two features. They get activated when something happens
(1/0), and they are separate from the systems that analyse the I/O of the
sensors. So you could, say, use a skin conductivity sensor to measure
whether a person is sweating, but you will need a separate system to
analyse the "sweating=1" information as, e.g., "nervous".
So if you built a system for affect sensing based on one of the five
senses, the phrase "affect sensing" would be perfectly in place. But
language is not a kind of "sense" per se.
A good replacement phrase here is, e.g., "affect analysis". Do you think
cleaning up this nomenclature problem is necessary, or do you have no
objections to using the "affect sensing" phrase?


Best,

Michal

BTW, I'm sure you did your research, but you could try to compare your
results with SentiStrength.
http://sentistrength.wlv.ac.uk/
I mean comparing the available demos, not the systems per se in a research
paper, etc.


-------------------------------
From: Alexander Osherenko <osherenko at gmx.de>
Cc: "Corpora at uib.no" <corpora at uib.no>
To: Justin Washtell <lec3jrw at leeds.ac.uk>
Date: Fri, 16 Dec 2011 17:03:46 +0100
Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
lexical affect sensing

Hi Justin,

thank you for your suggestion. Before I rewrite the contents for
inexperienced users I wanted to show you the site and to hear your
comments.

I don't think that the work in semantic affect sensing is some day so good
that you can say "there is nothing to do anymore". Language is so
multifaceted that it is impossible. I can add more words such as "crap" or
"unimpressed", but that is not the issue.

What I am trying to sell is an approach and a method to enrich it. I can
say now what grammatical core to use -- a consumer has to do his part and
add more emotion words or phrases. In my opinion, nobody will ever be able
to sell a finished product, although many people would claim they are doing
so.

Alexander

2011/12/16 Justin Washtell <lec3jrw at leeds.ac.uk>
  Hi Alexander,

  Thanks for the very attentive and informative response. There is no doubt  
that this is a tough problem area, and I have absolute respect for anybody  
working towards a solution. Of course, without having yet read your  
publications I cannot fully appreciate the gains that you have made. But I  
think therein lies the point I was really alluding to...

  The demos as they presently stand seem at odds with the very strongly
worded "sales pitch" of your site. Partly perhaps because they are
difficult for the non-technical user to understand, and partly perhaps
because they are so easily thwarted. No doubt this performance is
indeed hampered by some of the limitations you have just identified, but
you do present them as a product rather than as a work-in-progress.

  Might it be better to put these particular demos on a more honest and
informative site aimed at a technical/academic audience (i.e. where you
can freely acknowledge their limitations)? And then - while I am
somewhat loath to encourage it - you could conjure up something that is a
little more "on rails" for your business site?

  Justin Washtell
  University of Leeds

  ________________________________________
  From: osherenko at gmail.com [osherenko at gmail.com] On Behalf Of
  Alexander Osherenko [osherenko at gmx.de]
  Sent: 16 December 2011 14:01
  To: Justin Washtell
  Cc: Corpora at uib.no
  Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
  lexical affect sensing

  Hello Justin,

  Thanks for your comments.

  1. Statistical demo.

  The goal is to measure the opinion of a reviewer for a particular movie
review. It is typical to measure opinion using a number of stars. You can
go to http://www.reelviews.net and see for yourself.

  The many partial results show what you would get if you extracted
particular features. For example, you can extract stylometric features and
get A stars, or you can extract lexical features and get B stars. The final
result (majority or average) is calculated from the partial results.
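
  The fusion step described above can be sketched roughly as follows. This
is only a minimal illustration, not the demo's code: the function names
fuse_majority and fuse_average, the tie-breaking rule, and the example star
values are assumptions made for the sketch.

```python
from collections import Counter
from statistics import mean


def fuse_majority(part_results):
    """Return the star rating predicted by most feature groups.

    Ties are broken by taking the lowest of the tied ratings
    (an assumption for this sketch, not a documented rule).
    """
    counts = Counter(part_results.values())
    top = max(counts.values())
    return min(stars for stars, n in counts.items() if n == top)


def fuse_average(part_results):
    """Return the mean of the per-feature star ratings, rounded to a whole star."""
    return round(mean(part_results.values()))


# Hypothetical per-feature results for one review (made-up numbers).
part_results = {"stylometric": 3, "deictic": 4, "grammatical": 4, "lexical": 2}

majority_stars = fuse_majority(part_results)
average_stars = fuse_average(part_results)
```

With these made-up inputs the majority vote gives 4 stars and the average
gives 3, which shows why the two fusion modes can disagree.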

  You can extract stylometric, deictic, grammatical, and lexical features
calculated on the basis of the review and analyze your review using, for
example, NaiveBayes. As a partial result you can also optimize the feature
space or fuse results using BayesNet.
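
  To make the feature-extraction-plus-classifier idea concrete, here is a
small self-contained sketch. It is not the demo's implementation: it uses
only lexical features (word counts), a hand-written multinomial Naive Bayes
with add-one smoothing, and four invented toy reviews. The names
lexical_features and NaiveBayes are assumptions for the example.

```python
import math
import re
from collections import Counter, defaultdict


def lexical_features(review):
    """Bag-of-words counts; stands in for one of the feature groups."""
    return Counter(re.findall(r"[a-z']+", review.lower()))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reviews, stars):
        self.class_counts = Counter(stars)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for review, star in zip(reviews, stars):
            feats = lexical_features(review)
            self.word_counts[star].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, review):
        feats = lexical_features(review)
        n_docs = sum(self.class_counts.values())
        best_star, best_logp = None, -math.inf
        for star, n in self.class_counts.items():
            total = sum(self.word_counts[star].values())
            logp = math.log(n / n_docs)  # class prior
            for word, count in feats.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                likelihood = (self.word_counts[star][word] + 1) / (total + len(self.vocab))
                logp += count * math.log(likelihood)
            if logp > best_logp:
                best_star, best_logp = star, logp
        return best_star


# Invented toy training data, only to make the sketch runnable.
reviews = [
    "a brilliant and moving film",
    "dull plot and terrible acting",
    "brilliant acting",
    "terrible and dull",
]
stars = [4, 1, 4, 1]

model = NaiveBayes().fit(reviews, stars)
```

Calling model.predict("brilliant film") returns the star class whose
training reviews best match the words of the new review. A real setup
would train on many reviews and add the other feature groups.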

  2. Semantic demo.

  I don't think you are doing something wrong. But you have to know: it is
only a demo and BTW I also want to learn something. :) This demo relies on
the theoretical findings of Leech and Svartvik's "A Communicative Grammar
of English" and has to be extended to analyze real-life utterances. Hence,
your examples are very helpful.

  I don't want to show how many words I use for analysis. Although I use
about 4000 words, it is not enough. "I'm fairly unimpressed" -- the word
"unimpressed" is not in the dictionary, which is why the result is "only"
neutral. You might want to try "It is not good" or "It is good" and its
variants if it is not too trivial for you. Big dictionaries are not the
issue because I can extend my dictionaries accordingly. I also didn't use
big slang dictionaries.

  In contrast, I want to show that combinations of negations, intensifiers,
and emotion words are sufficient to analyze affect. For example, in the
example "This demo is far from brilliant." the word "far" can be
considered a negation, and the combination <negation><emotion word>
yields the desired meaning. In another example, "couldn't be better",
there is something that concerns the comparative and has to be studied more
thoroughly in future.
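
  The combination rule described above can be sketched roughly as follows.
This is only a minimal illustration, not the EmoText implementation: the
tiny lexicon, the weights, the function names affect_score and classify,
and the rule that a negation stays pending until the next emotion word are
all assumptions made for the example.

```python
import re

# Illustrative lexicon only; words and weights are assumptions.
EMOTION_WORDS = {
    "good": 1.0, "brilliant": 2.0, "excellent": 2.0,
    "bad": -1.0, "terrible": -2.0, "sad": -1.0,
}
NEGATIONS = {"not", "no", "never", "far"}  # "far" as in "far from brilliant"
INTENSIFIERS = {"very": 1.5, "really": 1.5}


def affect_score(sentence):
    """Sum emotion-word weights, applying pending negations and intensifiers."""
    total = 0.0
    negated = False
    strength = 1.0
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in NEGATIONS:
            negated = not negated  # stays pending until an emotion word
        elif token in INTENSIFIERS:
            strength *= INTENSIFIERS[token]
        elif token in EMOTION_WORDS:
            value = EMOTION_WORDS[token] * strength
            total += -value if negated else value
            negated, strength = False, 1.0  # modifiers are consumed
    return total


def classify(sentence):
    """Map the score to a coarse polarity label."""
    score = affect_score(sentence)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, "It is not good" and "This demo is far from
brilliant." come out negative, while "I'm fairly unimpressed" stays neutral
because "unimpressed" is missing from the lexicon. The comparative case
"couldn't be better" is deliberately not handled here.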

  You didn't test the approach for complex sentences. I always used the  
example "I am very sad if ..."

  Best
  Alexander

2011/12/16 Justin Washtell <lec3jrw at leeds.ac.uk>
Hello Alexander,

  I tried both of your demos out of interest.

  For the first demo I used the default options (the movie reviews and  
Naive Bayes). I did not understand the output, or how it was supposed to  
relate to the various parts of the input (if indeed it is?)

  For the second demo I entered the following sentences and received the  
following classifications:

  This demo is terrible. low_neg
  This demo is no good at all. high_pos
  This demo is far from brilliant. low_pos

  This demo is excellent. low_pos
  This demo is not bad at all. low_pos
  Thid demo couldn't be better! low_neg

  Am I doing something wrong?

  I would presently dispute your claimed "undisputable advantages". I am  
not sure whether your intended customers - who are presumably not language  
technology experts - will require less or more convincing.

  Justin Washtell
  University of Leeds

  ________________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] On Behalf Of
Alexander Osherenko [osherenko at gmx.de]
Sent: 16 December 2011 08:46
To: Corpora at uib.no
Subject: [Corpora-List] EmoText - Software for opinion mining and lexical
affect sensing

  Dear all!

  Recently I made an announcement of a book about opinion mining and  
lexical affect sensing. In this contribution I would like to point you to  
the EmoText demo program that relies on the findings in this book. It was  
implemented for the European CALLAS project.

  The link is:
http://www.socioware.de/products.html

  I apologize for some advertising.

  Kind regards
  Alexander Osherenko
