[Corpora-List] EmoText - Software for opinion mining and lexical affect sensing
Alexander Osherenko
osherenko at gmx.de
Fri Dec 16 17:10:23 UTC 2011
John,
I agree with you, the difference is really impressive. The author (James
Berardinelli) meant 2 stars. In my opinion, you have to consider language
variance. In some cases, an author tends to use grammatical means to
express opinion; in other cases, it is not beneficial. In some cases an
author tends to use lexical means; in other cases, it is not beneficial. And
so on. To counteract this variance, we can use an aggregate vote: majority
or average.
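A minimal sketch of the two aggregate votes, assuming each classifier or
feature set returns either a categorical label or a star rating. The
function names and the sample values are hypothetical and are not taken
from EmoText.

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the label that occurs most often among the individual votes."""
    # Ties are not handled specially in this sketch.
    return Counter(labels).most_common(1)[0][0]


def average_vote(scores):
    """Return the arithmetic mean of the individual numeric ratings."""
    return mean(scores)


# Hypothetical votes from three classifiers for one review.
labels = ["neg", "neg", "neutral"]
stars = [2.5, 1.5, 0.5]

print(majority_vote(labels))
print(average_vote(stars))
```

A single wrong vote changes the majority only if it tips the balance, and
it shifts the average by at most its share of the total.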
How do I extract grammatical features? Leech and Svartvik consider 9-10
grammatical rules, for example, fronted negation (Not every house is so
beautiful!) or repetitions (It is a big big house). In data mining, if I
identify a pattern in a text that corresponds to a particular rule, I
increment the corresponding feature value. It doesn't matter much whether
this identification was correct -- the approach is robust since it relies
on an aggregate vote.
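The counting step can be sketched roughly as follows. The two regular
expressions are crude, hypothetical approximations of fronted negation and
repetition, not the actual rules or patterns used in EmoText; the sentence
splitter is equally naive.

```python
import re

# Hypothetical surface patterns, one per grammatical rule.
PATTERNS = {
    # Sentence starts with "not" (fronted negation).
    "fronted_negation": re.compile(r"^\s*not\b", re.IGNORECASE),
    # The same word occurs twice in a row (repetition).
    "repetition": re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE),
}


def grammatical_features(text):
    """Count, per rule, the sentences in which its pattern is found."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    counts = {name: 0 for name in PATTERNS}
    for sentence in sentences:
        for name, pattern in PATTERNS.items():
            if pattern.search(sentence):
                counts[name] += 1  # increment the corresponding feature
    return counts


print(grammatical_features(
    "Not every house is so beautiful! It is a big big house."
))
```

Patterns like these will sometimes fire on the wrong sentence; the
resulting counts are only inputs to the classifier, not decisions.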
Your semantic example. As I already mentioned, I don't extract slang words.
That's why the word "crap" in the example "This demo is a lot of crap" is
not in the dictionary and you get the meaning "neutral". In the second case,
"This demo gets rid of the crap", you get the meaning "low_neg" -- it is not
because "crap" magically got a meaning, but because you supply other words.
My engine uses words from 4 dictionaries: Wordnet-Affect with Dictionary of
Affect, positive/negative GeneralInquirer (GI), Levin verbs. Some
dictionaries define distinct emotion words such as good or bad; other
dictionaries, such as the negative GI, rely on lexical affinity, which
according to Pang means a particular emotional orientation of a word. In
your example, the resulting "low_neg" emotional meaning is calculated by
accident. The emotional meaning is expressed not by the word "crap", as
expected, but by the words "get" and "rid", which are taken from the
negative GI. I assume that according to GI these words most often express a
negative meaning. You can try an example containing only the word "rid" in
order to see the meaning.
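The effect can be illustrated with a toy lookup. The word set below is a
hypothetical stand-in that only mirrors the assumption stated above ("get"
and "rid" listed as negative, "crap" absent); it is not the content of the
GI, and the lemma map and the labelling rule are invented for illustration
rather than taken from the engine.

```python
import re

# Hypothetical stand-in for a negative word list; "crap" is deliberately absent.
TOY_NEGATIVE = {"get", "rid"}

# Hypothetical, minimal lemma map so that "gets" is looked up as "get".
TOY_LEMMAS = {"gets": "get"}


def matched_words(text, lexicon=TOY_NEGATIVE):
    """Return the lemmas of the text that are found in the lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    lemmas = [TOY_LEMMAS.get(token, token) for token in tokens]
    return [lemma for lemma in lemmas if lemma in lexicon]


def toy_label(text):
    """Label a text 'low_neg' if any lexicon word occurs, else 'neutral'."""
    return "low_neg" if matched_words(text) else "neutral"


for sentence in ("This demo is a lot of crap.",
                 "This demo gets rid of the crap."):
    print(sentence, matched_words(sentence), toy_label(sentence))
```

With this toy list, the first sentence matches nothing and the second one
matches only "get" and "rid", so the label comes from those two words.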
However, your example reminds me of the connection between semantic and
grammatical meaning. In this case, the word "rid" plays the role of a
negation, like the words "not", "never" or "except", and I think all such
words can be considered in the system dictionary. I have already discussed
a similar issue earlier in connection with implicit negation (
http://mailman.uib.no/public/corpora/2007-October/005412.html).
Alexander
2011/12/16 John F. Sowa <sowa at bestweb.net>
> Alexander,
>
> I tried the default movie review with both the naive Bayes and
> the SVM options. For most of the options, both versions evaluated
> the review as average (2.5 stars).
>
> But with the option "with a dataset containing grammar features",
> naive Bayes dropped to 1.5 stars, and SVM dropped to 0.5 stars.
>
> That's an impressive difference. Could you say something more about
> how the system uses grammar features to derive those results.
>
> In particular, it would be helpful to select some sentences from
> that default text for which the grammar features make a significant
> difference and say how that difference was derived.
>
> For the semantic demo, I tried the following two sentences:
>
> "This demo is a lot of crap." Rating: neutral.
>
> "This demo gets rid of the crap." Rating: low-neg.
>
> That's the reverse of what one might expect.
>
> John
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora