[Corpora-List] EmoText - Software for opinion mining and lexical affect sensing
iain
iain at idcl.co.uk
Wed Dec 21 14:23:16 UTC 2011
My own suspicion in terms of what Taras was saying is that the expense is in
making the classifier suitable for a domain or genre. It is quite clear
that something which can decode tweets may not work too well on blogs or
forums like this one!
Equally, a general classifier won't be as effective on a sub-language as the
one it's trained on. Scientific vitriol can be couched in terms which only
the very adept outsider will recognise!
Iain
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Michal Ptaszynski
Sent: 21 December 2011 13:35
To: corpora at uib.no; corpora-request at uib.no
Cc: ptaszynski at hgu.jp
Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
lexical affect sensing
Dear Taras, Iain
Dear All,
What Iain and Taras say is one of the best things I've heard lately, mostly
because it confirms my findings too. However, your experience is probably
based more on real-world examples.
If you could provide a proof of some kind, or a description of some
examples, this would be a very useful hint.
Just a word on "keeping classification cheap". I think this is not so much
about the money as it is about logic (and trying to find it).
For example, it is not much of a research contribution to just have, say,
100 people write a lot of rules. Even if a system created this way achieved
high performance, it's not too interesting from the scientific point of view.
What we, researchers, try to find is a kind of logical reasoning that could
be represented computationally. So, for example, if Mr. X has a
1000-rule system that gives him 85% accuracy, and Mr. Y has a
10-(general)-rule system that gives him 82% accuracy, a researcher would
be interested in Mr. Y's system first.
I think this applies to all fields that have their commercial variations.
For example, each year there are a number of papers on machine translation
reporting strong results, but the quality of the machine translation
software actually available on the market is rather low (as a former
translator I have tried about five different products).
Best,
Michal
---------------------
From: Taras <taras8055 at gmail.com>
To: corpora at uib.no
Date: Wed, 21 Dec 2011 10:43:47 +0000
Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
lexical affect sensing
Hi
I am a developer of one of the commercial tools, and I think there are two
major problems that prevent them from being more accurate:
1. They try to keep classification cheap. Cheap means generic, but the
only way of getting good sentiment accuracy is to make classifiers
specific, and that is expensive.
2. The other problem is the neutrality bias. Most texts are neutral or
balanced, which makes extracting the non-neutral ones very difficult. The
problem is not subjectivity classification or sentiment classification
taken separately, but the combination of the two.
Of course there are other problems: noisy language, various ways of
expressing sentiment etc. But the two aforementioned are the most
business-specific ones.
Regards,
Taras Zagibalov
On 21/12/11 09:52, iain wrote:
I've been following this thread with interest. I'm a commercial semi-lurker
rather than an involved theorist, but my colleagues and I have done some
work with some of the available commercial sentiment tools.
Our experience is that they are really not very accurate.
There are some issues with evaluating them. I'm not using a pre-marked gold
standard to score them, but rather submitting text from web pages to them
and comparing the output with the text, which makes our results far from
scientific! And we've done dozens not thousands.
What we tend to find is that we look at the output and then at the text and
more often than not say, 'uh'. Pretty much as the reviewers of Alexander's
test site have been doing! Which might make Alexander's work close to the
commercial state of the art :-J ....
Some of the reviews of commercial tools I've seen seem to indicate that if
you take the 'neutral' sentiment articles out then the actual accuracy drops
down from the claimed 70% quite considerably. In short, the tools are very
good at detecting no sentiment but rather poorer at getting actual sentiment
right.
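
The effect described above can be illustrated with a minimal sketch. The ten
labelled documents are invented for illustration and are not output from any
real tool, and the function names are hypothetical. The point is only that a
high overall accuracy can hide poor accuracy on the documents that actually
carry sentiment.

```python
def accuracy(pairs):
    """Fraction of (gold, predicted) label pairs that agree."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no examples to score")
    return sum(gold == pred for gold, pred in pairs) / len(pairs)


def accuracy_without_neutral(pairs, neutral="neutral"):
    """Accuracy restricted to documents whose gold label is not neutral."""
    return accuracy((g, p) for g, p in pairs if g != neutral)


# Invented example: 7 neutral documents and 3 opinionated ones.
gold = ["neutral"] * 7 + ["positive", "negative", "positive"]
pred = ["neutral"] * 7 + ["neutral", "positive", "positive"]
pairs = list(zip(gold, pred))

overall = accuracy(pairs)                      # 8 of 10 correct
non_neutral = accuracy_without_neutral(pairs)  # 1 of 3 correct

print(f"overall: {overall:.2f}, non-neutral only: {non_neutral:.2f}")
```

Reporting both figures side by side is one simple way to see how much of a
claimed accuracy comes from the neutral class.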
I was wondering if anyone on the list had experience with the commercial
tools and what sort of results they found. Could they recommend one or
another of the suppliers? I'd also be interested if any tool suppliers
(also commercially semi-lurking) might have some input to this - what are
their real expectations of quality?
Iain
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Justin Washtell
Sent: 20 December 2011 20:32
To: Alexander Osherenko; ptaszynski at ieee.org
Cc: corpora at uib.no; corpora-request at uib.no
Subject: Re: [Corpora-List] EmoText - Software for opinion mining and
lexical affect sensing
Michal and Alexander,
I thoroughly agree with Michal (and Graham) that these kinds of demo are a
good thing, and despite my - ongoing - criticisms, I'd like to take my hat
off to Alexander for sharing this work. There are already countless papers
describing technical approaches to this-and-that, and showing
impressive-looking results achieved upon [perhaps sometimes carefully
selected or tuned-to] test datasets. But I suspect that there's presently no
better way to get a feel for where the state-of-the-art really is (and to
shed some qualitative light on matters) than by complementing these works
with some inquisitive and unrestrained hands-on tinkering.
I tried a couple of reviews from Amazon. Among the different feature sets
from 1 to 6, one is always close to Amazon's ranking, but unfortunately it's
never one feature set in particular, but rather a random one of the six.
Apart from the closest method, all the others are usually reversed (e.g., if
the closest method gives 5 stars, all the others give 1). However, this might
just have happened for the couple of examples I tried (reviews of the Kindle
on Amazon).
Isn't that more-or-less what one would expect from random output?
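
As a rough check on that intuition, here is a small sketch under a strong
assumption that is mine and not taken from the system under discussion: each
of the six feature sets outputs a star rating independently and uniformly at
random from 1 to 5. The function name and parameters are hypothetical. It
computes the probability that at least one of the six lands within one star
of the true rating.

```python
from fractions import Fraction


def p_at_least_one_close(true_stars, n_raters=6, tolerance=1, scale=5):
    """Probability that at least one of n independent, uniformly random
    raters on a 1..scale star scale is within `tolerance` of the truth."""
    close = sum(
        1 for s in range(1, scale + 1) if abs(s - true_stars) <= tolerance
    )
    p_close = Fraction(close, scale)
    return 1 - (1 - p_close) ** n_raters


# For a true 5-star review, only 4 and 5 count as close: p_close = 2/5.
p = p_at_least_one_close(5)
print(p, float(p))  # 14896/15625, about 0.95
```

Under that assumption, "one of the six is always close" would be expected
roughly 95% of the time from chance alone, so the observation by itself does
not distinguish the system from random output.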
Social aspects. You have to consider that the reviews from Amazon are
composed by different authors that have their own style of writing.
Moreover, you have to consider different cultural background, for example,
Americans and Englishmen use different words to express the same things.
Goethe used different words than a truck driver does.
As a human, and an Englishman, I expect I can understand and fairly judge
the sentiment of most reviews written by, say, an American truck driver,
without undue reprogramming. Is this really an unrealistic goal for our
algorithms? And I wonder, is mastering a highly restricted style or register
a necessary step in that direction... or is it in fact a detour?
How can a classifier calculate a weight of a lexical feature if this
lexical feature is not present in the analyzed text?
By inferring from similarities between that feature and those that *are*
present (e.g. through semi-supervised learning/bootstrapping of unannotated
data)? That's at least one method about which a fair amount has been written
already. I'm not saying it's a solved problem, mind you, but perhaps you're
not up against a brick wall yet?
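
One very simplified way to picture that idea is sketched below. It is an
assumption-laden toy and not the method of anyone in this thread: each word
is represented by counts of the words it co-occurs with in unannotated text,
and an unseen word borrows a weight from known words in proportion to cosine
similarity. All names, vectors and weights are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(count * v.get(key, 0.0) for key, count in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def infer_weight(word, context_vectors, known_weights):
    """Similarity-weighted average of the weights of known features."""
    target = context_vectors[word]
    num = den = 0.0
    for other, weight in known_weights.items():
        sim = cosine(target, context_vectors[other])
        if sim > 0:
            num += sim * weight
            den += sim
    return num / den if den else 0.0  # 0.0 means "no evidence either way"


# Hypothetical co-occurrence counts gathered from unannotated text.
context_vectors = {
    "great": {"really": 2, "product": 1, "works": 1},
    "awful": {"waste": 2, "money": 1},
    "superb": {"really": 2, "product": 1},  # no weight known for this word
}
known_weights = {"great": 1.0, "awful": -1.0}

print(infer_weight("superb", context_vectors, known_weights))
```

Here "superb" shares contexts only with "great", so it inherits a positive
weight; a word with no overlapping contexts falls back to 0.0.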
Justin Washtell
University of Leeds
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora