[Corpora-List] Framework for EmoText

Alexander Osherenko osherenko at gmx.de
Sat Jan 14 09:30:07 UTC 2012


Dear Alexandre,

Thank you for your questions.

First, about the framework. Evidently, I did not succeed in explaining
the purpose of the InfoFramework. I implemented the framework to experiment
and to compose software prototypes in opinion mining. However, the
framework is NOT limited to opinion mining or lexical processing. It is
intended for statistical processing in general, in any domain, for example
acoustic or neurobiological processing. Moreover, the outcomes do not have
to be emotional or sentiment-based. The framework (hence the name) provides
a basis for experimentation and rapid prototyping and does not limit you
to particular classification algorithms. Since InfoFramework
is based on WEKA, you can use any of the available WEKA algorithms, for
example NaiveBayes, SVM or many others.
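
To illustrate what "not limited to particular classification algorithms"
can mean in practice, here is a minimal Python sketch of an experiment
harness with interchangeable classifiers. It is not InfoFramework or WEKA
code: the names (train_majority, train_nearest_centroid, run_experiment),
the two toy classifiers and the toy data are all assumptions made for
illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]           # (feature vector, label)
Predictor = Callable[[Sequence[float]], str]   # maps features to a label
Trainer = Callable[[List[Sample]], Predictor]  # builds a predictor from data


def train_majority(train: List[Sample]) -> Predictor:
    """Baseline: always predict the most frequent training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


def train_nearest_centroid(train: List[Sample]) -> Predictor:
    """Predict the label whose mean feature vector is closest."""
    grouped: Dict[str, List[Sequence[float]]] = {}
    for features, label in train:
        grouped.setdefault(label, []).append(features)
    centroids = {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

    def predict(features: Sequence[float]) -> str:
        def squared_distance(label: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
        return min(centroids, key=squared_distance)

    return predict


def accuracy(trainer: Trainer, train: List[Sample], test: List[Sample]) -> float:
    """Train with the given trainer and return the share of correct test labels."""
    predictor = trainer(train)
    hits = sum(predictor(features) == label for features, label in test)
    return hits / len(test)


def run_experiment(trainers: Dict[str, Trainer],
                   train: List[Sample],
                   test: List[Sample]) -> Dict[str, float]:
    """Run the same experiment once per classifier; only the trainer changes."""
    return {name: accuracy(trainer, train, test) for name, trainer in trainers.items()}


# Hypothetical toy data: two numeric features per sample.
train_data = [([1.0, 0.0], "pos"), ([0.9, 0.1], "pos"), ([0.8, 0.2], "pos"),
              ([0.0, 1.0], "neg"), ([0.1, 0.9], "neg")]
test_data = [([0.95, 0.05], "pos"), ([0.05, 0.95], "neg")]

results = run_experiment(
    {"majority": train_majority, "nearest_centroid": train_nearest_centroid},
    train_data,
    test_data,
)
print(results)
```

The data and the evaluation stay fixed while the classifier is passed in as
a parameter, so adding another algorithm means adding one more entry to the
dictionary.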

Now to your questions.

> which predicts a sentiment label (i.e., positive, negative or neutral)
> instead of an emotion. The approach that is currently available, though,
> is based on a dictionary of affect instead of being automatically learnt.
> I once compared EmoLib to a tool similar to yours:
>
> http://atrilla.net/index.php?article=blog&specific=35
>
> and I qualitatively found that Machine Learning approaches tend
> to perform better at the expense of having a poorer generalisation ability
> (domain-specific limitation according to the topic/s of the training text).
> Or put in a different way, affective-dictionary-based approaches perform
> more modestly but are more generalisable (thus avoiding domain transfer
> problems?). Do you have a similar feeling with respect to this?
>
I have exactly the same impression, and also much evidence for it. For
instance, we published the paper "Lexical Affect Sensing: Are Affect
Dictionaries Necessary to Analyze Affect?" (
http://edu.cs.uni-magdeburg.de/EC/lehre/sommersemester-2010/emotional-computing/informationen-zum-seminar/blog/annotation/AreAffectDictionariesNecssaryFulltext.pdf).
In my PhD, I already relied on these findings and used, for example,
stylometric, grammatical or deictic features. I also used Whissell's DAL as
a source of lexical features to prove my hypothesis: the results are much
poorer than those of opinion mining using the BNC frequency list. QED
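
As a rough illustration of the two kinds of lexical features mentioned
here, the Python sketch below builds one count vector from an affect word
list and one from a list of frequent words. The word lists are tiny toy
stand-ins, not excerpts from Whissell's DAL or the BNC frequency list, and
the function names are assumptions made for this example.

```python
import re
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text: str, vocabulary: Sequence[str]) -> List[int]:
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Toy stand-ins (assumptions), NOT taken from the DAL or the BNC list.
AFFECT_WORDS = ["happy", "awful", "love"]
FREQUENT_WORDS = ["the", "not", "was", "very", "but"]

review = "The plot was not bad, but the ending was very, very flat."

affect_vector = lexical_features(review, AFFECT_WORDS)
frequency_vector = lexical_features(review, FREQUENT_WORDS)

print(affect_vector)     # counts for happy, awful, love
print(frequency_vector)  # counts for the, not, was, very, but
```

In this toy case the review contains no word from the affect list, so the
affect vector is all zeros, while the frequent-word vector is not.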


> At least this is my mind and that's why I have not delivered a service
> with any of the learnt methods that EmoLib also implements (basically the
> Multinomial Naive Bayes, the Vector Space Model, LSA, Multinomial Logistic
> Regression and SVM). Which technique does EmoText use?
>
In my PhD, I tried to answer core data-mining questions such as the choice
of classifier, feature evaluation and so on. I compared the results of
NaiveBayes, SVM, and InformationGain with SVM. In my opinion, the choice of
classifier is not important. More important are the choice of features and
the explanation of the obtained results.
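
For readers unfamiliar with information gain as a feature-evaluation
measure, here is a minimal Python sketch of the scoring step only (the
subsequent SVM training is left out). The feature names and the four toy
documents are assumptions made for illustration; this is not the code used
in the experiments.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def rank_features(features: Dict[str, Sequence[Hashable]],
                  labels: Sequence[Hashable]) -> List[str]:
    """Feature names ordered from most to least informative."""
    return sorted(features,
                  key=lambda name: information_gain(features[name], labels),
                  reverse=True)


# Hypothetical toy data: four documents, two binary features.
labels = ["pos", "pos", "neg", "neg"]
features = {
    "contains_the": [1, 1, 1, 1],   # same value everywhere, tells us nothing
    "contains_not": [0, 0, 1, 1],   # separates the two classes perfectly
}

ranking = rank_features(features, labels)
print(ranking)
```

A feature that takes the same value in every document has zero gain, while
one that separates the classes perfectly has a gain equal to the entropy of
the labels. Keeping only the top-ranked features before training is one
common way to combine such a score with a classifier.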


> Moreover, EmoLib first splits the sentences of the input text, then
> predicts the sentence-wise sentiment labels independently, and finally
> draws the affective wash at paragraph level. Do you follow a similar
> approach in EmoText?
>
In my PhD, I considered splitting texts into sentences and provided results
for what you could call a hybrid approach, which combines the semantic and
the statistical approach to opinion mining (first classifying sentences,
then classifying longer texts such as paragraphs). However, the results are
worse than those of classification using only lexical features.
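
The two-stage idea (label each sentence first, then derive a label for the
longer text) can be sketched in Python as follows. This is only an
illustration under strong assumptions: the sentence classifier is a toy
word-list lookup, the aggregation is a simple majority vote, and neither is
claimed to be what EmoText or EmoLib actually does.

```python
import re
from collections import Counter
from typing import List

# Toy word lists (assumptions), standing in for a real sentence classifier.
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "poor"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify_sentence(sentence: str) -> str:
    """Label one sentence by counting toy positive and negative words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_paragraph(text: str) -> str:
    """Majority vote over the non-neutral sentence labels; ties give 'neutral'."""
    labels = [classify_sentence(s) for s in split_sentences(text)]
    votes = Counter(label for label in labels if label != "neutral")
    if not votes:
        return "neutral"
    (top_label, top_count), *rest = votes.most_common()
    if rest and rest[0][1] == top_count:
        return "neutral"
    return top_label


paragraph = "The food was great. The service was poor. The view was good."
sentence_labels = [classify_sentence(s) for s in split_sentences(paragraph)]
paragraph_label = classify_paragraph(paragraph)
print(sentence_labels, paragraph_label)
```

Any sentence-level classifier could replace classify_sentence here; the
aggregation step only sees its labels.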

Best
Alexander

> Thank you, indeed.
>
> Alex
>
>
> > Dear all,
> >
> > I put a brief description of my framework on the Internet that I
> > implemented in the context of the statistical EmoText (
> > www.socioware.de/technology.html#framework). You might notice some
> > resemblance with the WEKA Experimenter. Since I didn't have any brilliant
> > idea on how to name this framework, I called it simply InfoFramework.
> >
> > Although I tested the framework in the context of opinion mining, I assume
> > it can be used for any kind of statistical processing.
> >
> > Best
> > Alexander
> > _______________________________________________
> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
>
>
> --
> _________________________________________________
>
>  ALEXANDRE TRILLA
>  B.Sc., M.Sc. in Electronics, Telecommunications
>  Engineering and Information Technology
>
>  Email: alex at atrilla.net
>  Homepage: http://atrilla.net
> _________________________________________________
>
>
>