[Corpora-List] Call for participation DEFT'09
Martine Hurault-Plantet
Martine.Hurault-Plantet at limsi.fr
Tue Jan 27 09:44:25 UTC 2009
********************************************************************
DEFT'09 Call for participation
Evaluation workshop in text mining:
multilingual analysis of opinion
http://deft09.limsi.fr/
********************************************************************
Important dates:
Registration: from December 10, 2008
Training corpora: January 7, 2009
Test: three days during the last two weeks of March 2009
Workshop: end of June (one-day workshop)
********************************************************************
DEFT'09, the fifth edition of the DEFT evaluation campaign in text mining
(http://deft.limsi.fr/), will deal this year with the multilingual analysis
of opinion.
Opinion analysis, already the subject of an earlier edition of DEFT, is of
interest in several respects. Companies are keen on it, sometimes in addition
to more classic opinion polls, and the Web supplies abundant material for it
in blogs, in sites dedicated to the evaluation of products, and in online
newspapers. Applications include analysing and tracking a public or media
"image" in the spheres of business (image of a product, of a service, of a
company), of public life (image of media personalities) and of politics
(how a political project is perceived).
An analysis of opinion begins with detecting how subjective a text or
fragment is, i.e. deciding whether it conveys a sentiment, a judgement or an
opinion, or whether it states plain facts. The parts of the text that contain
an opinion are then analyzed to assign a value to the expressed opinion,
either according to a positive/negative polarity, or according to a scale of
values (see DEFT'07). Lastly, judgements on a given topic may be influenced by
broader opinions, e.g. a political position, or may reflect these opinions.
Within this framework, we propose three different tasks which can be addressed
separately:
- The first task will be the detection of the objective or subjective
character of a text, with a corpus of newspaper articles in French, English
and Italian, from the sections Letter from the Editor, Debates, Analyses,
News in local and foreign politics, and Economy. The reference is established
by projecting each section onto the subjective/objective dimension. For
instance, the Letter from the Editor, which usually states an opinion, has
the type subjective, while the News, which describes actual facts, has the
type objective.
- The second task will be the detection of the subjective parts of a text
(which may itself be globally objective or subjective). In addition to the
aforementioned newspaper corpus, it will use a set of parliamentary debates in
French, English and Italian. The reference will be established by comparing
the results submitted by participants: subjective fragments will be those
tagged as such by a majority of participants. The majority threshold will be
determined empirically, by checking the annotations produced by the parsers.
- The third task will be the detection of the political party to which a
speaker belongs, in the same three political corpora. This party will have to
be picked from a finite set of European parties.
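
As an illustration of how the reference for the second task could be derived
from participants' submissions, here is a minimal sketch in Python. It is not
the organisers' procedure: the function name, the representation of fragments
as identifiers, and the default threshold of 0.5 are all assumptions made for
the example (the call only states that the threshold will be set empirically).

```python
from collections import Counter


def majority_reference(annotations, threshold=0.5):
    """Return the fragments tagged subjective by a majority of participants.

    annotations: dict mapping a participant name to the set of fragment
    identifiers that participant tagged as subjective (hypothetical format).
    threshold: fraction of participants that must be strictly exceeded
    (assumed value; the real threshold is to be determined empirically).
    """
    n_participants = len(annotations)
    if n_participants == 0:
        return set()

    # Count, for every fragment, how many participants tagged it.
    votes = Counter()
    for tagged in annotations.values():
        votes.update(set(tagged))

    return {
        fragment
        for fragment, count in votes.items()
        if count / n_participants > threshold
    }


# Example with three hypothetical participants.
runs = {
    "team_a": {1, 2},
    "team_b": {2, 3},
    "team_c": {2, 3, 4},
}
reference = majority_reference(runs)
```

Raising the threshold makes the reference stricter: fewer fragments are kept,
but those that remain were agreed on by more participants.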
Participants must complete at least one of the three tasks. Each task attempted
must be carried out at least on the French corpus.
Teams taking part in DEFT'09 must register via the online form and sign the
end user contract for linguistic resources in the framework of an evaluation
project (http://deft09.limsi.fr/index.php?id=5&lang=en).
Training corpora will be made available to registered participants from January
7, 2009. These corpora make up 60% of the original corpora. The remaining 40%
will be used for the test. The test will take place within a 14-day interval
starting in the middle of March. From a date of their choice within that
interval, participants have three days to apply to the test corpora the
methods developed on the training corpora.
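
For participants who want to reproduce a similar 60/40 partition on their own
data, for instance to hold out part of the training corpora for development,
a minimal sketch follows. The random shuffle, the seed and the function name
are assumptions for illustration; the call does not say how the organisers
select the documents that go into each part.

```python
import random


def split_corpus(documents, train_fraction=0.6, seed=0):
    """Split documents into a training part and a test part.

    The shuffle is an assumption: the source only gives the 60%/40%
    proportions, not the selection method.
    """
    docs = list(documents)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(docs)
    cut = round(len(docs) * train_fraction)
    return docs[:cut], docs[cut:]


# Example on ten placeholder document identifiers.
train, test = split_corpus(range(10))
```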
********************************************************************
Committees:
Organising Committee:
Heads: Martine Hurault-Plantet, Cyril Grouin
(LIMSI)
Members: Béatrice Arnulphy, Jean-Baptiste Berthelin,
Sarra El Ayari, Anne Garcia-Fernandez, Arnaud Grappy,
Isabelle Robba, Pierre Zweigenbaum (LIMSI)
Programme Committee:
Head: Patrick Paroubek (LIMSI)
Members:
Catherine Berrut (LIG)
Fabrice Clérot (France Telecom)
Guillaume Cleuziou (LIFO)
Béatrice Daille (LINA)
Marc El-Bèze (LIA)
Patrick Gallinari (LIP6)
Thierry Hamon (LIPN)
Fidélia Ibekwe-SanJuan (ELICO)
Pascal Poncelet (LIRMM)
Jean-Michel Renders (XRCE)
Christophe Roche (LISTIC)
Mathieu Roche (LIRMM)
Pascale Sébillot (IRISA)
François Yvon (LIMSI - TLP)
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora