[Corpora-List] Call for participation DEFT'09

Martine Hurault-Plantet Martine.Hurault-Plantet at limsi.fr
Tue Jan 27 09:44:25 UTC 2009


********************************************************************
DEFT'09   Call for participation

Evaluation workshop in text mining: 
multilingual analysis of opinion

http://deft09.limsi.fr/

********************************************************************
Important dates:

Registration: from December 10, 2008
Training corpora: January 7, 2009
Test: three days during the last two weeks of March 2009
Workshop:  end of june (one-day workshop)

********************************************************************
DEFT'09, the fifth edition of the DEFT evaluation campaign in text mining 
(http://deft.limsi.fr/), will deal this year with the multilingual analysis 
of opinion. 

Analysis of opinion, which was already the subject of a previous edition of 
DEFT, is a topic of interest in more than one way. Companies thrive on it, 
sometimes in addition to more classic opinion polls, and the Web provides 
abundant material for this analysis in blogs, in sites dedicated to the 
evaluation of products, and in online newspapers. The applications include the 
analysis and monitoring of a public or media "image", as well as 
developments in the spheres of business (image of a product, of a service, 
of a company), of public life (image of media personalities) and of politics 
(how a political project is perceived).

An analysis of opinion begins with the detection of the more or less 
subjective nature of a text or fragment, i.e. deciding if it is conveying a 
sentiment, a judgement, an opinion, or if it is stating plain facts. Parts of 
text containing an opinion are then analyzed to assign a value to the 
expressed opinion, either according to a positive/negative polarity, or 
according to a scale of values (see DEFT'07). Lastly, judgements on a given 
topic may be tainted by more global opinions, e.g. political position, or 
may reflect these opinions.

Within this framework, we propose three different tasks which can be addressed 
separately:

- The first task will be the detection of the objective or subjective 
character of a text, with a corpus of newspaper articles in French, English 
and Italian, from the sections Letter from the Editor, Debates, Analyses, 
News in local and foreign politics and Economy. The reference is established 
by projecting each section on both the subjective and the objective 
dimension. For instance, the Letter from the Editor, which usually states an 
opinion, has the type subjective, while the News, which describes actual 
facts, has the type objective.

- Detecting the subjective parts of a text (which may be globally objective or 
subjective) will be the second task. In addition to the aforementioned 
newspaper corpus, it will use a set of parliamentary debates in French, 
English and Italian. The reference will be established by crossing results 
from competitors: subjective fragments will be those tagged as such by a 
majority of participants. The majority threshold will be determined 
empirically, by checking the annotations produced by the parsers.

- The third task will be the detection of the political party to which a 
speaker belongs, in the same three political corpora. This party will have to 
be picked from a finite set of European parties.
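
As an illustration of how the reference for the second task could be built 
by majority vote, here is a minimal Python sketch. It is not the organisers' 
procedure: the function name, the fragment identifiers, the team names and 
the default threshold of 0.5 are all assumptions made for the example, since 
the actual threshold is to be determined empirically.

```python
from collections import Counter


def majority_reference(annotations, threshold=0.5):
    """Return the fragments tagged subjective by a majority of participants.

    annotations: dict mapping a participant name to the set of fragment
    identifiers that participant tagged as subjective (hypothetical format).
    threshold: share of participants that must be strictly exceeded
    (assumed value; the call says it will be set empirically).
    """
    n_participants = len(annotations)
    if n_participants == 0:
        return set()

    # Count, for each fragment, how many participants tagged it.
    votes = Counter()
    for tagged in annotations.values():
        votes.update(set(tagged))

    return {
        fragment
        for fragment, count in votes.items()
        if count / n_participants > threshold
    }


# Hypothetical submissions from three teams.
runs = {
    "team_a": {1, 2, 5},
    "team_b": {2, 5},
    "team_c": {2, 3},
}
reference = majority_reference(runs)
```

With these made-up runs, fragments 2 and 5 are kept, because each is tagged 
by at least two of the three teams. Raising the threshold makes the 
reference stricter.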

Competitors must accomplish at least one of the three tasks. Each task must be 
accomplished at least on the French corpus.

Teams taking part in DEFT'09 must register via the online form and sign the 
end user contract for linguistic resources in the framework of an evaluation 
project  (http://deft09.limsi.fr/index.php?id=5&lang=en).

Training corpora will be made available to registered participants from 
January 7, 2009. These corpora consist of 60% of the original corpora. The 
remaining 40% of the corpora will be used for the test. This test will take 
place within a 14-day interval, from the middle of March. Starting from a 
date of their choice within the interval, participants have three days to 
apply to the test corpora the methods developed on the training corpora.
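
For participants who want to mimic the 60/40 division on their own data, 
for instance to hold out part of the training corpora for development, a 
split of this kind can be sketched as follows. The call does not say how 
the organisers divided the original corpora, so the random shuffle, the 
seed and the function name below are assumptions made for the example.

```python
import random


def split_corpus(documents, train_share=0.6, seed=0):
    """Split documents into a training part and a test part.

    The shuffle is an assumption: the original split may have been made
    differently (by date, by source, and so on).
    """
    docs = list(documents)
    random.Random(seed).shuffle(docs)  # reproducible shuffle
    cut = round(len(docs) * train_share)
    return docs[:cut], docs[cut:]


# Ten placeholder document identifiers.
train, test = split_corpus(range(10))
```

With ten documents and the default share, six go to the training part and 
four to the test part.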

********************************************************************
Committees:

Organising Committee:
Heads: Martine Hurault-Plantet, Cyril Grouin
(LIMSI)
Members: Béatrice Arnulphy, Jean-Baptiste Berthelin, 
Sarra El Ayari, Anne Garcia-Fernandez, Arnaud Grappy,
Isabelle Robba, Pierre Zweigenbaum (LIMSI)

Programme Committee:
Head: Patrick Paroubek (LIMSI)
Members:
Catherine Berrut (LIG)
Fabrice Clérot (France Telecom)
Guillaume Cleuziou (LIFO)
Béatrice Daille (LINA)
Marc El-Bèze (LIA)
Patrick Gallinari (LIP6)
Thierry Hamon (LIPN)
Fidélia Ibekwe-SanJuan (ELICO)
Pascal Poncelet (LIRMM)
Jean-Michel Renders (XRCE)
Christophe Roche (LISTIC)
Mathieu Roche (LIRMM)
Pascale Sébillot (IRISA)
François Yvon (LIMSI - TLP)


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


