Call: INFILE at CLEF2008

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Mon May 5 10:59:43 UTC 2008

Date: Fri, 02 May 2008 14:47:01 +0200
From: info at
Message-ID: <481B0D45.4090404 at>

Apologies for cross-postings

                        Call for participation

                           INFILE at CLEF2008

                  Information, Filtering, Evaluation

INFILE welcomes participation from any institution in its first
evaluation campaign. Participation is free of charge, and participants
may keep the development and evaluation data after the evaluations and
use them at no cost for research and development purposes.

INFILE (INformation, Filtrage, Evaluation) is a cross-language
adaptive filtering evaluation campaign jointly organized by CEA,
Université de Lille 3 and ELDA. It is organized as a pilot track in
CLEF 2008 and is supported by NIST TREC.

INFILE extends the last filtering track of TREC 2002 in the following
ways:

-INFILE is crosslingual (English, French and Arabic); a corpus of
100,000 comparable news-wire stories from Agence France Presse (AFP)
for each language is used for the evaluations.

-Evaluation will be performed by automatically querying the test
systems and returning simulated user feedback. Each system will be able
to use the feedback at any time to improve its performance.

Participating systems will have to provide a Boolean decision for
each document with respect to each filtering profile. A curve showing
the evolution of efficiency will be computed. Although cross-lingual
systems are encouraged, the campaign is also open to monolingual
systems.

Tasks and languages
Two tasks and three languages are considered. The first task is
Information Filtering on general news and events. For this task,
participants will have to assign each transmitted news-wire to zero,
one or more profiles. Thirty general profiles will be made available
in 3 languages (Arabic, English and French).

The second task is Information Filtering in the science and technology
domain. Participants will have to associate each news-wire with zero,
one or more science and technology profiles. A total of 20 profiles
will be available in 3 languages (Arabic, English and French).

For each task, participants are free to register for monolingual
filtering (i.e. information filtering using profiles and news-wires in
the same language) or for crosslingual filtering (i.e. information
filtering with profiles in one language and news-wires in another
language).

The corpus consists of 300,000 news-wires in Arabic, English and
French from the news agency Agence France Presse, covering the
2004-2006 period. The news-wires relate to general news and events
and are comparable between Arabic, English and French.

Protocol description

General information about the domain of the profiles is given to each
participant. Fifteen days later, 50 profiles are given to participants
(30 general profiles and 20 profiles related to science and
technology). Each profile is composed of a list of keywords (simple and
complex noun phrases) and up to 3 documents illustrating the profile.
Then, news-wires are transmitted by the organizer to an automated
interface of each participating system. The interface returns a
Boolean response for each profile. After receiving this response,
and if requested by the participant, the organizer sends feedback
consisting of the expected profile assignments for each document
submitted. Participants may adapt their system at any time using this
feedback.
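
The exchange described above can be illustrated with a minimal Python
sketch. It is only an illustration under stated assumptions, not the
INFILE interface: the names Profile, decide, adapt and run, the
keyword-weight scoring and the 0.5 update step are all hypothetical,
and the actual transmission format is not specified in this call.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile: an identifier plus weighted keywords."""
    profile_id: str
    weights: dict[str, float] = field(default_factory=dict)
    threshold: float = 1.0  # assumed decision threshold


def decide(profile: Profile, text: str) -> bool:
    """Boolean decision: does this news-wire match the profile?"""
    tokens = set(text.lower().split())
    score = sum(w for term, w in profile.weights.items() if term in tokens)
    return score >= profile.threshold


def adapt(profile: Profile, text: str, decided: bool, expected: bool) -> None:
    """Toy update rule (an assumption): adjust weights after a wrong decision."""
    if decided == expected:
        return
    step = 0.5 if expected else -0.5
    for term in set(text.lower().split()):
        # On a miss, learn every term; on a false alarm, only penalize known terms.
        if expected or term in profile.weights:
            profile.weights[term] = profile.weights.get(term, 0.0) + step


def run(profiles, stream, feedback):
    """Process news-wires one at a time and return the decisions made.

    `stream` yields (doc_id, text) pairs; `feedback` maps a doc_id to the
    set of profile ids expected for that document. A document absent from
    `feedback` models a participant who did not request feedback for it.
    """
    decisions = {}
    for doc_id, text in stream:
        decisions[doc_id] = {p.profile_id: decide(p, text) for p in profiles}
        expected = feedback.get(doc_id)
        if expected is None:
            continue
        for p in profiles:
            adapt(p, text, decisions[doc_id][p.profile_id], p.profile_id in expected)
    return decisions
```

The decision for a document is always recorded before any feedback for
that document is used, which mirrors the order of the protocol: response
first, feedback afterwards.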

Important Dates
    Registration Opens - Feb 11th, 2008
    Dry Run - June 2nd to June 14th, 2008
    Evaluation Run - June 30th to July 19th, 2008
    Release of Human Assessments and Individual Results - August 4th, 2008
    Submission of Paper for Working Notes - August 15th, 2008
    CLEF Workshop - September 17th to 19th, 2008

    info at

