Job: Post-doc at LIMSI-CNRS, Orsay, France
Thierry Hamon
thierry.hamon at UNIV-PARIS13.FR
Fri Oct 5 19:01:22 UTC 2012
Date: Wed, 03 Oct 2012 09:20:40 +0200
From: Xavier Tannier <xtannier at limsi.fr>
Post-doctoral position: Event-based multi-document summarization for
building timelines
http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html
Keywords
/information extraction, natural language processing, temporal analysis,
events, timelines/
Location
LIMSI-CNRS, Orsay (Paris), France.
Duration
1 year
Context
Among other objectives, the nationally funded Chronolines project
(http://www.chronolines.fr) aims to build timelines semi-automatically
from a query, based on a collection of newswire articles. Given a
user-defined topic and a set of texts, the task consists in *extracting
the most important events* concerning the topic and presenting them to
the user for validation. The ideal output would then be a set of brief
event descriptions, together with the dates of these events.
Work on this project has already resulted in a few publications,
including an ACL 2012 paper on /salient date extraction/ [1], which the
candidate can consult for more details:
http://aclweb.org/anthology-new/P/P12/P12-1077.pdf. The successful
candidate will join the project team and work on some of the following
issues:
* *Aggregation/Summarization*: how to choose/generate a brief
description of each event, from a set of relevant sentences.
* *Evaluation*: which metrics and which methodology to use for
objective evaluation.
* *Granularity*: since the time unit of our salient date algorithm is
the day, how to decide that several important topic-related events
occurred on the same day or, conversely, that one important event
lasted more than one day.
* *Relationship*: how to use the large collection of articles to
extract relationships between events.
Required skills
The candidate should hold a PhD in Natural Language Processing and/or
Information Retrieval, and be able to:
* Work with texts (interest in linguistic issues and how to deal with
them)
* Work with /a lot/ of texts (good programming skills, management of
large corpora, information aggregation, ability to set linguistic
issues aside when necessary)
* Learn from (imperfect) references (ability to observe and
generalize, machine learning skills)
* Work with tools used and built by the team (Linux, Java, Perl...)
Contacts:
Xavier.Tannier[at]limsi.fr
Veronique.Moriceau[at]limsi.fr
Reference:
[1] Rémy Kessler, Xavier Tannier, Caroline Hagège, Véronique Moriceau,
André Bittar. *Finding Salient Dates for Building Thematic Timelines*.
In /Proceedings of the 50th Annual Meeting of the Association for
Computational Linguistics (ACL 2012)/. Jeju Island, Republic of Korea,
July 2012. © Association for Computational Linguistics.
http://aclweb.org/anthology-new/P/P12/P12-1077.pdf
-------------------------------------------------------------------------
Message distributed by the Langage Naturel mailing list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version:
Archives: http://listserv.linguistlist.org/archives/ln.html
http://liste.cines.fr/info/ln
The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------