<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-15">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-forward-container">
<div class="main"><br>
<h1><small><small><small>Post-doctoral position: Event-based
multi-document summarization for building timelines</small></small></small><br>
</h1>
<p><a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html">http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html</a><br>
</p>
<h3><small>Keywords</small></h3>
<i>information extraction, natural language processing, temporal
analysis, events, timelines</i>
<h3><small>Location</small></h3>
LIMSI-CNRS, Orsay (Paris), France.<br>
<br>
<h3><small>Duration</small><br>
</h3>
1 year<br>
<br>
<h2><small><small>Context</small></small></h2>
<p> Among other objectives, the nationally funded project <a
moz-do-not-send="true" href="http://www.chronolines.fr">Chronolines</a>
aims to build timelines semi-automatically from a query, based
on a collection of newswire articles. Given a user-defined topic
and a set of texts, the task consists of <strong>extracting
the most important events</strong> concerning the topic and
presenting them to the user for validation. The ideal output
would then be a set of brief descriptions of events, together
with the dates of these events. </p>
<p> Work on this project has already resulted in a few publications,
including a paper at ACL 2012 on <em>salient date
extraction</em>, which the candidate can refer to for more
details <a moz-do-not-send="true"
href="http://aclweb.org/anthology-new/P/P12/P12-1077.pdf">[1]</a>.
The candidate would join the project team and work
on some of the following issues: </p>
<ul>
<li><strong>Aggregation/Summarization</strong>: how to
choose/generate a brief description of each event, from a
set of relevant sentences.</li>
<li><strong>Evaluation</strong>: which metrics and which
methodology to use for objective evaluation.</li>
<li><strong>Granularity</strong>: as the time unit for our
salient date algorithm is the day, how to decide that
several topic-related important events occurred on the same
day or, conversely, that an important event lasted more than
one day.</li>
<li><strong>Relationship</strong>: how to use the large
collection of articles to extract relationships between
events.</li>
</ul>
<h2><small><small>Required skills</small></small></h2>
<p> The candidate should hold a PhD in Natural Language
Processing and/or Information Retrieval, and be able to: </p>
<ul>
<li>Work with texts (interest in linguistic issues and how to
deal with them)</li>
<li>Work with <em>a lot</em> of texts (good programming
skills, management of large corpora, information aggregation,
ability to forget about linguistic issues when we need to)</li>
<li>Learn from (imperfect) references (ability to observe and
generalize, machine learning skills)</li>
<li>Work with tools used and built by the team (Linux,
Java, Perl...)</li>
</ul>
<h3><small>Contacts:</small></h3>
Xavier.Tannier[at]limsi.fr <br>
Veronique.Moriceau[at]limsi.fr <br>
<br>
<br>
<h3><small>Reference: </small></h3>
<p id="footnote-1">[1] Rémy Kessler, Xavier Tannier, Caroline
Hagège, Véronique Moriceau, André Bittar. <strong><a
moz-do-not-send="true"
href="http://aclweb.org/anthology-new/P/P12/P12-1077.pdf">Finding
Salient Dates for Building Thematic Timelines.</a></strong>
In <em>Proceedings of the 50th Annual Meeting of the
Association for Computational Linguistics (ACL 2012)</em>.
Jeju Island, Republic of Korea, July 2012. © Association for
Computational Linguistics.<br>
</p>
</div>
<div class="moz-signature">-- <br>
Xavier Tannier <br>
Maître de conférence<br>
LIMSI-CNRS (bât. 508, bureau 12, RdC)<br>
Université Paris-Sud 11<br>
B.P. 133 <br>
91403 ORSAY CEDEX <br>
FRANCE <br>
<br>
<a moz-do-not-send="true"
href="http://www.limsi.fr/%7Extannier/">http://www.limsi.fr/~xtannier/</a>
<br>
tel: 0033 (0)1 69 85 80 12 <br>
fax: 0033 (0)1 69 85 80 88 <br>
----------------------------------------------------------- <br>
</div>
<br>
<br>
</div>
<br>
</body>
</html>