Job: Open position at the EC's Joint Research Centre: multilingual text analysis - Reminder

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Sat Mar 7 10:18:46 UTC 2009

Date: Thu, 05 Mar 2009 12:48:25 +0100
From: Ralf Steinberger <ralf.steinberger at>
Message-id: <011f01c99d88$4ad07ba0$e07172e0$%Steinberger at>

This is a reminder that the deadline to apply for the post-doc
position is approaching: 15 March 2009.
There may be more than one post.
The link to the job description in the previous announcement was wrong
(id=7 instead of id=67). We apologise for this mistake.
The European Commission’s Joint Research Centre (JRC) in Ispra, on Lago
Maggiore in Northern Italy, has one or more openings for a three-year
position in multilingual text analysis (see below). Applicants need
either a completed Ph.D. or five years of relevant post-graduate
experience.
The JRC runs several public news aggregation and analysis web portals
and provides a number of services to a wide range of international
customers. A strong focus of the JRC’s work is on multilinguality and
on tools to provide cross-lingual information access.
Applications (a 3-page application form, an updated CV in English and
a copy of your passport/ID card) should be submitted by e-mail to
JRC-IPSC-GRANTHOLDERS at by midnight CET on 15 March 2009.
According to the Vademecum for grant holders (see,
the remuneration is about 54,000 Euro/year plus significant

Automatic Multilingual Text Analysis II

Category: Category 30 (requires Ph.D. or five years of relevant
post-graduate experience)
Duration: 36 months
Action: OPTIMA
Remuneration and conditions: see Vademecum for grantholders

URL generic call:
URL specific post: 
The Internet is the richest reservoir of human knowledge that has ever
existed. Advanced software tools are needed to monitor and process the
vast amount of material available on-line. The Action OPTIMA
(OPensource Text Information Mining and Analysis) develops innovative
solutions for retrieving and extracting information from the Internet
and from other Open Sources. It serves many Commission Services, EU
agencies and some member state authorities. The core of this action is
the Europe Media Monitor (EMM). 

In this action, the successful candidate will carry out research on
automatic multilingual text analysis. Typical subjects currently being
studied are automatic event extraction, automatic entity recognition
and cross-language clustering.

These techniques are already deployed to some extent in several
operational applications, and part of the work will be in support of
these applications. The on-going research has a strong focus on
applicability in a multilingual environment.

The work is highly practical and goal-oriented. Research results are
expected to be used operationally. The candidate is expected to
contribute to scientific publications of the research results.

The system within which the results will be deployed is implemented in
Java as a set of servlets in Tomcat. Good programming skills,
preferably in Java, are therefore recommended.

University degree in computer science or computational linguistics. 

Doctoral degree in a similar discipline, or equivalent work experience
of 5 years. The working language of the action is English, and strong
English language skills are required. Given the multilingual aspect of
the work, active knowledge of at least one other language and an
understanding of at least one further language are also required.

Good knowledge of Arabic, Farsi or Chinese would be seen as an asset. 

Duration: 36 months

Ralf Steinberger (Firstname.Lastname at
European Commission - Joint Research Centre (JRC)
IPSC - Global Security and Crisis Management - OPTIMA (OPensource Text
Information Mining and Analysis)
URL: Applications:
URL: The science behind them:
The JRC’s Language Technology activity specialises in the development
of highly multilingual text analysis tools and in cross-lingual
applications. Many applications are accessible online, e.g.:

· NewsExplorer: multilingual news
  aggregation and analysis (19 languages); lets users navigate the news
  over time and across languages; trend analysis; collects information
  about people from the news; social network detection.

· NewsBrief: breaking news detection and
  display of the very latest thematic news from around the
  world; email alerting (40+ languages).

· MedISys Medical Information System: latest
  health-related news from around the world according to themes and
  diseases (40+ languages).

· EMM-Labs: latest developments; social
  networks; live people-in-the-news; country and theme fact sheets;
  maps showing violent events world-wide.

JRC-Acquis Multilingual Parallel Corpus (Version 3)
· Freely available for research purposes.

· 22 languages: Bulgarian, Czech, Danish, German, Greek, English,
  Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian,
  Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak,
  Slovene and Swedish.

· Altogether over 1 billion words.
· Sentence alignment for 231 language pairs, using the two alternative
  aligners Vanilla and HunAlign.
· For more information and download, see

DGT-Translation Memory
· Freely available for research purposes.
· Aligned translation units for 231 language pairs.
· Alignment manually verified.
· For more information and download, see

Message distributed by the Langage Naturel mailing list <LN at>
Information, subscription:
English version       : 
Archives                 :

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership  :