Job: Open trainee positions at the EC's Joint Research Centre in Italy (2nd call)

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Tue Sep 4 15:29:28 UTC 2007

Date: Fri, 31 Aug 2007 11:57:52 +0200
From: Ralf Steinberger <ralf.steinberger at>
Message-id: <008b01c7ebb5$6592dcc0$d547bf8b at IPSC.TLD>

Apologies for multiple postings!


The European Commission's Joint Research Centre (JRC) is advertising
scientific internship positions in a large variety of fields,
including three profiles related to text analysis.


- IPSC/G02-5/2007: Web Mining and Information Extraction

- IPSC/G02-6/2007: Multilingual text analysis tools

- IPSC/G02-7/2007: Political scientist


For the full call, see the web page for the call. Below you will find
information on the profile 'Multilingual text analysis tools'.



Location: Ispra, on Lago Maggiore in Italy, 60 km west of Milan;

Host: European Commission - Joint Research Centre (JRC)

Position: traineeship / internship / stage / Praktikum / tirocinio;

Starting date: late 2007 or any time in 2008;

Duration: 3 to 12 months;

Remuneration: 963 Euro per month + travel allowance;

Nationality: Applicants must have the nationality of an EU Member
State, of an Associated EU Candidate Country, an Associated State or a
Developing Country;

Working language: English;

Activity: Language Technology, Web Technology; many other subject areas;

Deadline: Open call. First cut-off date: Tuesday 14 September 2007

Contact: JRC-IPSC-STAGE AT



The European Commission's Joint Research Centre in Italy is seeking
students or recent graduates to do an internship with our motivated
and successful multinational team of scientists and developers, which
produces concrete and widely used applications. Successful applicants
will want to produce hands-on results and to work in a team. The
trainees will learn about our multilingual text analysis tools
(covering between 19 and 32 languages) and their integration into
complex and heavily used web portals: our news analysis pages
receive up to 1.2 million hits per day. The trainees will also
gain experience of working in the multilingual, multinational,
multi-disciplinary environment of an international organisation.


Depending on your profile, you can expect to work on one or more of
the following subject areas:


- Information Extraction: named entities, relations, event scenarios;

- Symbolic or statistical approaches;

- Writing English event and relation extraction rules;

- Document Clustering, Categorisation (Classification);

- Terminology extraction, multilingual lexicology;

- Social networks;

- Visualisation;

- Topic detection and tracking, Trend detection;

- Adapting the JRC's tool set to new languages;

- Web log analysis for our applications;

- Applying text analysis tools to the medical or political domains;

- Mining the NewsExplorer name

- Java re-implementation of Perl programs;

- ...


Applicants must have good programming skills in Java or Perl and must
be able to use English as a working language.


Experience with one or more of the following would be a plus:
databases, web technology, XML, knowledge of several natural languages
(even passive), knowledge of - or interest in - medicine or political
science, experience of working with thesauri and ontologies.


If you are interested in this opportunity and you feel that you can
contribute to any of the tasks mentioned above, please follow the
instructions given on the web page for the call. Please
carbon-copy your email application to Ralf.Steinberger AT


For information on the European Commission's Joint Research Centre and
its Web and Language Technology group, see
<> . For more information on traineeships, cost
of living, etc., see


When applying, please follow the instructions given on the web page
for the call. If you could send a copy of your application to Ralf
Steinberger, this would be useful.



Ralf Steinberger (Ralf.Steinberger AT 
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology


JRC-Acquis Multilingual Parallel Corpus (Version 3)

.  Freely available for research purposes.

.  22 languages: Bulgarian, Czech, Danish, German, Greek, English,
Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian,
Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene
and Swedish.

.  Altogether over 1 billion words.

.  Sentence alignment for 231 language pairs.

.  For more information and download, see


The JRC's Language Technology group specialises in the development of
highly multilingual text analysis tools and in cross-lingual
applications. Many applications are accessible online, e.g.:

.  NewsExplorer: multilingual news
aggregation and analysis (19 languages); lets users navigate the news
over time and across languages; trend analysis; collects information
about people from the news; social network detection.

.  NewsBrief: breaking news detection and
display of the very latest thematic news from around the world; email
alerting (22+ languages).

.  MedISys Medical Information System: latest
health-related news from around the world according to themes and
diseases (22+ languages).

Message distributed by the Langage Naturel list <LN at>

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)