[Corpora-List] Post-doc position in multilingual text processing at the EC's JRC, Italy (deadline extension: 5 May)

Ralf Steinberger ralf.steinberger at jrc.it
Thu Apr 24 12:29:00 UTC 2008


Please accept our apologies for resending the post below. We were able to
extend the initially very short deadline to 5 May.

The European Commission's Joint Research Centre (JRC,
<http://ec.europa.eu/dgs/jrc/index.cfm>) in Ispra, on Lago Maggiore in
Northern Italy, has an opening for a post-doc position in multilingual text
analysis (see below). The JRC runs several public news aggregation and
analysis web portals (see http://emm.jrc.it/overview.html) and provides a
number of services to a wide range of international customers. A strong
focus of the JRC's work is on multilinguality and on tools that provide
cross-lingual information access.

Applications (3-page application form
<http://ipsc.jrc.ec.europa.eu/job/appl_form_grantholders.xls> and an updated
CV in English <http://ipsc.jrc.ec.europa.eu/job/EU_CV_template_EN.doc>)
should be submitted by e-mail to JRC-IPSC-GRANTHOLDERS at ec.europa.eu.

According to the Vademecum for grantholders (see
http://ipsc.jrc.ec.europa.eu/showdoc.php?doc=job/VademecumforGholders2008.pdf),
the remuneration is about 54,000 Euro/year plus allowances.

------------------------------------------------------------------------------

Automatic Multilingual Text Analysis

CALL REFERENCE NO.: IPSC/G02/5

Category: Post-Doc researcher (category 30)

Duration: 36 months

Action: EMM

Remuneration: see Vademecum for grantholders
<http://ipsc.jrc.ec.europa.eu/showdoc.php?doc=job/VademecumforGholders2008.pdf>

URL generic call: http://ipsc.jrc.ec.europa.eu/jobs.php?id=8
URL specific post: http://ipsc.jrc.ec.europa.eu/showgrant.php?id=7

In the Web Mining and Intelligence (EMM) activity, the successful candidate
will carry out research on automatic multilingual text analysis. Subjects
currently being studied include automatic event extraction, automatic entity
recognition and cross-language clustering.

These techniques are already deployed in several operational applications,
and part of the work will be in support of these applications. The ongoing
research has a strong focus on applicability in a multilingual environment.

A new area of research is the automatic generation of summaries from
multiple documents, in particular from clusters of news articles. The work
is highly practical and goal-oriented. Research results are expected to be
used operationally. The system within which the results will be deployed is
implemented in Java as a set of servlets in Tomcat.

University degree in computer science or computational linguistics. Doctoral
degree in a similar discipline, or equivalent work experience of 5 years.
Good programming skills, preferably in Java, are recommended. The working
language of the action is English, and strong English language skills are
required. Given the multilingual aspect of the work, active knowledge of at
least one other language and an understanding of at least one further
language are also required.

Good knowledge of Arabic would be an asset.

Ralf Steinberger (Ralf.Steinberger at jrc.it)

European Commission - Joint Research Centre (JRC)
IPSC - SeS - EMM
URL: Applications: http://emm.jrc.it/overview.html
URL: The science behind them: http://langtech.jrc.it/

The JRC's Language Technology group specialises in the development of highly
multilingual text analysis tools and in cross-lingual applications. Many
applications are accessible online, e.g.:

*       NewsExplorer <http://press.jrc.it/NewsExplorer/>: multilingual news
aggregation and analysis (19 languages); allows users to navigate the news
over time and across languages; trend analysis; collects information about
people from the news; social network detection.

*       NewsBrief <http://press.jrc.it/>: breaking news detection and
display of the very latest thematic news from around the world; email
alerting (22+ languages).

*       MedISys Medical Information System <http://medusa.jrc.it/>: latest
health-related news from around the world according to themes and diseases
(22+ languages).

*       EMM-Labs <http://emm-labs.jrc.it:8080/>: latest developments;
social networks; live people-in-the-news; country and theme fact sheets;
maps showing violent events world-wide.

JRC-Acquis Multilingual Parallel Corpus (Version 3)

*       Freely available for research purposes.

*       22 languages: Bulgarian, Czech, Danish, German, Greek, English,
Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian,
Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.

*       Altogether over 1 billion words.

*       Sentence alignment for 231 language pairs.

*       For more information and download, see
http://langtech.jrc.it/JRC-Acquis.html.

DGT-Translation Memory

*       Freely available for research purposes.

*       Aligned translation units for 231 language pairs.

*       Alignment manually verified.

*       For more information and download, see
http://langtech.jrc.it/DGT-TM.html.

 

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

