[Corpora-List] Vacancy: Experienced Language Technologist

Katrien Depuydt depuydt at inl.nl
Mon Jun 23 08:26:51 UTC 2008

The Institute for Dutch Lexicology has a vacancy for an experienced 
Language Technologist. Your primary task will be to help develop 
language technology to improve the accessibility of historical documents 
for IMPACT. Work within the scope of other INL projects will also be 
among your responsibilities.

/IMPACT/ is a new European research project in which the INL is 
participating and which started on 1 January 2008. It is an 'Integrated 
Project' in which several national libraries and research institutes as 
well as two commercial partners are working together. The main purpose 
of IMPACT is to significantly improve the accessibility of historical 
documents.

In order to achieve this, IMPACT has set itself the following tasks:

   1. Current OCR software is not suitable for mass digitisation of
      historical documents. Within the project, OCR software will be
      developed that will significantly improve the accuracy of
      state-of-the-art systems, allowing for the first time reliable
      full text mass digitisation of historical documents.
   2. Information in historical documents is not easily accessed by
      modern users because of the historical language barrier. Within
      the project, historical lexica and linguistic processing tools
      will be developed that will enable enriched indexing to provide
      access to historical material with contemporary query techniques.


Your IMPACT-related tasks will concern the development of a toolbox for 
the building and deployment of historical lexica. Both tools and lexica 
will be used for the enhancement of OCR results and for better retrieval 
in historical text material; recognition of Named Entities plays an 
important role in this. You will be working on both the implementation 
and the design of algorithms. Other tasks will be related to data 
processing and tools for data processing.


-          relevant academic background in computational linguistics, 
computer science or applied mathematics

-          demonstrable knowledge of and experience with the development 
and implementation of machine learning, statistical and other 
computational linguistics algorithms

-          demonstrable experience with software development. Sound 
knowledge of C and C++ is required.

-          ability to work under pressure, as part of a team that must 
achieve good results within a short period of time

Preferably you have:

-          experience with Named Entity Processing and knowledge of OCR 

-          a PhD or other research experience

-          knowledge of and experience with historical text material.


The position comes with an INL contract for two years. Depending on 
various factors, the salary scale for this job is either 10 or 11, with 
a maximum of EUR 4,270 gross per month on the basis of a 38-hour week. 
In addition you will be entitled to 42 days holiday per annum plus a 
holiday allowance, according to the Cao-onderzoekinstellingen.

      Questions and applications

If you have any further questions, please contact Katrien Depuydt 
(Taalbank), INL, Postbus 9515, 2300 RA, Leiden. Ph: +31 (0)71 527 2479, 
email: depuydt at inl.nl. See also http://www.inl.nl/ and 
http://www.impact-project.eu/. Applications may be sent to dr. 
Jeannine Beeken (managing director), INL, Postbus 9515, 2300 RA, Leiden. 
Email: secretariaat at inl.nl.


*Closing date:* 30-06-2008

Katrien Depuydt
Instituut voor Nederlandse Lexicologie
(Institute for Dutch Lexicology)
(Language Database Dept.)
Postbus 9515
NL-2300 RA Leiden

tel.: +31 71 5272479
mail: depuydt at inl.nl
