[Corpora-List] 2 PhD positions in computational linguistics at the University of Groningen
Joerg Tiedemann
j.tiedemann at rug.nl
Thu Apr 24 14:06:55 UTC 2008
The University of Groningen, Faculty of Arts, Center for Language and
Cognition, announces the following two PhD positions:
PhD position in Parse and Corpus based Machine Translation
(PaCo-MT) (1.0 fte)
PhD position in Dutch Language Investigation of Summarization Technology
(DAISY) (1.0 fte)
The Center for Language and Cognition Groningen (CLCG) is a research
institute within the Faculty of Arts of the University of Groningen. It
encompasses all linguistic research in the faculty. A considerable
number of the researchers participate in the Graduate School for
Behavioral and Cognitive Neurosciences (BCN), and in the Landelijke
Onderzoekschool Taalwetenschap (LOT). Within the CLCG there are six
research groups: Syntax/Semantics, Discourse and Communication, Language
Variation and Change, Computational Linguistics, Neurolinguistics, and
Language and Literacy Development over the Life Span.
Position 1 (Vacancy 208128)
The STEVIN PaCo-MT project aims to develop an open-domain hybrid MT
system, for use by professional translators, that integrates proper
linguistic analysis and syntactic transfer into a data-driven approach.
Translation will be based on transfer (lexical and syntactic) from a
parsed source language sentence into a corresponding target language
structure. From this the final output is generated using information
from a large target language Treebank that will ensure grammaticality
and fluency. The MT application will be developed for the language pairs
Dutch<>English and Dutch<>French. A post-editing interface will be
provided to adapt the output to user needs. The Flemish-Dutch consortium
consists of two academic partners (Leuven and Groningen University) and
one industrial partner (Oneliner Language and Business Solutions).
The PhD project within PaCo-MT in Groningen will focus on building the
bilingual resources necessary for our translation approach. We will
emphasize the use of syntactic annotation (in both source and target
language) in the automatic extraction of bilingual lexical data,
(probabilistic) transfer rules and statistical translation models.
Programming skills are required, and knowledge of SMT and alignment
techniques is definitely a plus.
Position 2 (Vacancy 208127)
The aim of STEVIN DAISY is to develop and evaluate essential technology
for automatic summarization of Dutch informative texts. Innovative
algorithms for topic salience detection, topic discrimination,
rhetorical classification of content, sentence compression and text
generation will be implemented.
An important part of the DAISY project concerns sentence generation. The
task of the sentence generation module is to produce actual grammatical
sentences on the basis of abstract representations, using the
declarative grammar of Alpino as its key knowledge source. Alpino is a
wide-coverage grammar for Dutch, defined as a unification-based grammar,
in which many insights from HPSG have been implemented (examples are the
inheritance hierarchy of lexical types and grammatical rules). There has
been a lot of work on text generation for unification grammars. More
recent work on which we will base our approach includes Carroll and
Oepen (2005).
Although the Alpino grammar can be used to ensure that well-formed
sentences are produced, a further fluency module will be developed to
ensure that the sentences that are produced are natural and appropriate.
Just as parsing needs a (statistical) disambiguation component to select
the appropriate parse from potentially large sets of possible parses, we
need a fluency component to select the most appropriate sentence from
the set of possible sentences given by the grammar. For the fluency
component, we propose to develop a machine-learning method similar in
approach to the disambiguation component of the Alpino parser. The
disambiguation component of Alpino contains a discriminative
maximum-entropy model, trained on the Alpino treebank. For statistical
ranking of competing surface realizations of the same content, we
propose to implement a similar discriminative maximum-entropy model.
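To make the idea concrete, the following is a minimal sketch of a
conditional maximum-entropy ranker over competing realizations. It is an
illustration only and not the Alpino implementation: the bigram feature
map, the function names, the training settings and the toy Dutch
candidates are all assumptions introduced here. Each candidate y in a
set gets probability exp(w.f(y)) / sum over y' of exp(w.f(y')), and the
weights w are fitted by gradient ascent on the log-likelihood of the
preferred candidate.

```python
import math
from collections import defaultdict


def features(sentence):
    """Hypothetical feature map: counts of word bigrams with boundary markers."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    feats = defaultdict(float)
    for left, right in zip(tokens, tokens[1:]):
        feats[(left, right)] += 1.0
    return feats


def score(weights, feats):
    """Dot product of the weight vector and a sparse feature vector."""
    return sum(weights.get(key, 0.0) * value for key, value in feats.items())


def probabilities(weights, candidates):
    """Softmax over the candidate set (max subtracted for numerical stability)."""
    scores = [score(weights, features(c)) for c in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(data, epochs=50, learning_rate=0.5, l2=0.01):
    """Fit weights by stochastic gradient ascent.

    data is a list of (candidates, index_of_preferred_candidate) pairs.
    The gradient is observed feature counts minus expected feature counts;
    the L2 penalty is applied only to features active in the current set.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in data:
            probs = probabilities(weights, candidates)
            gradient = defaultdict(float)
            for key, value in features(candidates[gold]).items():
                gradient[key] += value
            for candidate, p in zip(candidates, probs):
                for key, value in features(candidate).items():
                    gradient[key] -= p * value
            for key, g in gradient.items():
                weights[key] += learning_rate * (g - l2 * weights[key])
    return weights


def rank(weights, candidates):
    """Return the index of the most probable candidate."""
    probs = probabilities(weights, candidates)
    return max(range(len(candidates)), key=lambda i: probs[i])


# Invented toy data: each set lists realizations of the same content,
# with index 0 marked as the preferred (fluent) one.
toy_data = [
    (["ik heb het boek gelezen",
      "ik heb gelezen het boek",
      "het boek ik heb gelezen"], 0),
    (["zij heeft de film gezien",
      "zij heeft gezien de film"], 0),
]
toy_weights = train(toy_data)
best = rank(toy_weights, toy_data[0][0])
```

In a realistic setting the candidate sets would come from the generator
and the features from the derivation (rules used, lexical types, n-gram
statistics) rather than from surface bigrams alone.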
Requirements
an MA degree in Computational Linguistics, Computer Science or a related field
knowledge of Dutch, or willingness to learn Dutch
ability to work together in a project with members from different institutes
Conditions of Employment
Employment basis: Temporary for specified period.
Duration of the contract: Four years, starting September 1, 2008.
The position requires residence in Groningen, 36 hrs/week research, and
must result in a PhD dissertation. After the first year there will be an
assessment of the candidate's results and the progress of the project.
Based on this, it will be decided whether the employment will be
continued. The University of Groningen offers a salary ranging from
EUR 2000 gross per month in the first year to EUR 2558 gross per month
in the fourth year.
Additional information about position 1 (PaCo-MT) can be obtained from
Dr. Jörg Tiedemann, project supervisor
Tel: +31 50 3635935
Email: J.Tiedemann at rug.nl
Additional information about position 2 (DAISY) can be obtained from
Dr. Gertjan van Noord, project supervisor
Tel: +31 50 3637811
Email: G.J.M.van.Noord at rug.nl
For both positions, you can also contact:
Mrs. Wyke van der Meer, CLCG secretariat
Tel: +31 50 3635806
Email: w.a.van.der.meer at rug.nl
Additional information about the research institute can be obtained
through the following link:
http://www.rug.nl/let/onderzoek/onderzoekinstituten/clcg/index
Application procedure
You can apply for these vacancies before June 1, 2008 by sending your
application to
University of Groningen
Personnel and Organization Department
P.O. Box 72
9700 AB Groningen
The Netherlands
E-mail address: vmp at bureau.rug.nl
Please include:
your curriculum vitae
a copy of your diploma together with a list of grades
a list of publications (if any)
a recent publication or your Master's thesis
letters from two referees
Electronic applications are preferred. Please identify the vacancy number.
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora