Corpora: Postdocs in Linguistic Database Research

Wed May 17 01:40:32 UTC 2000

The Linguistic Data Consortium and the Database Group in the
Department of Computer and Information Science at the University of
Pennsylvania have two 2-year postdoctoral positions available in
linguistic database research, funded by the National Science Foundation.

The positions will involve research in the following
areas: data models and architectures for linguistic databases;
semi-structured query languages and indexing methods for databases of
annotated speech; and the data provenance problem in linguistic
databases.  Candidates will be expected to have completed a PhD in
language engineering, computational linguistics, databases, or a
related field, and to have demonstrated an ability to test novel ideas
through prototyping.

Funding for the positions comes from two projects:

1. TalkBank (NSF/Knowledge and Distributed Intelligence)
   http://www.talkbank.org/

This is an interdisciplinary research project to foster
fundamental research in the study of human and animal communication,
by providing standards and tools for creating, searching, and
publishing primary materials via networked computers.

2. Data Provenance (NSF/Digital Libraries)
   http://db.cis.upenn.edu/Research/provenance.html

This project aims to develop new data models, query languages, and
storage techniques that allow information about the origin of a piece
of data to be stored and propagated as data items move through a
series of curated databases.

Further information about the projects is available from the project
websites listed above.  Anyone interested in these positions is invited
to contact Steven Bird <sb at ldc.upenn.edu> and Peter Buneman
<peter at cis.upenn.edu>.

--

Linguistic Data Consortium    -    http://www.ldc.upenn.edu/
U Penn Database Group         -    http://db.cis.upenn.edu/