[Corpora-List] Funded PhD Position: Semantic Fields in Lexicography, Computer Science and Cognitive Science

Marco Baroni marco.baroni at unitn.it
Thu Jun 21 20:20:51 UTC 2007


The Language, Interaction and Computation group (CLIC) of the Center
for Mind/Brain Sciences (CIMeC) at the University of Trento, together
with the Institute for Specialised Communication and Multilingualism
of the European Academy (Bolzano/Bozen), announces the availability of
a joint funded studentship in the 3-year Cognitive and Brain Sciences
doctoral program of the University of Trento.

The winning candidate will pursue research on:

Semantic Fields in Lexicography, Computer Science and Cognitive Science

Details on the project are attached below.

For additional information about the CLIC group please visit:

http://www.cimec.unitn.it/clic

For additional information about the Institute for Specialised
Communication and Multilingualism visit:

http://www.eurac.edu/Org/LanguageLaw/Multilingualism/index.htm

Information about the program (including application details) is
constantly updated at:

http://portale.unitn.it/drcobras/


For further information about this project, please write to
marco.baroni at unitn.it

For information about the PhD program in general, contact
phd.cimec at unitn.it


*************************************
*************************************

SEMANTIC FIELDS IN LEXICOGRAPHY, COMPUTER SCIENCE
AND COGNITIVE SCIENCE:
Project Description


ELDIT is a computer-assisted language learning system developed at the
European Academy of Bolzano (Abel and Weber 2000, Gamper and Knapp
2003, Abel et al. 2004, Knapp 2004). The core of the ELDIT system is a
"cross-lingual" learner's dictionary for Italian and German, freely
accessible from:

http://www.eurac.edu/eldit

Among other innovative features, this dictionary offers the
possibility to explore the semantic neighborhood of a word through its
"word field", i.e., a set of closely related words, such as hyponyms,
co-hyponyms, words in a frame relation, etc. (for the theoretical
aspects of word fields see Trier 1934 and Coseriu 1964).

The relations that define word fields in ELDIT have been
chosen on theoretical lexico-semantic grounds, rather than being based
on experimental data determining which relations are more salient for
native speakers and learners. Moreover, word field input in ELDIT has
been performed manually by lexicographers, resulting in a rather small
set of entries.

The current project will tackle both issues. In the first phase, the
candidate will review the cognitive literature on concepts and concept
relations (e.g., Murphy 2002, Garrard et al. 2001), as well as the
linguistic literature about word fields (such as the references
mentioned above). This will result in the design of behavioural
experiments aimed at determining salient semantic relations for native
and non-native speakers of the target languages (Italian and German).
The results of these experiments will lead to a cognitively motivated
revision of the list of relations encoded in the ELDIT word fields.

In the second phase of the project, the candidate will adapt
computational methods for extracting relations and related pairs from
large textual databases and other sources (see, e.g., Almuhareb and
Poesio 2004, Pantel and Pennacchiotti 2006) to the task of
automatically harvesting words in the same field and their relations
(targeting the cognitively justified relations identified in the first
phase). The harvested word fields will then be used to enrich ELDIT.

Throughout the project, the candidate will collaborate closely with
members of the Center for Mind/Brain Sciences of the University of
Trento, where active neuroscientific, behavioural and computational
research on related areas is currently being conducted, and with
lexicographers and computer scientists at the European Academy in
Bolzano, who will monitor the effectiveness of the project output in
terms of its pedagogical lexicography application in ELDIT.

The ideal candidate will have a strong interest in lexical semantics
from an interdisciplinary perspective, straddling linguistics,
lexicography, cognitive science and computer science. She/he will
have the opportunity to work on an exciting project that promises to
produce both long-ranging theoretical results and a very concrete
practical output of great use to language learners.


References
==========

Abel, A., S. Campogianni and J. Reichert. 2004. Wortfelder in einem
zweisprachigen elektronischen Lernerwörterbuch: Darstellung der
paradigmatischen Bedeutungsbeziehungen in der pädagogischen
Lexikographie am Beispiel von ELDIT. Proceedings of EURALEX 2004,
437-442.

Abel, A. and V. Weber. 2000. ELDIT: A Prototype of an innovative
dictionary. Proceedings of EURALEX 2000, 807-818.

Almuhareb, A. and M. Poesio. 2004. Attribute-based and value-based
clustering: an evaluation. Proceedings of EMNLP.

Coseriu, E. 1964. Pour une sémantique diachronique
structurale. Travaux de Linguistique et de Littérature 2, 139-186.

Gamper, J. and J. Knapp. 2003. A data model and its implementation
for a Web-based language learning system. In Proceedings of WWW2003,
217-225.

Garrard, P., M. Lambon Ralph, J. Hodges and
K. Patterson. 2001. Prototypicality, distinctiveness and
intercorrelation: Analyses of the semantic attributes of living and
nonliving concepts. Cognitive Neuropsychology 18, 125-174.

Knapp, J. 2004. A new approach to CALL content
authoring. Ph.D. thesis, University of Hannover.

Murphy, G. 2002. The big book of concepts. Cambridge (MA): MIT Press.

Pantel, P. and M. Pennacchiotti. 2006. Espresso: Leveraging generic
patterns for automatically harvesting semantic relations. Proceedings of
COLING/ACL-06, 113-120.

Trier, J. 1934. Das Sprachliche Feld: Eine Auseinandersetzung. Neue
Fachbücher für Wissenschaft und Jugendbildung 10, 428-449.


