13.967, Diss: Computational Ling: Skoumalova "Czech..."

Mon Apr 8 18:39:23 UTC 2002

LINGUIST List:  Vol-13-967. Mon Apr 8 2002. ISSN: 1068-4875.

Editor for this issue: Karolina Owczarzak <karolina at linguistlist.org>

=================================Directory=================================

1)
Date:  Mon, 08 Apr 2002 12:33:49 +0000
From:  Hana.Skoumalova at ff.cuni.cz
Subject:  Computational Ling: Skoumalova "Czech syntactic lexicon"

-------------------------------- Message 1 -------------------------------

New Dissertation Abstract

Institution: Charles University
Program: Institute of Theoretical and Computational Linguistics
Dissertation Status: Completed
Degree Date: 2001

Author: Hana Skoumalova
Dissertation Title:
Czech syntactic lexicon

Dissertation URL: http://utkl.ff.cuni.cz/~skoumal/dissertation

Linguistic Field: Syntax, Lexicography, Computational Linguistics

Dissertation Director 1: Jarmila Panevova

Dissertation Abstract:

This work presents an electronic lexicon of Czech verbs. The lexicon
contains valency frames of approximately 15,000 Czech verbs, and its
purpose is to enrich the information contained in other electronic
dictionaries. The trend of recent years is to build large-scale
reusable resources which can be combined with other resources. This
work shows how the lexicon works together with an existing
morphological lexicon and how it can be used in various NLP systems.

Chapter 2 discusses several theoretical approaches in comparison with
Functional Generative Description (FGD), which is used for the
dictionary. The exposition concentrates especially on the structure
of the lexicon in the individual theories. A lexicon usually conforms
to certain preconditions resulting from the use of a given theoretical
framework, and so the possibility of creating a lexicon which would be
transferable to another theoretical framework is explored.

Chapter 3 discusses the possibility of using existing sources, with
respect to the desired result and the theoretical framework adopted
for the work. Several Czech syntactic lexicons have been created in
the past, but unfortunately their reuse would be rather difficult.
This chapter mentions several such attempts and describes in detail
the lexicon which is used in this work.

Chapter 4 describes the verb frame. First, the format of the lexical
entry is described; then various types of reflexive constructions in
Czech and their encoding in the lexicon are discussed. The next
section shows possible diatheses of the basic (active) frame and
discusses which of these diatheses can be added to the dictionary on
a regular basis and which have to be treated as exceptions. The last
section describes so-called equi and raising verbs.

Chapter 5 shows the procedure for automatic conversion of the source
dictionary to the proposed format. For this conversion, an algorithm
was created which assigns functors (semantic roles) to the individual
members of a frame. The output of this procedure will serve as input
for a human editor. The chapter discusses how much of the source data
can be completed by this procedure and how much needs post-editing.
It is also shown how the resulting lexicon can be used in NLP systems.
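
The following minimal sketch illustrates the general idea of assigning
functors to frame members automatically and flagging what is left for
post-editing. It is not the algorithm from the dissertation. The Slot
class, the case-to-functor table and both function names are
hypothetical and chosen only for illustration; the functor labels ACT,
PAT and ADDR are the usual FGD abbreviations for actor, patient and
addressee.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical default mapping from morphological case to FGD functor.
# A real conversion would need many more rules and lexical exceptions.
DEFAULT_FUNCTOR = {"nom": "ACT", "acc": "PAT", "dat": "ADDR"}


@dataclass
class Slot:
    """One member of a valency frame, identified here only by its case."""
    case: str
    functor: Optional[str] = None


def assign_functors(slots: List[Slot]) -> List[Slot]:
    """Give each slot the default functor for its case, at most once per frame."""
    used = set()
    for slot in slots:
        candidate = DEFAULT_FUNCTOR.get(slot.case)
        if candidate is not None and candidate not in used:
            slot.functor = candidate
            used.add(candidate)
    return slots


def needs_post_editing(slots: List[Slot]) -> bool:
    """A frame needs manual work if any slot is still without a functor."""
    return any(slot.functor is None for slot in slots)


# Illustrative frame with nominative, dative and accusative members,
# as for the verb "dát" (to give).
frame = assign_functors([Slot("nom"), Slot("dat"), Slot("acc")])
print([slot.functor for slot in frame], needs_post_editing(frame))

# A frame with an instrumental member is not covered by the table above,
# so it is left for the human editor.
partial = assign_functors([Slot("nom"), Slot("ins")])
print([slot.functor for slot in partial], needs_post_editing(partial))
```

Counting the frames for which needs_post_editing returns False gives
a rough measure of how much of the source data a rule set of this kind
completes on its own.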

Chapter 6 sums up. In Section 6.1, verbs are sorted into groups
according to their frames, and the results are compared with the
results of other researchers. Section 6.2 discusses the prospects of
language processing based on symbolic methods, along with the possible
use of the lexicon in corpus linguistics.
