19.3617, Diss: Comp Ling/Morphology/Syntax: Chrupała: 'Towards a Machine-...'

LINGUIST Network linguist at LINGUISTLIST.ORG
Tue Nov 25 17:23:22 UTC 2008


LINGUIST List: Vol-19-3617. Tue Nov 25 2008. ISSN: 1068 - 4875.

Subject: 19.3617, Diss: Comp Ling/Morphology/Syntax: Chrupała: 'Towards a Machine-...'

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Randall Eggert, U of Utah  
         <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Evelyn Richter <evelyn at linguistlist.org>
================================================================  

===========================Directory==============================  

1)
Date: 25-Nov-2008
From: Grzegorz Chrupała <gchrupala at lsv.uni-saarland.de>
Subject: Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing

 

	
-------------------------Message 1 ---------------------------------- 
Date: Tue, 25 Nov 2008 12:13:16
From: Grzegorz Chrupała [gchrupala at lsv.uni-saarland.de]
Subject: Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing

Institution: Dublin City University 
Program: PhD in Computer Science 
Dissertation Status: Completed 
Degree Date: 2008 

Author: Grzegorz Chrupała

Dissertation Title: Towards a Machine-Learning Architecture for Lexical
Functional Grammar Parsing 

Linguistic Field(s): Computational Linguistics
                     Morphology
                     Syntax


Dissertation Director(s):
Josef van Genabith

Dissertation Abstract:

Data-driven grammar induction aims to produce wide-coverage grammars of
human languages. Initial efforts in this field produced relatively shallow
linguistic representations such as phrase-structure trees, which encode
only constituent structure. Recent work on inducing deep grammars from
treebanks addresses this shortcoming by also recovering non-local
dependencies and grammatical relations. My aim is to investigate the issues
that arise when adapting an existing Lexical Functional Grammar (LFG)
induction method to a new language and treebank, and to find solutions that
generalize robustly across multiple languages.

The research hypothesis is that, by exploiting machine-learning algorithms
to learn morphological features, lemmatization classes and grammatical
functions from treebanks, we can reduce the amount of manual specification
and improve the robustness, accuracy, and domain and language independence
of LFG parsing systems.

Function labels can often be mapped fairly directly to LFG grammatical
functions. Learning them reliably lets grammar induction depend less on
language-specific LFG annotation rules. I therefore propose ways to improve
the acquisition of function labels from treebanks and to translate those
improvements into better-quality f-structure parsing.
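
As a rough illustration of the mapping described above, the sketch below
splits a treebank node label into its category and function label and looks
up a grammatical function. The table LABEL_TO_GF, the label names in it, and
the helper to_grammatical_function are hypothetical and chosen only for
illustration; the abstract does not list the actual mapping or the treebank
label set.

```python
from typing import Optional, Tuple

# Hypothetical table for illustration only; not taken from the dissertation.
LABEL_TO_GF = {
    "SBJ": "SUBJ",     # subject
    "OBJ": "OBJ",      # direct object
    "ADV": "ADJUNCT",  # adverbial modifier
}


def to_grammatical_function(node_label: str) -> Tuple[str, Optional[str]]:
    """Split a label such as 'NP-SBJ' and look up its grammatical function.

    Returns (category, grammatical_function); the second item is None when
    the node has no function label or the label is not in the table.
    """
    category, _, function_label = node_label.partition("-")
    return category, LABEL_TO_GF.get(function_label)


print(to_grammatical_function("NP-SBJ"))
print(to_grammatical_function("VP"))
```

A lookup like this only covers the easy cases. Predicting the function
labels themselves is the part the abstract proposes to learn from data.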

In a lexicalized grammatical formalism such as LFG, a large amount of
syntactically relevant information comes from lexical entries. It is
therefore important to perform morphological analysis accurately and
robustly for morphologically rich languages. I propose a fully data-driven
supervised method that lemmatizes and morphologically analyzes text
simultaneously, and I obtain competitive or improved results on a range of
typologically diverse languages.
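
One way to picture a "lemmatization class" is as a rewrite rule that turns a
word form into its lemma, so that lemmatization becomes a classification
problem over such rules. The sketch below assumes a simple suffix rule
(number of characters to strip, string to append). This encoding and the
names lemma_class and apply_class are assumptions made for illustration;
the abstract does not say how the classes are defined.

```python
import os
from typing import Tuple

LemmaClass = Tuple[int, str]  # (characters to strip, suffix to append)


def lemma_class(form: str, lemma: str) -> LemmaClass:
    """Derive a suffix rewrite rule from a (form, lemma) training pair."""
    shared = len(os.path.commonprefix([form, lemma]))
    return (len(form) - shared, lemma[shared:])


def apply_class(form: str, cls: LemmaClass) -> str:
    """Apply a suffix rewrite rule to an unseen word form."""
    strip, suffix = cls
    return form[: len(form) - strip] + suffix


# A rule learned from one pair carries over to forms with the same pattern.
rule = lemma_class("cities", "city")
print(rule)
print(apply_class("parties", rule))
```

Under this assumption, a classifier trained on treebank tokens would predict
one such class per word, and the lemma would be obtained by applying the
predicted rule. Suffix rules would not cover prefixing or word-internal
changes, which a more general edit-based encoding would need to handle.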
