18.1716, Diss: Computational Ling/Text&Corpus Ling: Pado: 'Cross-Lingual Ann...'

LINGUIST Network linguist at LINGUISTLIST.ORG
Tue Jun 5 20:31:56 UTC 2007


LINGUIST List: Vol-18-1716. Tue Jun 05 2007. ISSN: 1068 - 4875.

Subject: 18.1716, Diss: Computational Ling/Text&Corpus Ling: Pado: 'Cross-Lingual Ann...'

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Laura Welcher, Rosetta Project  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Hunter Lockwood <hunter at linguistlist.org>
================================================================  


===========================Directory==============================  

1)
Date: 05-Jun-2007
From: Sebastian Pado < pado at coli.uni-saarland.de >
Subject: Cross-Lingual Annotation Projection Models for Role-Semantic Information

 

	
-------------------------Message 1 ---------------------------------- 
Date: Tue, 05 Jun 2007 16:30:04
From: Sebastian Pado < pado at coli.uni-saarland.de >
Subject: Cross-Lingual Annotation Projection Models for Role-Semantic Information 
 


Institution: Saarland University 
Program: Department of Computational Linguistics and Phonetics 
Dissertation Status: Completed 
Degree Date: 2007 

Author: Sebastian Pado

Dissertation Title: Cross-Lingual Annotation Projection Models for
Role-Semantic Information 

Dissertation URL:  http://www.coli.uni-saarland.de/~pado/pub/papers/phd.shtml

Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics


Dissertation Director(s):
Mirella Lapata
Manfred Pinkal

Dissertation Abstract:

Due to the high cost of manual annotation, resources with role-semantic
annotation exist only for a small number of languages, notably English.
This thesis addresses the resulting resource scarcity problem by 
developing methods to induce role-semantic annotation for new languages
automatically.

We address the induction task with annotation projection, a general
procedure to exchange linguistic information between aligned sentences in a
parallel corpus. Annotation projection is a knowledge-lean approach, and
thus applicable even to resource-poor languages. We evaluate our approach
by using FrameNet, a large English resource for frame semantics, to induce
frame-semantic annotation for two target languages, German and French.

We project semantic classes and roles in two separate steps, since the two
tasks have different profiles. The projection of semantic classes can be
realised simply by using correspondences between predicates, which
are usually single words. Translational shifts, i.e., translations which
change the semantic class (frame) of the original predicate, can be
filtered out with knowledge-lean filtering mechanisms that rely on
distributional properties. 
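
As an illustration only (not taken from the thesis), the projection of
semantic classes along predicate correspondences might be sketched as
follows. The function name, the data layout, and the restriction to
predicates with exactly one aligned target token are assumptions made for
this sketch; the distributional filtering of translational shifts is not
modelled here.

```python
def project_frames(source_frames, alignment):
    """Copy frame labels from source predicates to aligned target tokens.

    source_frames: dict mapping a source token index to a frame label.
    alignment: set of (source_index, target_index) word-alignment links.
    Assumption for this sketch: a label is projected only when the source
    predicate is aligned to exactly one target token.
    """
    projected = {}
    for src_idx, frame in source_frames.items():
        targets = [t for s, t in alignment if s == src_idx]
        if len(targets) == 1:
            projected[targets[0]] = frame
    return projected


# Hypothetical toy input: source token 1 evokes the frame "Statement".
toy_alignment = {(0, 0), (1, 2), (3, 1)}
toy_projection = project_frames({1: "Statement"}, toy_alignment)
```

Unaligned predicates and predicates aligned to several target tokens are
simply skipped in this sketch, which trades coverage for precision.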

In contrast, the projection of semantic roles relies mainly on clean
correspondences between sentential constituents (i.e., role-bearing
phrases). We show that such correspondences can be obtained by formalising
the task as a graph matching problem that integrates knowledge about
syntactic bracketings. The resulting correspondences show high precision
even for noisy input data from automatic shallow semantic parsing.
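
As a rough illustration of the graph matching view (a sketch under stated
assumptions, not the model developed in the thesis), constituents of the
two sentences can be treated as the two sides of a bipartite graph whose
edge weights reflect word-alignment overlap. The overlap score, the
exhaustive search, and all names below are assumptions chosen for brevity;
the search is only practical for a handful of constituents.

```python
from itertools import permutations


def overlap(src_span, tgt_span, alignment):
    """Word-alignment links between two constituents, length-normalised."""
    links = sum(1 for s, t in alignment if s in src_span and t in tgt_span)
    return links / (len(src_span) + len(tgt_span))


def best_matching(src_spans, tgt_spans, alignment):
    """Return the one-to-one matching with the highest total overlap.

    Constituents are given as sets of token indices taken from the
    syntactic bracketing of each sentence. Assumes that there are at most
    as many source constituents as target constituents.
    """
    best_score, best = -1.0, ()
    for perm in permutations(range(len(tgt_spans)), len(src_spans)):
        score = sum(
            overlap(src_spans[i], tgt_spans[j], alignment)
            for i, j in enumerate(perm)
        )
        if score > best_score:
            best_score, best = score, perm
    return dict(enumerate(best))


# Hypothetical toy sentence pair: two source and three target constituents.
src_spans = [frozenset({0, 1}), frozenset({3, 4})]
tgt_spans = [frozenset({0}), frozenset({1, 2}), frozenset({3, 4})]
links = {(0, 1), (1, 2), (2, 0), (3, 3), (4, 4)}
matching = best_matching(src_spans, tgt_spans, links)
```

Under this sketch, a role label attached to source constituent i would be
copied to target constituent matching[i].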

In sum, the results of this thesis indicate that the semantic
generalisations made by frame semantics carry over to a considerable degree
from English to other languages, not only at the type level but also at the
token level. The projection methods we have developed can be applied to
create frame-semantic resources for new languages robustly and automatically.




