21.221, Diss: Comp Ling: Junczys-Dowmunt: 'German Compound Nouns and their...'
linguist at LINGUISTLIST.ORG
Thu Jan 14 15:51:23 UTC 2010
LINGUIST List: Vol-21-221. Thu Jan 14 2010. ISSN: 1068 - 4875.
Subject: 21.221, Diss: Comp Ling: Junczys-Dowmunt: 'German Compound Nouns and their...'
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Monica Macaulay, U of Wisconsin-Madison
Eric Raimy, U of Wisconsin-Madison
Joseph Salmons, U of Wisconsin-Madison
Anja Wanner, U of Wisconsin-Madison
<reviews at linguistlist.org>
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.
Editor for this issue: Di Wdzenczny <di at linguistlist.org>
================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
===========================Directory==============================
1)
Date: 11-Jan-2010
From: Marcin Junczys-Dowmunt < junczys at amu.edu.pl >
Subject: German Compound Nouns and their Polish Equivalents: Automatic extraction, analysis and verification based on parallel corpora
-------------------------Message 1 ----------------------------------
Date: Thu, 14 Jan 2010 10:49:33
From: Marcin Junczys-Dowmunt [junczys at amu.edu.pl]
Subject: German Compound Nouns and their Polish Equivalents: Automatic extraction, analysis and verification based on parallel corpora
Institution: Adam Mickiewicz University
Program: Linguistics Program
Dissertation Status: Completed
Degree Date: 2009
Author: Marcin Junczys-Dowmunt
Dissertation Title: German Compound Nouns and their Polish Equivalents:
Automatic extraction, analysis and verification based on
parallel corpora
Dissertation URL: http://www.staff.amu.edu.pl/~junczys/index.php?title=Publications
Linguistic Field(s): Computational Linguistics
Subject Language(s): German, Standard (deu)
Polish (pol)
Dissertation Director(s):
Krzysztof Jassem
Jerzy Pogonowski
Dissertation Abstract:
We apply methods first used for statistical machine translation to the
automatic extraction and analysis of German compound nouns and their Polish
equivalents.
A large German-Polish parallel corpus is used as the main source of data.
In the course of this work, several new applications are developed and
described. With the help of these applications, a set of more than 140,000
unique German compound nouns is created, for which we are able to identify
more than 200,000 unique Polish counterparts in the corpus.
From this data we extract several subsets of equivalence pairs that have
been filtered either automatically or semi-automatically. Additionally, one
manually verified subset of equivalence pairs is created. These data sets
serve as reference material for the verification of several claims from
other works in contrastive linguistics that based their results on much
smaller amounts of data. Apart from that, we supply information about the
quantitative distribution of German compound nouns and their Polish
equivalents, which has not been provided in any earlier work.
-----------------------------------------------------------
LINGUIST List: Vol-21-221