The most and the least typical Romance language
Timo Honkela
timo.honkela at tkk.fi
Mon Apr 26 08:14:05 UTC 2010
Our article "Complexity of European Union Languages: A comparative
approach" in the Journal of Quantitative Linguistics may be of some
interest to you. The abstract is below; the paper is available at
http://www.informaworld.com/smpp/content~content=a792160699. I can
also send the paper by e-mail upon request.
Best regards,
Timo Honkela (timo.honkela at tkk.fi)
- -
Abstract
In this article, we are studying the differences between the European
Union languages using statistical and unsupervised methods. The
analysis is conducted in the different levels of language: the
lexical, morphological and syntactic. Our premise is that the
difficulty of the translation could be perceived as differences or
similarities in different levels of language. The results are compared
to linguistic groupings. Two approaches are selected for the analysis.
A Kolmogorov complexity-based approach is used to compare the language
structure in syntactic and morphological levels. A morpheme-level
comparison is conducted based on an automated segmentation of the
languages into morpheme-like units. The way the languages convey
information in these levels is taken as a measure of similarity or
dissimilarity between languages and the results are compared to
classical linguistic classifications. The results have a significant
impact on the design of (statistical) machine translation systems. If
the source language conveys information in the morphological level and
the target language in the syntactic level, it is clear that the
machine translation system must be able to transfer the information
from one level to another.
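
As a rough illustration of the general idea behind a Kolmogorov complexity-based comparison (not the procedure used in the article, whose compressor and corpus are not described in this message), the sketch below approximates complexity by compressed length and computes a normalized compression distance between two texts. The function names `compressed_size` and `ncd`, the choice of zlib, and the toy strings are all assumptions made for the example.

```python
import zlib


def compressed_size(text: str) -> int:
    """Approximate the complexity of a text by its zlib-compressed length in bytes."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: near 0 for similar texts, near 1 for unrelated ones."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # how well one text helps to compress the other
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Toy strings only; a real comparison would need large parallel corpora.
    text_a = "the quick brown fox jumps over the lazy dog " * 20
    text_b = "kaksi pient\u00e4 lintua lauloi korkealla puussa " * 20
    print("a vs a:", round(ncd(text_a, text_a), 3))
    print("a vs b:", round(ncd(text_a, text_b), 3))
```

With a general-purpose compressor the values are only approximations, and on short inputs the fixed compression overhead dominates, so the numbers are meaningful only in relative terms.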
On Mon, 15 Mar 2010, Yuri Tambovtsev wrote:
> The most and the least typical Romance language. We have computed six Romance languages to measure the phono-typological distances between them. It is possible to find the Romance language which has the shortest distance to all these Romance languages. It is Moldavian. The ordered series of the phono-typological distances to the centre of the Romance languages:
> 17.30 - Moldavian
> 20.24 - Rumanian
> 20.54 - Italian
> 21.73 - Spanish
> 30.27 - Portuguese
> 51.17 - French
> The least typical Romance language is French. What ideas have you got to share with me about the most and the least typical Romance language from the phono-typological point of view? Looking forward to hearing about you to yutamb at mail.ru Yours sincerely Yuri Tambovtsev, Novosibirsk, Russia.
--
Timo Honkela, Chief Research Scientist, PhD, Docent
Adaptive Informatics Research Center
Aalto University School of Science and Technology
P.O.Box 5400, FI-02015 TKK, Finland
timo.honkela at tkk.fi, http://www.cis.hut.fi/tho/