The most and the least typical Romance language

Timo Honkela timo.honkela at tkk.fi
Mon Apr 26 08:14:05 UTC 2010


Our article "Complexity of European Union Languages: A comparative 
approach" in the Journal of Quantitative Linguistics may be of some 
interest to you. The abstract is below, and the paper is available at
http://www.informaworld.com/smpp/content~content=a792160699. I can 
also send the paper by e-mail upon request.

Best regards,
Timo Honkela (timo.honkela at tkk.fi)

- -

Abstract

In this article, we are studying the differences between the European 
Union languages using statistical and unsupervised methods. The 
analysis is conducted in the different levels of language: the 
lexical, morphological and syntactic. Our premise is that the 
difficulty of the translation could be perceived as differences or 
similarities in different levels of language. The results are compared 
to linguistic groupings. Two approaches are selected for the analysis. 
A Kolmogorov complexity-based approach is used to compare the language 
structure in syntactic and morphological levels. A morpheme-level 
comparison is conducted based on an automated segmentation of the 
languages into morpheme-like units. The way the languages convey 
information in these levels is taken as a measure of similarity or 
dissimilarity between languages and the results are compared to 
classical linguistic classifications. The results have a significant 
impact on the design of (statistical) machine translation systems. If 
the source language conveys information in the morphological level and 
the target language in the syntactic level, it is clear that the 
machine translation system must be able to transfer the information 
from one level to another.


On Mon, 15 Mar 2010, Yuri Tambovtsev wrote:

> The most and the least typical Romance language. We have computed the phono-typological distances between six Romance languages. It is possible to find the Romance language which has the shortest distance to all these Romance languages. It is Moldavian. The ordered series of the phono-typological distances to the centre of the Romance languages:
> 17.30 - Moldavian
> 20.24 - Rumanian
> 20.54 - Italian
> 21.73 - Spanish
> 30.27 - Portuguese
> 51.17 - French
> The least typical Romance language is French. What ideas have you got to share with me about the most and the least typical Romance language from the phono-typological point of view? Looking forward to hearing from you (yutamb at mail.ru). Yours sincerely, Yuri Tambovtsev, Novosibirsk, Russia.


--
Timo Honkela, Chief Research Scientist, PhD, Docent
Adaptive Informatics Research Center
Aalto University School of Science and Technology
P.O.Box 5400, FI-02015 TKK, Finland

timo.honkela at tkk.fi,  http://www.cis.hut.fi/tho/


