[Corpora-List] we take national languages at the first step
Tambovtsev: Yuri, Alina and Yuliana
yutamb at mail.ru
Sat May 10 18:56:21 UTC 2008
Dear Corpora colleagues,

As a first step, we take national languages. So English is taken in its Queen's English variant. Unfortunately, our group of students is too small to cover the varieties of English. Moreover, it is often hard to decide whether a given variety is a language or a dialect. For instance, there are 4 dialects in Mansi (Vogul), but in fact they are different languages, since their native speakers do not understand each other. Thus, we take only the Northern dialect of Mansi, because we have no time to cover all Mansi dialects (languages?). At the same time, we take Russian, Belorussian and Ukrainian as separate languages, though their sound pictures are quite close and communication between their speakers is possible.

However, the real problem is that there are no phonetic corpora of Mansi, Hanty, Ket, Sel'kup, Karelian, Hakas, Turkish, Azeri, Russian, Ukrainian, Belorussian and other languages of the world. This is why we had to transcribe the texts ourselves by hand. In the future, however, it would be advisable to set up phonetic corpora of every dialect or variety of a language, first of all English, for language-learning purposes as well.

Thank you for your questions concerning our project.

I remain yours most sincerely,
Yuri Tambovtsev