<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=koi8-r">

<META content="MSHTML 6.00.2900.3157" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT face=Arial size=2>Dear Corpora colleagues, as a first step we take national languages. 

English is therefore taken in its Queen's English variant. 

Unfortunately, our group of students is too small to cover the varieties of 

English. Moreover, it is often hard to decide whether a variety is a language or a dialect. 

For instance, there are four dialects of Mansi (Vogul), but in fact they are 

different languages, since their native speakers do not understand each other. 

Thus, we take only the Northern dialect of Mansi because we have no time to 

embrace all Mansi dialects (languages?). At the same time we take Russian, 

Belorussian and Ukrainian as separate languages, though their sound pictures are 

quite close and communication between their speakers is possible. However, the real problem is that 

there are no phonetic corpora of Mansi, Hanty, Ket, Sel'kup, 

Karelian, Hakas, Turkish, Azeri, Russian, Ukrainian, Belorussian and 

other languages of the world. This is why we had to transcribe the texts ourselves 

by hand. In future, however, it is advisable to set up phonetic corpora of 

every dialect or variety of a language, first of all English, for learning 

purposes as well. Thank you for your questions concerning our 

project. I remain yours most sincerely, Yuri Tambovtsev</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV></BODY></HTML>