[Corpora-List] Transcribed or transliterated texts

Yuri Tambovtsev yutamb at mail.ru
Mon Mar 8 12:36:33 UTC 2010


Dear Corpora members,

When someone asked for transcribed or transliterated texts, Corpora members pointed him to English texts in ordinary English orthography, like those in the Gutenberg collection. Maybe I don't understand English well enough, since I have been studying it for only 55 years, but I understood that the person has the same problem that I had, have, and am going to have: finding texts in transcription. One can call such a collection a Phonetic Corpus, as I do.

As far as I understand, there is no reliable programme for transcribing texts by computer. In any case, all our texts were transcribed by hand, and then checked and rechecked. Computers transcribe texts with many mistakes. We found this for English, Russian, German and many more of the 168 languages we put into our Phonetic Corpora.

It now looks as though our Corpora allow us to compare all the languages of the world fed into them with the help of some 9 phonetic features. Thus we could tell Russian from Belorussian, and from the rest of the Slavonic languages, based solely on its sound picture. In fact, the sound picture of a language is its Phonetic Corpus.

However, if Corpora members do know of transcribed corpora for any language, I for one would be very much interested. Please let me know by sending a message to yutamb at mail.ru

Be WELL. I remain yours most sincerely and phonetically,
Yuri Tambovtsev, Novosibirsk, Russia