Corpora: Sound picture of known world languages

Van den Heuvel M, Mev MVDH at sun.ac.za
Fri Jan 4 13:10:09 UTC 2002


To add to Yuri's comment: from the perspective of automatic speech
recognition there is of course another use for calculating the frequency of
occurrence of particular phonemes in a language. When collecting speech data
for data-driven speech services, it is essential to cover the entire phonetic
inventory of the language, taking each phoneme's frequency of occurrence
into account. The frequency data should also cover biphones and triphones.
Our unit has been doing a lot of these calculations recently for the
collection of speech data for two Germanic (South African English and
Afrikaans) and three Bantu (Xhosa, Zulu and Sesotho) languages.
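
As a rough illustration of the counting described above, here is a minimal
Python sketch. It assumes the corpus is already available as phonemic
transcriptions, one list of phoneme symbols per utterance, and it treats
biphones and triphones simply as sequences of two and three adjacent
phonemes within an utterance. The function names and the toy data are
hypothetical and are not taken from any actual pipeline.

```python
from collections import Counter


def ngram_counts(transcriptions, n):
    """Count phoneme n-grams; n-grams never span an utterance boundary."""
    counts = Counter()
    for phones in transcriptions:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {unit: count / total for unit, count in counts.items()}


# Hypothetical toy data: each utterance is a list of phoneme symbols.
utterances = [
    ["k", "a", "t"],
    ["t", "a", "k"],
    ["a", "t", "a"],
]

phones = ngram_counts(utterances, 1)
biphones = ngram_counts(utterances, 2)
triphones = ngram_counts(utterances, 3)

# List single phonemes from most to least frequent.
for unit, count in phones.most_common():
    print(" ".join(unit), count)
```

Sorting the units by count, as in the final loop, gives one possible ordering
in which to check that a set of recording prompts covers the inventory.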

Maritza van den Heuvel

***

Research Unit for Experimental Phonology
Department of African Languages
Stellenbosch University
South Africa
---------------
Private Bag X1
Matieland
7602

Tel: ++27 21 808 3974
Fax: ++27 21 808 3975
Internet: www.ast.sun.ac.za <http://www.ast.sun.ac.za>

***

-----Original Message-----
From: Yuri Tambovtsev [mailto:yutamb at mail.cis.ru]
Sent: 25 December 2001 15:45
To: corpora at hd.uib.no
Subject: Corpora: Sound picture of known world languages


Dear colleagues, thank you to all who answered me. I'd like to answer the
question that appeared in all your messages: why is it important to compute
the phonemic frequencies of occurrence in a language? Every language has its
own unique sound picture. One can intuitively feel that language A is
different from language B on hearing the sound picture of each. The
phonemic frequencies of occurrence create the particular sound mosaic of a
language. We can compare world languages with each other once we obtain the
sound picture of every world language. Linguists now believe that there are
about 4000 or 5000 languages in the world. Unfortunately, I have been able
to collect only 120 sets of data on phonemic frequency of occurrence for
world languages. This is why I urge linguists worldwide to join our group
of phoneticians who investigate the sound picture of world languages.
I look forward to hearing from you soon at my email address:
yutamb at hotmail.com
I remain yours most hopefully,
Yuri Tambovtsev
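
To make the idea of comparing sound pictures concrete, here is a minimal
Python sketch. The message does not say which measure is used, so the choice
of total variation distance is purely an assumption for illustration, and
the frequency values are invented. The sketch also assumes that both
languages are transcribed with a shared symbol set, so that the same symbol
means the same sound in each profile.

```python
def profile_distance(freq_a, freq_b):
    """Total variation distance between two relative-frequency profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles that share
    no phonemes. Each argument maps a phoneme symbol to its relative
    frequency, with the values in each profile summing to 1.
    """
    symbols = set(freq_a) | set(freq_b)
    return 0.5 * sum(
        abs(freq_a.get(s, 0.0) - freq_b.get(s, 0.0)) for s in symbols
    )


# Hypothetical profiles, not real language data.
language_a = {"a": 0.5, "t": 0.3, "k": 0.2}
language_b = {"a": 0.4, "t": 0.2, "s": 0.4}

print(profile_distance(language_a, language_b))
```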


