13.62, Disc: Phonetic Frequencies & "Corpus Phonetics"

LINGUIST List linguist at linguistlist.org
Fri Jan 11 19:20:17 UTC 2002


LINGUIST List:  Vol-13-62. Fri Jan 11 2002. ISSN: 1068-4875.

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Editors (linguist at linguistlist.org):
	Karen Milligan, WSU 		Naomi Ogasawara, EMU
	Jody Huellmantel, WSU		James Yuells, WSU
	Michael Appleby, EMU		Marie Klopfenstein, WSU
	Ljuba Veselinova, Stockholm U.	Heather Taylor-Loring, EMU
	Dina Kapetangianni, EMU		Richard Harvey, EMU
	Karolina Owczarzak, EMU		Renee Galvis, WSU

Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
          Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.



Editor for this issue: Karen Milligan <karen at linguistlist.org>

=================================Directory=================================

1)
Date:  Fri, 11 Jan 2002 10:02:55 +0000
From:  "Mark Jones" <markjjones at hotmail.com>
Subject:  Re: 13.50, Disc: New: Phonetic Frequencies & "Corpus Phonetics"

-------------------------------- Message 1 -------------------------------

Date:  Fri, 11 Jan 2002 10:02:55 +0000
From:  "Mark Jones" <markjjones at hotmail.com>
Subject:  Re: 13.50, Disc: New: Phonetic Frequencies & "Corpus Phonetics"

Whilst it cannot be doubted that this is an interesting and laudable
idea, there are problems inherent in a corpus approach to
phonetics/phonology (the distinction is unclear in the original
post). Something like this is needed - books like Maddieson's Patterns
of Sounds (CUP 1984) form the basis of much interesting work in
phonological universals, and many interesting phonetic sketches of
languages have been produced which occasionally make it into journals
such as JIPA.

However, ideally phonetic analysis takes repetitions of a single
variable from several speakers under strictly controlled conditions,
and reading a long connected text may not produce enough data for
analysis or controlled enough conditions. For example, the vowel /a/
may occur twenty times, but in different segmental, prosodic,
intonational and positional contexts, all of which can affect factors
such as duration and formant frequencies. And if two speakers of the
same language read the same long text, there may be variations in
rhythm and pausing which are not apparent in shorter sentences, such
as the ones normally used in phonetic analysis. So the phonetic
utility of such texts is doubtful.
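
The sparsity problem can be put in rough numbers. What follows is only
a sketch: the factor names come from the paragraph above, but the
number of levels assigned to each factor is invented for illustration
and is not taken from any corpus or study.

```python
from math import prod


def n_cells(levels_per_factor):
    """Number of distinct context combinations, assuming fully crossed factors."""
    return prod(levels_per_factor.values())


def mean_tokens_per_cell(n_tokens, levels_per_factor):
    """Average number of tokens available per context combination."""
    return n_tokens / n_cells(levels_per_factor)


# Hypothetical level counts for each factor (assumed, not measured).
levels = {
    "segmental": 5,
    "prosodic": 2,
    "intonational": 3,
    "positional": 3,
}

# Twenty tokens of /a/, as in the example above.
print(n_cells(levels))                    # 5 * 2 * 3 * 3 = 90 cells
print(mean_tokens_per_cell(20, levels))   # well under one token per cell
```

Even with these modest assumed level counts, twenty tokens leave most
context combinations empty, so no combination is repeated often enough
for the kind of controlled comparison described above.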

In phonological terms, not only does a text not provide the necessary
data for deciding which oppositions are contrastive, it may not give
examples of all phonemes for a language. So for Amharic, the ejective
/p'/ may not occur, and in English /T/ [theta] may not crop up. The
'marginal' nature of such phonemes is not uninteresting, but larger
patterns may reflect historical accidents. In English, for example,
very few words with a long vowel + final /b/ or /g/ occur,
e.g. league, barb (for non-rhotic speakers like me), although final
/d/ is fairly common. Recent coinings like Beeb for BBC show that there is no
phonological constraint on such words occurring, but they don't crop
up as regularly as their counterparts with voiceless codas (e.g. beat,
meat, seep, sheep, soup, seek, park) for whatever historical
reasons. A random text may not contain any such words with a voiced
plosive, and so lead one to conclude, wrongly, that English, like
German, does not allow phonologically voiced plosives in coda position.
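
The coverage problem can be sketched in the same spirit. The sample
below is just the voiceless-coda words listed above, in a rough ASCII
transcription of my own ("T" for theta, "S" for esh, ":" for length);
the inventory is a deliberately partial list of plosives plus /T/, and
the function names are hypothetical helpers, not part of any existing
tool.

```python
VOICED_PLOSIVES = {"b", "d", "g"}


def missing_phonemes(inventory, words):
    """Return the inventory phonemes that never occur in the sample."""
    attested = {segment for word in words for segment in word}
    return set(inventory) - attested


def voiced_plosive_codas(words):
    """Return the words whose final segment is a voiced plosive."""
    return [word for word in words if word and word[-1] in VOICED_PLOSIVES]


# beat, meat, seep, sheep, soup, seek, park (non-rhotic), one list per word.
sample = [
    ["b", "i:", "t"],
    ["m", "i:", "t"],
    ["s", "i:", "p"],
    ["S", "i:", "p"],
    ["s", "u:", "p"],
    ["s", "i:", "k"],
    ["p", "A:", "k"],
]

# Partial inventory, chosen only for illustration.
partial_inventory = {"p", "b", "t", "d", "k", "g", "T"}

print(sorted(missing_phonemes(partial_inventory, sample)))
print(voiced_plosive_codas(sample))

# Adding a single word such as "league" changes the picture.
print(voiced_plosive_codas(sample + [["l", "i:", "g"]]))
```

On this sample /d/, /g/ and /T/ go unattested and no word ends in a
voiced plosive, yet one extra word is enough to overturn the second
finding. Absence from a text is therefore weak evidence about the
phonology.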

So I think a corpus of this kind should not be built from connected
texts, but from data gathered by more traditional phonetic and
phonological methods. A purely text-based approach also has the
drawback that unwritten languages cannot be represented.

I look forward to reading what other List users think.

Mark Jones


---------------------------------------------------------------------------


