23.4556, Review: Computational Linguistics; Phonetics: Ladefoged & Ferrari Disner (2012)
linguist at linguistlist.org
Wed Oct 31 18:28:32 UTC 2012
LINGUIST List: Vol-23-4556. Wed Oct 31 2012. ISSN: 1069 - 4875.
Subject: 23.4556, Review: Computational Linguistics; Phonetics: Ladefoged & Ferrari Disner (2012)
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Veronika Drake, U of Wisconsin Madison
Monica Macaulay, U of Wisconsin Madison
Rajiv Rao, U of Wisconsin Madison
Joseph Salmons, U of Wisconsin Madison
Anja Wanner, U of Wisconsin Madison
<reviews at linguistlist.org>
Homepage: http://linguistlist.org
Editor for this issue: Rajiv Rao <rajiv at linguistlist.org>
================================================================
Date: Wed, 31 Oct 2012 14:27:57
From: Seetha Jayaraman [seetha.jay at gmail.com]
Subject: Vowels and Consonants
Announced at http://linguistlist.org/issues/23/23-2477.html
AUTHOR: Peter Ladefoged and Sandra Ferrari Disner
TITLE: Vowels and Consonants
EDITION: Third
PUBLISHER: Wiley-Blackwell
YEAR: 2012
Seetha Jayaraman, Dhofar University, Sultanate of Oman
SUMMARY
This book, written by Peter Ladefoged and revised by Sandra Ferrari Disner,
contains sixteen chapters on topics ranging from the basics of speech sounds to
an advanced description of acoustic features and the role of computers in studying
acoustic components of speech. The chapters cover perspectives on speech
production and perception and give an overview of phonatory and articulatory
processes involved in the production of different categories of speech sounds,
viz., vowels and consonants. The last three chapters deal with articulatory
differences found in different languages around the world.
The volume provides an exhaustive list of illustrations of the sounds discussed in
each chapter. Audio recordings, photographs and videos of vocal tract
configurations are available on the website
www.linguistics.ucla.edu./faciliti/sales/software.htm, and a table lists the
audio recordings that support the volume.
A chapter-wise summary follows:
Chapter 1, "Sounds and Languages", begins with the definition of 'sound' and the
distinction between 'sound' and 'language'. It discusses how languages evolve and
disappear constantly with changes in the socioeconomic conditions of people and
their cultural practices. It reflects on the importance of individual sounds, different
aspects of language and speech, and the role they play in our life. The chapter
also describes speech sounds and sound symbols (i.e. International Phonetic
Alphabet) vis à vis their orthographic representations, with an introduction to the
basic components of speech, viz., pitch and loudness and their representations in
a waveform.
Chapter 2, "Pitch and Loudness", discusses 'tones' in terms of pitch and meaning
change associated with pitch, drawing upon examples from tone languages like
Chinese (Mandarin) and Cantonese. The fundamental concepts in understanding
pitch levels, pitch curves and intonation, with reference to the speaker, are
explained. The last section of this chapter outlines the importance of vocal folds,
their position in sound production and the influence of vocal fold vibration on
loudness, in general, and on English intonation, in particular.
The next three chapters (3, 4 and 5) present a description of vowel features, the
vowel chart, the vowel space, and acoustic characteristics that help identify
vowels in spectrograms with respect to the structure of the first three formants.
Chapter 3, "Vowel Contrasts", compares vowels across languages like Spanish,
Hawaiian, Swahili and Japanese in order to bring out differences in their usage:
some examples are 'masa' (dough) and 'mesa' (table) in Spanish; 'kaka' (to rinse)
and 'keka' (turnstone) in Hawaiian; 'pata' (hinge) and 'peta' (bend) in Swahili; and
'ma' (interval) and 'me' (eye) in Japanese. This chapter also highlights the
differences between General American English and Standard British English in
their use of vowels. General American English consists of only 14 or 15 vowels,
while British English consists of as many as 20 vowel sounds.
Chapter 4, "The Sounds of Vowels", gives an account of both the acoustic
characteristics of vowel quality and formant patterns in spectrograms as evidence
for vowels. There is a detailed explanation of the interplay between the first two
formant values and the vowel space. When the pitch changes associated with
vowel changes are plotted in a graph of F1 against F2 (frequencies, as we hear
them in different languages), the resultant figure is a triangle: the three
vowels /i/, /a/, and /u/ span the auditory space, so the vowel space in the
graph has a triangular shape. With languages having 5 to 7 vowels, it is
possible to have an equally symmetrical triangular shape when we plot F1 vs. F2;
this same shape for any language provides evidence of a relationship between
vowel quality and formant frequencies.
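The triangle can be made concrete with a minimal Python sketch. The formant
values below are rough, illustrative figures for an adult male voice; they are
assumptions for this example, not data from the book, and the function name is
hypothetical. The sketch computes the area of the /i/-/a/-/u/ triangle in the
F1-F2 plane with the shoelace formula.

```python
# Illustrative (F1, F2) values in Hz. These are assumed, rough figures,
# not measurements reported in the book under review.
CORNER_VOWELS = {
    "i": (280.0, 2250.0),
    "a": (710.0, 1100.0),
    "u": (310.0, 870.0),
}


def triangle_area(p, q, r):
    """Area of the triangle p-q-r, computed with the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


area = triangle_area(
    CORNER_VOWELS["i"], CORNER_VOWELS["a"], CORNER_VOWELS["u"]
)
print(f"Vowel space area: {area:.0f} Hz^2")
```

A larger area would indicate corner vowels that lie further apart in the F1-F2
plane; the intermediate vowels of a 5- or 7-vowel system would be expected to
fall on or inside this triangle.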
Chapter 5, "Charting Vowels", continues the discussion on formant analysis and
charting of vowels through the first two formants, comparing the five vowels of
Spanish with those occurring in different accents in English. The relative vowel
space plotted for the Spanish vowels /i, e, a, o, u/ is compared with that of
English. There is a tendency to replace diphthongs with their corresponding
monophthongs in some North American accents. With the exception of the vowel
in words like 'bird', the third formant is not significant in the description of vowels
in General American English.
The next chapter, "The Sounds of Consonants", provides an introduction to
consonants and suggests that there is no significant difference in consonant
articulation between British and American varieties of English. The phonetic
symbols used and the articulatory and acoustic features of consonants are
described. This chapter provides background information on different classes of
consonants, viz., stops, approximants, nasals, fricatives and affricates.
Interpretation of spectrograms with respect to both voiceless and voiced
consonants is explained as well. The first three formant frequency values, their
levels, formant transitions for stops, nasals and approximants, and additional
spectral cues which help in the identification of individual consonants are
illustrated with examples from General American English and BBC English.
Chapter 7, "Acoustic Components of Speech", analyzes formant frequency,
amplitude and pitch, combining and varying their auditory correlates of voicing and
voicelessness. Speech synthesis is also discussed, as well as the
relationship between acoustic variables in the waveform, which are illustrated for
the English word 'bird'.
Chapter 8, "Talking Computers", continues with the topic of synthesizing speech
sounds, with phonetic transcription being the focus of the last part of this chapter.
Two approaches to speech synthesis are suggested: parametric synthesis, in which
a computer calculates acoustic parameters like formant frequencies from the
waveform or joins sound segments to make new sentences, and the concatenative
approach, in which large sections of speech are stored and subsequently joined
together. The problem with the first approach is that we do not know the rules of
joining one sound to another. The second approach is useful in synthesizing
recordings of telephone numbers and reproducing them for providing pre-recorded
information. The computer uses a mathematical technique called Linear Prediction
Coefficient (LPC) analysis, which yields LPCs, a set of numbers that represent
everything about voice quality except its fundamental frequency or pitch. A
detailed account of LPC analysis is also given in Ladefoged (1996). Another
technique, Pitch Synchronous Overlap Add (PSOLA), is also employed, either to
lower or raise the pitch of the original recording or to vary its
duration. The last section of the chapter deals with studying segmental errors
when using Text-to-Speech (TTS) systems in intonation. Spelling out all
abbreviations and numbers using IPA symbols is a prerequisite in TTS.
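As a rough illustration of what LPC analysis computes, here is a minimal Python
sketch of the autocorrelation method with the Levinson-Durbin recursion, one
common way of obtaining linear prediction coefficients. The review does not say
which method the book presents, so this choice is an assumption, and all
function names and the toy signal are hypothetical. The signal is the impulse
response of a single damped resonance, so the analysis should recover the two
coefficients that generated it.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], the autocorrelation of a finite signal."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the predictor coefficients.

    Returns (coeffs, error): x[n] is predicted as
    sum(coeffs[j] * x[n - 1 - j]) and error is the residual energy.
    """
    coeffs = [0.0] * order
    error = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(coeffs[j] * r[i - j] for j in range(i))
        k = acc / error
        updated = coeffs[:]
        updated[i] = k
        for j in range(i):
            updated[j] = coeffs[j] - k * coeffs[i - 1 - j]
        coeffs = updated
        error *= 1.0 - k * k
    return coeffs, error


def resonator_impulse_response(a1, a2, length):
    """Impulse response of x[n] = a1*x[n-1] + a2*x[n-2] + impulse[n]."""
    x = []
    for n in range(length):
        prev1 = x[n - 1] if n >= 1 else 0.0
        prev2 = x[n - 2] if n >= 2 else 0.0
        x.append(a1 * prev1 + a2 * prev2 + (1.0 if n == 0 else 0.0))
    return x


# Toy "vocal tract": one damped resonance (illustrative values only).
signal = resonator_impulse_response(1.3, -0.8, 500)
coeffs, residual = levinson_durbin(autocorrelation(signal, 2), 2)
print([round(c, 4) for c in coeffs], round(residual, 4))
```

The recovered coefficients describe the resonance (the filter) and say nothing
about how often the filter is excited, which is the sense in which LPCs capture
voice quality but not fundamental frequency. Real speech analysis would apply
this to short windowed frames with a higher model order.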
Chapter 9, "Listening Computers", contains an account of the way sounds are
recognized and displayed on a computer. The chapter illustrates the spectral
representation of the first three formants in the word 'August'. Identifying individual
sounds with spectral cues is another dimension viewed in this chapter. The author
acknowledges Fred Jelinek's contribution to speech recognition and lists the
stages involved in the speech recognition system. He also considers the term
'cepstral coefficient', which refers to measures of spectral slices stored as a
number and reflects the rise and fall in the amplitude of F1, F2 and F3 in a
spectrum or spectral curves. Computers also use the Hidden Markov Model
(HMM), which is a representation of a sequence of speech events.
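To show how an HMM can represent a sequence of speech events, the following
minimal Python sketch decodes the most likely state sequence with the Viterbi
algorithm. The two states, the two observation labels and every probability
are hypothetical values chosen for illustration; a real recognizer would work
with cepstral feature vectors and far larger models.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence (log-space Viterbi)."""
    # Each column maps state -> (best log-probability, previous state).
    first = observations[0]
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            score, prev = max(
                (columns[-1][p][0] + math.log(trans_p[p][s]), p)
                for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state model: all numbers are made up for the example.
STATES = ["vowel", "fricative"]
START = {"vowel": 0.6, "fricative": 0.4}
TRANS = {
    "vowel": {"vowel": 0.7, "fricative": 0.3},
    "fricative": {"vowel": 0.4, "fricative": 0.6},
}
EMIT = {
    "vowel": {"periodic": 0.9, "noisy": 0.1},
    "fricative": {"periodic": 0.2, "noisy": 0.8},
}

frames = ["periodic", "periodic", "noisy", "noisy"]
decoded = viterbi(frames, STATES, START, TRANS, EMIT)
print(decoded)
```

Each frame is labelled with the state that best explains it given its
neighbours, so the model segments the frame sequence into speech events rather
than classifying frames in isolation.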
Chapter 10, "How We Listen to Speech", deals with different ways of listening for
phonetically confusable sounds that impede intelligibility. A confusion matrix for
syllables with different initial consonants and noise levels is shown in a table. The
premise of the table is the way these sounds are heard by a set of listeners. The
confusion matrices tell us the level of confusion and the degree of similarity
between the sounds using the syllables 'pa', 'ta', 'ka', and so on. The higher the
number of correctly heard syllables, the less confusion there is. Perceptual
differences are calculated using 16 sets of syllables. This chapter also reports the
results of an experiment conducted with the words 'bad' and 'bat' (voicing contrast)
to study variation in perception. This is the only chapter that provides sources for
further reading on the topics discussed.
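The way a confusion matrix is read can be sketched in a few lines of Python.
The counts below are hypothetical and cover only three syllables; they are not
the figures from the book's table, and the function names are assumptions made
for this example.

```python
# Hypothetical counts: outer key = syllable spoken, inner key = syllable heard.
CONFUSIONS = {
    "pa": {"pa": 80, "ta": 12, "ka": 8},
    "ta": {"pa": 10, "ta": 75, "ka": 15},
    "ka": {"pa": 6, "ta": 14, "ka": 80},
}


def percent_correct(matrix, spoken):
    """Percentage of presentations of `spoken` that were heard correctly."""
    row = matrix[spoken]
    return 100.0 * row[spoken] / sum(row.values())


def confusability(matrix, a, b):
    """Share of all presentations of a and b that were heard as the other."""
    crossed = matrix[a][b] + matrix[b][a]
    total = sum(matrix[a].values()) + sum(matrix[b].values())
    return crossed / total


for syllable in CONFUSIONS:
    print(syllable, percent_correct(CONFUSIONS, syllable))
print("pa/ta", confusability(CONFUSIONS, "pa", "ta"))
print("ta/ka", confusability(CONFUSIONS, "ta", "ka"))
```

High values on the diagonal mean little confusion, while a high confusability
score for a pair suggests that listeners perceive the two sounds as similar.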
Chapter 11, "Making English Consonants", deals with the physiology of the vocal
apparatus and the articulatory terms associated with the description of place and
manner of articulation of consonants in general. The table of IPA symbols of
English consonants is presented with a brief description of each class of
consonants.
Chapter 12, "Making English Vowels", describes the anatomy and physiology of
vocal organs and the muscles controlling the movements of the tongue in the
production of vowels. There is an interesting account of Melville Bell's symbols, as
given in his Visible Speech (1867), representing vowels in English. The position
and shape of the tongue and palate in the production of vowels relating to the
vowel diagram are analyzed in detail.
Chapter 13, "Actions of the Larynx", talks about the important role played by the
larynx, pharynx, vocal folds, and cartilage and the changes they bring to the
quality of sounds (viz., voiced and voiceless sounds). Voicing and aspiration are
two important features in the production of stop consonants. Closely tied to
aspiration is Voice Onset Time (VOT), the interval between the release of a stop
and the beginning of the following vowel, which varies amongst
languages. In English VOT is 50-60
milliseconds (ms) for /k/ and slightly less for /t/ and /p/, while in Spanish the VOT
for /k/ is about 20 ms and even less for /p/. It is interesting to note that Germanic
languages like English, German and Danish have comparatively longer VOTs. In
Romance languages like French and Spanish, voiceless stops have very short VOTs,
while English and other Germanic languages have voiced stops, which contrast
with voiceless stops. In terms of vocal fold vibration, glottal stop consonants like
/h/ are found to be replaced by /k/ or /p/ in some dialects of British English, as well
as in Hawaiian. Examples from Hindi also show the occurrence of four breathy
voiced stop consonants, while Gujarati has breathy voiced vowels. The effect of
creaky voice and breathy voice on Zapotec vowels is discussed briefly. Other
classes of sounds discussed are 'ejectives', common in a few American Indian
and a few African languages, and 'implosives', which are produced with air sucked
in and found in some languages spoken in Nigeria (e.g. Owerri Igbo). The
mechanism involved in producing implosives is illustrated through differences in
airflow and air stream in the larynx and the vocal tract.
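The VOT measure described in this chapter summary reduces to simple arithmetic
on two time points, as the following minimal Python sketch shows. The time
stamps are hypothetical annotations chosen to match the approximate values
quoted above (about 55 ms for English /k/ and 20 ms for Spanish /k/), and the
35 ms boundary between short and long lag is an assumption for illustration,
not a figure from the book.

```python
def voice_onset_time_ms(release_s, voicing_onset_s):
    """VOT in milliseconds: voicing onset minus stop release.

    A negative value means voicing starts before the release.
    """
    return (voicing_onset_s - release_s) * 1000.0


def describe_vot(vot_ms, long_lag_threshold_ms=35.0):
    """Coarse label for a VOT value; the threshold is an assumed cut-off."""
    if vot_ms < 0:
        return "voicing lead"
    if vot_ms < long_lag_threshold_ms:
        return "short lag"
    return "long lag"


# Hypothetical annotations in seconds: (stop release, voicing onset).
TOKENS = {
    "English /k/": (0.100, 0.155),
    "Spanish /k/": (0.100, 0.120),
}

for label, (release, voicing) in TOKENS.items():
    vot = voice_onset_time_ms(release, voicing)
    print(f"{label}: {vot:.0f} ms ({describe_vot(vot)})")
```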
Chapter 14, "Consonants Around the World", is a summary of consonants in
languages. A general survey shows that there are about 7,000 languages in the
world and over half of them are spoken by fewer than 10,000 people. In all, there
are about 600 consonants. The chapter lists the 10 most widely spoken languages
which have 100 consonants (of which, 22 occur in English). A few languages like
Ewe, spoken in Ghana, use two unique bilabial fricatives. Subtle differences which
exist in the production of /t/ in Wabuy (a language spoken in Australia), palatals in
Hungarian, stops and six nasals in Malayalam, voiceless stops in Aleut, and
bilabial and alveolar trills in Kele and Titan, respectively, are detailed. Likewise,
F1-F2 transitions (palatals) in palatograms and linguagrams of the retroflex /ţ/,
Polish sibilants and four sibilants of Toda (and their corresponding IPA symbols),
are also discussed exhaustively. Laterals in Melpa are noted for their manner of
articulation, as they are complex in symbols, viz., voiced alveolar /l/ and
voiceless velar /ł/ (dark /l/, represented by a small uppercase L in IPA). In Zulu,
laterals occur as voiced and voiceless consonants and clicks occur contrastively.
Nama, a language spoken in Namibia, has 20 clicks, each represented by an IPA
symbol and with different meanings.
Chapter 15, "Vowels Around the World", demonstrates the relation between vowel
space and the graphic display of F1-F2. Contrasts are made between languages
like Hawaiian, which has 5 vowels and only 8 consonants, and those such as Zulu,
which has 5 vowels and 44 consonants. Every language is said to use at least 3
distinct vowels, viz., /i , a, u/ or /i, a, o/. About 20% of the world's languages have
5 contrasting vowels. An interesting fact is that most languages with 5 vowels
follow the same order of the Latin alphabet (a, e, i, o, u). Californian English has 15
vowels and BBC English has 20 vowels (12 long, 10 short and 6 diphthongs) with
varying tongue roots. Lip rounding also plays an important part in the articulation of
vowels. French has rounded vowels like /y/, as in 'lu' (a front, high, rounded
vowel). The other rounded vowels which occur in French are /œ/ as in 'leur' (their),
/ø/ as in 'le' (the), /o/ as in 'lot' (prize), /ɔ/ as in 'lors' and /ɑ/ as in 'las' (tired).
Swedish, Danish, Norwegian and German also have rounded/unrounded vowel
contrasts. Nasalized vowels versus nasal vowels are observed in English and
French, respectively, the latter as in the vowels in the words 'lin' (flax), 'lundi' (Monday),
'lent' (slow) and 'long' (long). Phonetic differences in vowels are observed with
distinctions in voice quality, as in !Xóō vowels (a Bushman language spoken in the
Kalahari desert) or tense-voiced vowels in Mpi (spoken in Northern Thailand).
The last chapter in this volume, "Putting Vowels and Consonants Together",
summarizes vowels and consonants, puts them together as 'utterances', and talks
about the speech continuum in terms of duration and intelligibility. It is a common
observation that slips of the tongue, which interchange the sounds of syllables,
occur in speech. The other aspects discussed are writing systems and sounds,
tones and languages like Chinese (Mandarin) and Cantonese. The role of IPA in
representing /r/ and its variants in languages other than English, contrasting
sounds, and so on, are emphasized. In all, 106 distinct symbols for segments (78
consonants and 28 vowels), excluding sounds like ejectives and diacritics, are
represented in the IPA chart provided. Sounds are also transcribed using symbols
like )( (not an IPA symbol) for 'hiss' or 'sing'. The totality of features required to
describe a language at a glance is shown in a single table (Table 16.2 on page
196).
EVALUATION
The book is an excellent introduction to the basics of speech sounds. Books on
phonetics are innumerable, but "Vowels and Consonants" is
undoubtedly one of the best books on the basics. It is a good example of how
complex topics like acoustic phonetics, speech synthesis, speech recognition, the
physiology of speech production and sound-spelling correlation can be simplified to
be accessible for beginners in phonetic studies. It requires and assumes no prior
knowledge, either of phonetics or the process of speech production, on the part of
the reader. Each chapter is introductory in nature and technical terminology has
been used sparingly while explaining the basics of both articulatory and acoustic
phonetics. The topics cover a wide range, from traditional definitions of phonetic
terms and an IPA chart, to the latest trends in TTS systems used in speech
technology. The last three chapters are dense and rich in content, and consonant
and vowel sounds across different languages of the world (the most widely
spoken) have been discussed extensively, clearly and concisely.
Chapter 6 is especially effective because it equips the reader with all the details of
consonant features with remarkable clarity and precision. Chapters 14 and 15 of
the volume also merit special mention due to their coverage of examples from all
the distinctive sounds of a few lesser known, yet widely spoken languages. The
detail in these two chapters aptly justifies the title of the book.
The volume is a valuable contribution for researchers and scholars working on
consonants and vowels across different languages. It serves as a good
introductory textbook for a course on phonetics. The highlight of the third edition of
"Vowels and Consonants" is its online material: demos of some Text-to-Speech systems,
videos of vibrating vocal cords, audio recordings of articulations of vowels and
illustrations of IPA symbols. As stated in the Preface to the Third Edition, "The CD
that had accompanied the previous edition has been replaced with a more readily
accessible web-based collection of language files" (p. xv). The volume serves as a
ready reference for advanced users of phonetics, as well as professionals and
research scholars of language and speech. The book is of interest to teachers and
would help to develop readers' perception of speech production and their
competence in spoken English. It is a 'must have' book that adds richness and
knowledge to individuals and libraries.
REFERENCES
Ladefoged, P. (1996). A Course in Phonetics (2nd ed.). Chicago: University of
Chicago Press.
ABOUT THE REVIEWER
Dr. Seetha Jayaraman is a Lecturer at Dhofar University, Sultanate of
Oman, where she teaches English language to undergraduates. Her
research interests include sociolinguistics, musicology, comparative
linguistics, and phonetics.