8.334, Review: WinSAL5, SpeechLAB

Sat Mar 8 21:00:43 UTC 1997

LINGUIST List:  Vol-8-334. Sat Mar 8 1997. ISSN: 1068-4875.

Subject: 8.334, Review: WinSAL5, SpeechLAB

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>
            T. Daniel Seely: Eastern Michigan U. <seely at linguistlist.org>

Review Editor:     Andrew Carnie <carnie at linguistlist.org>

Associate Editors: Ljuba Veselinova <ljuba at linguistlist.org>
                   Ann Dizdar <ann at linguistlist.org>
Assistant Editor:  Sue Robinson <sue at linguistlist.org>
Technical Editor:  Ron Reck <ron at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Zhiping Zheng <zzheng at online.emich.edu>

Home Page:  http://linguistlist.org/

Editor for this issue: Andrew Carnie <carnie at linguistlist.org>
 ==========================================================================

What follows is another discussion note contributed to our Book Discussion
Forum.  We expect these discussions to be informal and interactive; and
the author of the book discussed is cordially invited to join in.

If you are interested in leading a book discussion, look for books
announced on LINGUIST as "available for discussion."  (This means that
the publisher has sent us a review copy.)  Then contact Andrew Carnie at
     carnie at linguistlist.org

-------------------------------- Message 1 -------------------------------


Review of:
WinSAL-V SPEECHLAB
Media Enterprise-Ingolf Franke, Manager
Technologie-Zentrum Trier
Gottbillstrasse 34a, D-54294 Trier
499DM

  The WinSAL-V Speech Signal Processing with Video Option is an integrated
speech analysis and phonetics teaching package, distributed on CD-ROM.  It
was developed in Germany, requires Windows 3.1 at a minimum, and is happier
with Windows 95.  There are two versions, one in English and one in German.
This is a review of the English version, for which the documentation
booklet is not yet written; I have read most of the German documentation,
but, attempting to save disk space, I did not install the German version of
the program on my computer.

	I am writing this review as an ordinary working phonologist with
phonetician pretensions.  I have not delved into the depths of digital
signal processing, and take such things as the details of the FFT, Hamming
windows and so on as given.  Like many colleagues of roughly my age, I
think original Kay spectrographs (the steam-driven ones that occasionally
belch fire) still make the most visually convincing spectrograms, although,
of course, only digital displays with mice make sense for actually
measuring things.  And, of course, digital spectrographs work MUCH faster.

	There are basically two parts to the program.  One, SPEECHLAB, is an
acoustic phonetics analysis package that uses the Windows sound interface.
The other is WinSAL (Speech Analysis under Windows), a multimedia,
interactive program to teach elementary phonetics.  It is possible,
however, to call up SPEECHLAB from inside the teaching program.

    As with most Windows-friendly programs, it is easy to install, and it guides
you step by step through the process.  It requires a Windows-compatible
sound board.  I installed it both on my home computer (Dell P-166 with 32
meg of RAM, Dell's version of a Soundblaster card) and on my office
computer (Zenon Pentium 75, also with 32 meg, and a ProAudio Spectrum 16).
It worked on both without any fuss.  Once installed it takes up 1.5 meg on
my hard drive.  The spectrograph package runs off the hard drive, but
WinSAL requires the CD-ROM.

	I'll begin with the speech analysis package, Speechlab.  I have had
experience with three other packages--two PC-based and one Mac flavored.  I
primarily use CSRE, a complex and sophisticated DOS-based program that
permits waveform editing, calculates spectrograms, performs pitch
extraction, tracks formants, and includes both a synthesis program and an
experiment generator.  It is MUCH more sophisticated than this program, but
requires expert knowledge, since all parameters are definable (Hamming vs.
Hanning windows, different flavors of FFT and so on).

	I have also played with two others--WinCECIL, which is distributed as
freeware by SIL, and Signalyze, which is a commercial product that runs on
the Macintosh platform.  WinCECIL is easy to use, but produces very grainy
spectrograms (they look like blown up newspaper photos), while the
interface for editing (zooming in on pieces of a waveform, choosing varying
displays) is very counterintuitive.  Signalyze is the industry standard for
Macintoshes, and it is excellent--it will do everything, but it is not
cheap.  I have not had experience with the current Kay product, which is
much more expensive than any of these options.

	So, where does this program fit?  It is relatively quick, but not as
flexible as the two 'professional' packages (CSRE, Signalyze).  The program
opens with a display that it calls an 'oscillogram' (a waveform). It
produces colored spectrograms, which are pretty to look at, but I feel
produce too much visual noise.  One can control the frequency range for
spectrograms by typing in minimum and maximum values.  Choosing among
128, 256, 512 and 1024 analysis points gives wider- or narrower-band
displays (fewer points give a wider band), and one can set the upper and
lower limits of the dB display.  One cannot change away from a black
background with a heat-based color scale in which hotter colors mark
higher intensity.  Thus formants are essentially red on black.
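
	To see why the number of points matters, here is a rough sketch in
Python; it is not part of SPEECHLAB.  It assumes that each choice is the
length of one analysis frame and that the recording was sampled at 22050
Hz; the review states neither, and the function name is made up for
illustration.  Fewer points mean a shorter frame and coarser frequency
bins (a wide-band display); more points mean the reverse.

```python
# Assumed sampling rate; the review does not say what SPEECHLAB records at.
SAMPLE_RATE_HZ = 22050


def frame_properties(n_points, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (frame duration in ms, frequency bin spacing in Hz) for one frame."""
    duration_ms = 1000.0 * n_points / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / n_points
    return duration_ms, bin_spacing_hz


# The four sizes offered in the spectrogram settings.
for n in (128, 256, 512, 1024):
    duration_ms, bin_hz = frame_properties(n)
    print(f"{n:5d} points: {duration_ms:6.2f} ms frame, {bin_hz:7.2f} Hz per bin")
```

Under these assumptions the 128-point setting resolves time well and
frequency poorly, which is what makes individual glottal pulses visible in
a wide-band spectrogram, while 1024 points separate individual harmonics.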

	The signal editing is clever--dragging the left mouse button with shift
zooms in on the region so marked, with the left side of the display set to
zero, so time domains can be measured.  Control plus left button restores
the display to the preceding state.  There doesn't seem to be a way to go
directly from a triply zoomed piece of waveform back to the original,
entire sample.

    There is also an FFT display, which produces a thin spectral slice.  This
is useful for precise measurements of formants.  In both the spectrogram
and the FFT display there is an instantaneous readout in the bottom
left-hand part of the screen giving the values that correspond to the
position of the cursor.
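
    As an illustration of what a spectral slice is, the following Python
sketch computes the magnitude spectrum of a single Hamming-windowed frame
with a plain discrete Fourier transform.  It is not how SPEECHLAB itself
works (the review has no details on that); the function names, the 8000 Hz
sampling rate and the synthetic two-tone frame are all assumptions, and
picking the strongest bin is only a stand-in for reading a formant peak off
the display with the cursor.

```python
import cmath
import math


def spectral_slice(frame, sample_rate_hz):
    """Return (frequency in Hz, magnitude) pairs for one Hamming-windowed frame."""
    n = len(frame)
    # The Hamming window tapers the frame edges to reduce spectral leakage.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    pairs = []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        pairs.append((k * sample_rate_hz / n, abs(total)))
    return pairs


def strongest_peak_hz(pairs):
    """Return the frequency of the largest-magnitude bin."""
    return max(pairs, key=lambda pair: pair[1])[0]


# Synthetic frame (an assumption, not real speech): a strong 500 Hz
# component plus a weaker one at 1500 Hz, sampled at 8000 Hz.
RATE = 8000
frame = [math.sin(2 * math.pi * 500 * i / RATE)
         + 0.5 * math.sin(2 * math.pi * 1500 * i / RATE)
         for i in range(256)]
print(strongest_peak_hz(spectral_slice(frame, RATE)))
```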

	The program lacks a printing facility, and the only files that can be
saved are the .wav files that are recorded by the program (or virtually any
other Windows sound program).

	As a teaching device it would be excellent.  It is fast enough to be
useful in the classroom, or for students learning to read spectrograms, or
to do mini-experiments, but the spectrographic display is probably
insufficient for serious scientific research.  The waveform editor,
however, should suffice for serious work in the time domain--measuring
vowel length, for example, or VOT.  One fairly serious drawback, however,
is that it does not permit simultaneous display of both waveform and
spectrogram, which is often useful in making decisions about segmentation.
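
	The time-domain measurements mentioned above come down to simple
arithmetic on cursor positions.  A minimal Python sketch, assuming the
positions are available as sample indices and that the sampling rate is
known (the sample numbers and the 22050 Hz rate below are hypothetical, as
are the function names):

```python
def samples_to_ms(n_samples, sample_rate_hz):
    """Convert a count of samples to milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz


def voice_onset_time_ms(burst_sample, voicing_sample, sample_rate_hz):
    """VOT: time from the stop release burst to the onset of voicing.

    A negative result means voicing began before the release (prevoicing).
    """
    return samples_to_ms(voicing_sample - burst_sample, sample_rate_hz)


# Hypothetical cursor positions read off a waveform sampled at 22050 Hz.
vot = voice_onset_time_ms(burst_sample=2205, voicing_sample=3528,
                          sample_rate_hz=22050)
print(f"VOT: {vot:.1f} ms")
```

Vowel length works the same way, with the vowel's start and end samples in
place of the burst and the voicing onset.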

     The WinSAL teaching program, on the other hand, has no drawbacks I can find.  It
is an excellent piece of work.  It teaches elementary articulatory and
acoustic phonetics, along with basic IPA symbols.  It is limited to the
sounds of English and German (no pharyngeals or clicks included), but it is
a fully hypertextual, multimedia approach.  Once the student has gone
beyond the basic notion of point and manner of articulation, and the idea
of airflow through the vocal tract (including some cute animation of air
molecules bouncing around in the mouth and nose, and being directed in
streams during the production of fricatives), s/he can work through a
phonetic chart and choose to see a video of a person saying the sound,
synchronized with audio output, or switch to the details of the spectrogram
of the sound, or look at a sagittal section.  The detailed coverage of the
basics of acoustics (sound waves, cycles, frequency, amplitude, even
summation of frequencies to produce complex waves) is particularly well
done because of the animation of little balls (representing molecules)
sliding along wavy paths.

    The disk also comes with a phonetics database and search engine containing
over 4000 references to articles in Journal of Phonetics, Phonetica, Folia
Phoniatrica and Language and Speech, with issues from 1948 to 1993.  The
program is in German, and I have no indication whether they plan to produce
an English translation.  Entries can be modified, to add keywords, for
example.

     Information on the program, a downloadable demo copy and order forms can
be found at <http://www.media-enterprise.de/winsal/winsal_e.htm>.  Price is
listed as 499 DM (with a 300 DM reduction for students with proof of
status).  For the currency-impaired, this is approximately US$295.00 (as of
March 5).  It can be ordered over the Internet with a credit card.

The Reviewer

Geoff Nathan is an Associate Professor of Linguistics at Southern Illinois
University at Carbondale.  He has published on phonological theory and the
phonetics of second language acquisition, and is interested in functional
phonology and its relationship to formal theories.  Publications include
'On second-language acquisition of voiced stops,' Journal of Phonetics
(1987) 15.4:313-322, 'How the Phoneme Inventory Gets its Shape--Cognitive
Grammar's View of Phonological Systems,' Rivista di Linguistica (1995)
6.2:275-287, and 'Naturalness in Phonology' (with Bernhard Hurch),
Sprachtypologie und Universalienforschung (1996) 49.3:231-245.

Geoffrey S. Nathan
Department of Linguistics
Southern Illinois University at Carbondale,
Carbondale, IL, 62901 USA
Phone:  +618 453-3421 (Office)   FAX +618 453-6527
+618 549-0106 (Home)
