FW: A Secretary in Your PC

Lane laneonline at yahoo.com
Fri Jul 21 20:43:01 UTC 2006


New York Times
Like Having a Secretary in Your PC 
by David Pogue
July 20, 2006

TESTING, testing, one two three. Is this thing on? 

Well, I’ll be darned. It’s really on and it’s really
working. I’m wearing a headset, talking, and my PC is
writing down everything I say in Microsoft Word. I’m
speaking at full speed, perfectly normally except that
I’m pronouncing the punctuation (comma), like this
(period). 

Let’s try something a little tougher. Pyridoxine
hydrochloride. Antagonistic Lilliputians.
Infinitesimal zithers. 

Hm! Not bad. 

Oh, hi, honey. Did you get to the bank before it
closed? Oh, hold on, let me turn off the mike.
Wouldn’t want our conversation to wind up in my
column!

O.K., back again. The software I’m using is Dragon
NaturallySpeaking 9.0 (www.nuance.com), the latest
version of the best-selling speech-recognition
software for Windows. This software, which made its
debut Tuesday, is remarkable for two reasons.

Reason 1: You don’t have to train this software.
Training means reading aloud a canned piece of
prose that it displays on the screen — a standard
ritual that has begun the speech-recognition adventure
for thousands of people.

I can remember, in the early days, having to read 45
minutes’ worth of these scripts for the software’s
benefit. But each successive version of
NaturallySpeaking has required less training time; in
Version 8, five minutes was all it took.

And now they’ve topped that: NatSpeak 9 requires no
training at all.

I gave it a test. After a fresh installation of the
software, I opened a random page in a book and read a
1,000-word passage — without doing any training. 

The software got 11 words wrong, which means it got
98.9 percent of the passage correct. Some of those
errors were forgivable, like when it heard “typology”
instead of “topology.” 

But Nuance says that you’ll get even better accuracy
if you do read one of the training scripts, so I tried
that, too. I trained the software by reading its
“Alice in Wonderland” excerpt. This time, when I read
the same 1,000 words from my book, only six errors
popped up. That’s 99.4 percent correct.
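
The two percentages follow directly from the error counts. Here is a minimal sketch of that arithmetic, assuming accuracy simply means the share of words transcribed correctly. The function name is made up for illustration and is not part of any Nuance tool.

```python
def accuracy_percent(errors, total_words=1000):
    """Share of words transcribed correctly, as a percentage (one decimal)."""
    return round(100 * (total_words - errors) / total_words, 1)

# Error counts reported for the 1,000-word passage.
untrained = accuracy_percent(11)  # fresh install, no training
trained = accuracy_percent(6)     # after reading the training script

print(untrained, trained)
```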

The best part is that these are the lowest accuracy
rates you’ll get, because the software gets smarter
the more you use it — or, rather, the more you correct
its errors. 

You do this entirely by voice. You say, “correct
‘typology,’ ” for example; beneath that word on the
screen, a numbered menu of alternate transcriptions
pops up. You see that alternate 1 is “topology,” for
example, so you say “choose 1.” The software instantly
corrects the word, learns from its mistake and
deposits your blinking insertion point back at the
point where you stopped dictating, ready for more.

Over time, therefore, the accuracy improves. When I
tried the same 1,000-word excerpt after importing my
time-polished voice files from Version 8, I got 99.6
percent accuracy. That’s four words wrong out of a
thousand — including, of course, “topology.”

For this reason, it doesn’t much matter whether or not
you skip the initial training; the accuracy of the two
approaches will eventually converge toward 100
percent. 

NatSpeak 9 is remarkable for a second reason, too:
it’s a new version containing very little new. 

Yes, they’ve eliminated the training requirement. And
yes, the new NatSpeak is 20 percent more accurate than
before if you do the initial training. Then again,
what’s a 20 percent improvement in a program that’s
already 99.4 percent accurate — 99.5? That’s maybe one
less error every 1,000 words.
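
To see why the gain is so small, here is a rough sketch of the arithmetic. It assumes that "20 percent more accurate" means 20 percent fewer errors, which is one reading of the claim and not something Nuance spells out here. The helper name is hypothetical.

```python
def errors_after_improvement(errors_per_1000, reduction=0.20):
    """Errors per 1,000 words after cutting the error count by `reduction`."""
    return errors_per_1000 * (1 - reduction)

before = 6  # 99.4 percent accurate means 6 errors per 1,000 words
after = errors_after_improvement(before)
new_accuracy = 100 * (1000 - after) / 1000

# Roughly 4.8 errors per 1,000 words, or about 99.5 percent accurate.
print(round(after, 1), round(new_accuracy, 2))
```

Under that assumption the difference is a little more than one error per 1,000 words, which matches the estimate above.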

(Nuance has done some clever engineering to wring
these additional drops of accuracy out of the program.
For example, the program has always used context to
determine a word’s identity, taking into account the
two or three words on either side of it to
distinguish, say, “bear” from “bare.” The company says
that Version 9 scans an even greater swath of the
surrounding words.)
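
The idea of using neighboring words to pick between sound-alikes can be illustrated with a toy sketch. This is not how NaturallySpeaking works internally. The word counts below are invented for the example, and a real recognizer would use a far larger statistical model.

```python
# Invented co-occurrence counts: how often each neighbor appears near a candidate.
CONTEXT_COUNTS = {
    "bear": {"grizzly": 5, "polar": 4, "cub": 3},
    "bare": {"feet": 5, "hands": 4, "walls": 3},
}

def pick_homophone(candidates, words, index, window=3):
    """Pick the candidate whose known neighbors best match the nearby words."""
    start = max(0, index - window)
    neighbors = words[start:index] + words[index + 1:index + 1 + window]

    def score(candidate):
        counts = CONTEXT_COUNTS[candidate]
        return sum(counts.get(word, 0) for word in neighbors)

    return max(candidates, key=score)

sentence = "she walked on ??? feet".split()
choice = pick_homophone(["bear", "bare"], sentence, 3)
print(choice)
```

Widening `window` corresponds to scanning a greater swath of the surrounding words.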

But the rest of the changes are minor. The
top-of-the-screen toolbar has shed the squared-off
Windows 3.1 look in favor of a more rounded Windows
Vista look. You can now use certain Bluetooth wireless
headsets for dictation, although Nuance has found only
two so far that put the microphone close enough to
your mouth to get clear sound. A new toolbar indicator
lets you know when you’re in a “select and say”
program like Word — that is, a program where you can
highlight, manipulate and format any text you see on
the screen using voice commands. 

At least Nuance hasn’t gone the way of so many
software companies, piling on features and complexity
in hopes of winning your upgrade dollars. For the
second straight revision, the company has preferred to
nip and tuck, making careful and selective
improvements.

Now, Nuance isn’t the only game in speech-recognition
town. Microsoft says that Windows Vista, when it makes
its debut next year, will come with built-in dictation
software. 

Nuance claims not to be worried, pointing out that
Vista will understand only English. NatSpeak, on the
other hand, is available in French, Italian, German,
Spanish, Dutch, Japanese, British English and “World
English,” which can handle South African, Southeast
Asian and Australian accents. 

NatSpeak is also available in a range of versions for
the American market, including medical and legal
incarnations. Mere mortals will probably want to
consider either the Standard version ($100) or the
Preferred version ($200), each of which comes with a
headset. Both offer the same accuracy. 

The Preferred edition, however, offers several shiny
bells and whistles. One of them is transcription from
a digital pocket voice recorder. This approach doesn’t
provide the same accuracy as a headset, and it
requires what today is considered an excruciating
amount of training reading: at least 15 minutes. But
it does free you from dictating at the computer.

Another Preferred perk is voice macros, where you teach it
to type one thing when you say another. For example,
you can say “forget it” and have the software spit
out, “Thank you so much for your inquiry.
Unfortunately, after much consideration, we regret
that we must decline your application at this time.”
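
Conceptually, a voice macro is a lookup from a spoken phrase to the text that should be typed. The sketch below shows only that idea. The table and function names are hypothetical, and nothing here reflects how the product stores or matches macros.

```python
# Hypothetical macro table: spoken phrase -> text to type.
MACROS = {
    "forget it": (
        "Thank you so much for your inquiry. Unfortunately, after much "
        "consideration, we regret that we must decline your application "
        "at this time."
    ),
}

def expand(spoken):
    """Return the macro text for a spoken phrase, or the phrase itself."""
    return MACROS.get(spoken.strip().lower(), spoken)

print(expand("Forget it"))
print(expand("Dear Sir"))  # no macro defined, so it is typed as dictated
```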

There’s also a $900 version called Professional, which
offers, among other advanced features, complete
control over your PC by voice; it can even set in
motion elaborate multi-step automated tasks. 

NatSpeak also runs beautifully on the Macintosh. The
setup is a bit involved: you need a recent Intel-based
Mac, Apple’s free Boot Camp utility, a copy of Windows
XP, and a U.S.B. adapter on your headset. And you have
to restart the Mac in Windows each time you want to
use NatSpeak. But if you can look past all that fine
print, NatSpeak on Macintosh is extremely fast and
accurate. 

If that sounds like too much effort, there is a
Macintosh-only alternative: iListen ($130 with
headset). Version 1.7, newly adapted for Intel Macs,
offers better accuracy and a shorter training time
than previous versions, though nothing like the
sophistication or accuracy of NatSpeak. After 30
minutes of training, the program made 42 mistakes in
my 1,000-word book excerpt, which the company says is
better than average. 

As for NaturallySpeaking: if you’re already using
Version 8, it’s probably not worth upgrading to
Version 9. Most people will find the changes to be too
few and too subtle.

But if you’re among the thousands who have abandoned
dictation software in the past, it’s a different
story. Version 9 is a stronger argument than ever that
for anyone who can’t or doesn’t like to type,
dictation software is ready for prime time; the state
of this art has attained nearly “Star Trek” polish.

Excuse me — what, honey? 

O.K., I’m just finishing up here; I’ll be right down.
Let me just turn my mike off.

E-mail: Pogue at nytimes.com






More information about the Lgpolicy-list mailing list