Speech-Recognition Technology Advances
RENEE MONTAGNE, host:
Monday, our business report focuses on technology, and today, speech recognition. It's been called the Holy Grail of computer technology, the ability of a computer to understand what you say and act on it. The technology has actually been around since the 1960s when computer scientists were trying to mimic the complexities of human speech. Nowadays, a lot of people are talking to computers in the office, cars and more. NPR's Lisa Chow(ph) reports.
LISA CHOW reporting:
Speech recognition technology offers a lot of promise.
(Soundbite of telephone)
JULIE: Hi. I'm Julie, Amtrak's automated agent.
CHOW: But the reality can be frustrating.
JULIE: To get schedule and price information, say, 'Schedules.'
JULIE: Sorry. I didn't get that.
JULIE: What city are you departing from?
CHOW: San Antonio.
JULIE: I think you asked for Ontario. Is that correct?
CHOW: No. San Antonio.
JULIE: I think you asked for Hinton, West Virginia.
CHOW: David Pogue actually likes talking to a machine. Pogue is a technology columnist for The New York Times. He writes his columns and books, answers e-mails and maintains a daily Web blog by dictating to his computer.
Mr. DAVID POGUE: (Dictating) Bit by bit--comma--Americans are buying these larger--comma--wide screen--comma--more expensive TV screens--period.
CHOW: Nearly 10 years ago, Pogue developed a severe wrist condition and found it difficult to type, so he looked into speech recognition software.
Mr. POGUE: In those days, (speaking slowly) you--had--to--talk--like--this, one word at a time. Today, you just talk like this--dash--you put in your own punctuation--comma--of course--comma--but you literally talk as I'm talking now--comma--and everything gets written down with close to 100 percent accuracy--period.
CHOW: The ultimate goal of speech recognition is to make the keyboard obsolete and develop a machine that you can talk to the way you talk to a person. In other words, a machine that understands everything you say. But that's tricky.
Mr. STEPHEN SPRINGER (ScanSoft): The human brain and the capacity for speech is just such an amazing capacity that we underestimate how hard it is.
CHOW: Stephen Springer works at ScanSoft, a company based in Massachusetts and the maker of the program, Dragon NaturallySpeaking.
Mr. SPRINGER: Just hearing these words and not only understanding them but drawing the implications and drawing pictures in your mind, it's fabulous and machines aren't going to be doing that for 50 years. What I think they can do is apply themselves in a practical way towards getting the mundane stuff out of your life.
CHOW: Here's how it works. You speak, the computer matches your sound against known words in the system. Then it uses probability analysis to make a logical guess as to which word you meant to say. For example, if you say 'cat' and the next word sounds like 'litter' or 'sitter,' it's more likely to be 'litter' because the previous word was 'cat.' David Pogue says it's pretty accurate most of the time.
Mr. POGUE: You know, I said 'an enormous number of variations' and it typed out, 'an enormous number of very Asians.' So they can be amusing and yet you can't blame it because they do sound so much alike.
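The guess Chow describes, choosing between sound-alike words based on the word that came before, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by Dragon NaturallySpeaking or any other product. The function name `pick_word`, the score tables, the "baby" entries and every number in them are hypothetical, invented only to show the idea.

```python
# Illustrative sketch only: all names and probabilities below are invented
# for this example and do not come from any real speech recognition system.

# Hypothetical acoustic scores: how well each known word matches the sound.
# Both candidates get the same score because they sound nearly alike.
ACOUSTIC_SCORES = {"litter": 0.5, "sitter": 0.5}

# Hypothetical probabilities of a word given the previous word.
BIGRAM_PROBS = {
    ("cat", "litter"): 0.05,
    ("cat", "sitter"): 0.01,
    ("baby", "sitter"): 0.08,
    ("baby", "litter"): 0.001,
}

# Small fallback probability for word pairs not listed above.
UNSEEN_PROB = 1e-6


def pick_word(previous_word, acoustic_scores, bigram_probs=BIGRAM_PROBS):
    """Return the candidate whose acoustic score times context probability is highest."""

    def combined(word):
        context = bigram_probs.get((previous_word, word), UNSEEN_PROB)
        return acoustic_scores[word] * context

    return max(acoustic_scores, key=combined)


print(pick_word("cat", ACOUSTIC_SCORES))   # context favors "litter"
print(pick_word("baby", ACOUSTIC_SCORES))  # context favors "sitter"
```

Because the two candidates tie on sound alone, the previous word decides the outcome. A mistake like the one Pogue describes would be expected whenever the sounds are close and the context does not clearly favor the intended phrase.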
CHOW: Accuracy has gotten good enough that doctors are starting to use speech recognition software to dictate their diagnoses. But the hottest growth market is expected to be in cars.
Mr. JEFF FOLEY (ScanSoft): Play song "Pinball Wizard."
(Soundbite of "Pinball Wizard")
THE WHO: (Singing) Ever since I was a young boy, I played the silver ball.
CHOW: Developers at ScanSoft, Jeff Foley and Jonna Schuyler, are working on a program that lets users talk to their digital music players.
Ms. JONNA SCHUYLER (ScanSoft): Play album Sarah McLachlan "Fumbling Towards Ecstasy."
Computer Voice: Sorry. Try again.
Ms. SCHUYLER: Play "Fumbling Towards Ecstasy."
(Soundbite of "Fumbling Towards Ecstasy")
Ms. SARAH McLACHLAN: (Singing) In a world...
CHOW: Software companies, the federal government and universities are all investing in speech recognition technology. And over the coming years, they'll be competing to see who gets the last word. Lisa Chow, NPR News, Washington.
(Soundbite of "Fumbling Towards Ecstasy")
Ms. McLACHLAN: (Singing) In a world...