Could a dystopian future where NPR hosts are replaced by soulless robots soon be upon us?
Text-to-speech technology has been in the news lately thanks to the latest version of Amazon's electronic book reader.
It's called the Kindle 2, and it has a feature that reads the text out loud.
That worries some authors, including Roy Blount Jr., president of the Authors Guild and a panelist on NPR's Wait Wait... Don't Tell Me! In an op-ed in The New York Times, Blount said he thinks the Kindle 2 could eventually take a bite out of the audio book market.
Text-to-speech systems have become more lifelike in the past decade because of a shift in how they are engineered, according to Andy Aaron, a speech researcher at IBM. He says researchers have given up on having computers synthesize voices from scratch and are getting humans involved.
"We'll audition lots and lots of voice actors, pick one we like, and we'll record them reading untold amounts of sentences over a period of maybe a month," he says. "We'll take those sentences, chop them up into individual pieces called phonemes, and build a library of that person's voice."
But Aaron says Blount and everyone at NPR need not worry: robots will not be replacing humans anytime soon.
"I think it's an amazing technology, but even so, I don't think we're anywhere near reading a novel out loud in a meaningful way," he says. "It requires a deep text understanding, and the technology isn't even close to that right now — and I don't see that happening five years from now, either."
Even so, the technology to give synthetic actors different emotions is improving. To demonstrate, Aaron plays a female text-to-speech voice reading "These cookies are delicious" in a monotone, followed by "These cookies are delicious!" in an excited tone.
"We brought the actor into the studio and told her to read 1,000 sentences that were recorded in an upbeat voice," he explains.