Kindle's New Voice Is Almost Human

Could a dystopian future where NPR hosts are replaced by soulless robots soon be upon us?

Text-to-speech technology has been in the news lately thanks to the latest version of Amazon's electronic book reader.

It's called the Kindle 2, and it has a feature that reads the text out loud.

That worries some authors — including Roy Blount Jr., president of the Authors Guild and a panelist on NPR's Wait Wait ... Don't Tell Me. In an op-ed in The New York Times, Blount said he thinks the Kindle 2 could eventually take a bite out of the audio book market.

Text-to-speech systems have become more lifelike in the past decade because of a shift in how they are engineered, according to Andy Aaron, a speech researcher at IBM. He says researchers have given up on having computers synthesize voices from scratch and are getting humans involved.

"We'll audition lots and lots of voice actors, pick one we like, and we'll record them reading untold amounts of sentences over a period of maybe a month," he says. "We'll take those sentences, chop them up into individual pieces called phonemes, and build a library of that person's voice."
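For readers curious what "chop them up" and "build a library" might look like in practice, here is a minimal sketch of the idea Aaron describes. It is an illustration under stated assumptions, not IBM's system: the function names `build_library` and `synthesize` are hypothetical, the recordings are assumed to be already segmented and labeled by phoneme, and each audio clip is stood in for by a short list of numbers.

```python
from collections import defaultdict


def build_library(recordings):
    """Group recorded clips by phoneme label.

    `recordings` is an iterable of (labels, clips) pairs, one per recorded
    sentence. Assumption: segmentation into phonemes has already been done.
    """
    library = defaultdict(list)
    for labels, clips in recordings:
        for label, clip in zip(labels, clips):
            library[label].append(clip)
    return library


def synthesize(phonemes, library):
    """Concatenate one stored clip per requested phoneme.

    This toy version always takes the first stored clip; a real system
    would choose among candidates and smooth the joins between them.
    """
    output = []
    for phoneme in phonemes:
        if phoneme not in library:
            raise KeyError(f"no recording for phoneme {phoneme!r}")
        output.extend(library[phoneme][0])
    return output


# Two "recorded sentences", with tiny lists standing in for audio samples.
recordings = [
    (["K", "AE", "T"], [[1, 2], [3], [4, 5]]),
    (["AE"], [[9]]),
]
library = build_library(recordings)
new_word = synthesize(["T", "AE", "K"], library)
```

The sketch reuses pieces of the recorded sentences to produce a sequence that was never recorded as a whole, which is the basic appeal of working from a human voice rather than from scratch.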

But Aaron says Blount and everyone at NPR can stop worrying about robots replacing humans anytime soon.

"I think it's an amazing technology, but even so, I don't think we're anywhere near reading a novel out loud in a meaningful way," he says. "It requires a deep text understanding, and the technology isn't even close to that right now — and I don't see that happening five years from now, either."

Even so, the technology to give synthetic actors different emotions is improving. To demonstrate, Aaron plays a female text-to-speech voice reading "These cookies are delicious" in a monotone, followed by "These cookies are delicious!" in an excited tone.

"We brought the actor into the studio and told her to read 1,000 sentences that were recorded in an upbeat voice," he explains.


