How Does The Brain Decode Speech? What happens to words after the ear picks them up? Neuroscientist Sophie Scott of University College London discusses the latest theories of speech perception, from how the brain recognizes a familiar voice to how it adjusts to each speaker's unique pitch and accent.

How Does The Brain Decode Speech?



This is SCIENCE FRIDAY from NPR News. I'm Ira Flatow. A little bit later, a look at the microbes in your belly button. But first, remember that TV show "Name That Tune," where contestants would listen to just a few notes of a song and ring in with the right answer? I can name that tune in three notes. Remember that?

Well, how does the brain do that? How do you recognize someone's voice, for example, the second you hear it over the phone, almost instantaneously, or like my voice now?

We've got a pretty good handle on how the sound of a voice gets to your ear, but what happens in the tiny electrical circuits of your brain to make you understand what you hear? Some answers to that question were published this week in a paper in the journal Nature Neuroscience, and here to talk about it is Sophie Scott, professor at the Institute of Cognitive Neuroscience at University College London in the U.K. She's on the phone with us from London. Welcome to SCIENCE FRIDAY, Dr. Scott.

Dr. SOPHIE SCOTT (University College, London): Hello, hi.

FLATOW: Hi, how are you?

Dr. SCOTT: I'm fine, thank you, fine.

FLATOW: Let's talk about what happens. We know the sound goes into my ear, it gets converted into electrical signals.

Dr. SCOTT: Yeah, absolutely. Sound itself is the vibration of air molecules. So the first thing your ear has to do is turn that mechanical vibration into an electrical signal, and that's what your ear is doing. The outside of your ear is focusing the sound in, and then structures called hair cells transfer the movements of the air molecules into an electrical signal, which is sent up to your brain.

FLATOW: And how does it get to the brain?

Dr. SCOTT: Well, it goes up what's called the ascending auditory pathway, which means there are lots of little relay stations that do quite extensive processing on the signal produced by the sound. A lot of that processing is actually cleaning up the sound: getting rid of echoes, sorting out sounds that have come from the same areas in space. So by the time it projects up to the cortex - the grey, overlying mantle of the brain, where all the important processing goes on - you have a representation of the sound which can then be processed for its identity and its properties.

FLATOW: But I remember from my days in electrical engineering, when I was studying that, that you could take sound apart into its component frequencies. Does our ear do the same thing there?

Dr. SCOTT: It seems to. One of the things that you can see when sound is first being represented in what's called primary auditory cortex, when it's coming up to the cortical level of the brain, it's actually being represented in terms of its frequencies. So you find cells that respond to sounds of increasingly higher frequency or increasingly lower frequency, just going up and down the scale, just like, say, notes on a piano keyboard.
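The frequency decomposition Flatow alludes to from electrical engineering can be sketched with a discrete Fourier transform. This is only a loose engineering analogy for the tonotopic map described above, not a model of the cochlea; the tone frequencies and sample rate below are arbitrary illustration choices.

```python
import numpy as np

# Build a 1-second test signal: two pure tones at 440 Hz and 880 Hz,
# sampled at 8 kHz (all values are arbitrary choices for illustration).
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

# The FFT takes the waveform apart into its component frequencies,
# loosely analogous to the frequency-ordered cells of primary
# auditory cortex responding along the scale.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# The two strongest spectral peaks recover the tones we put in.
peaks = sorted(freqs[np.argsort(spectrum)[-2:]])
print(peaks)  # → [440.0, 880.0]
```

With a 1-second window at 8 kHz the frequency bins fall on exact 1 Hz steps, so both tones land cleanly on single bins.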

FLATOW: And is that what happens when it recognizes my voice versus someone else's voice?

Dr. SCOTT: That seems to involve slightly more processing. What we find is that as we move away from these primary auditory areas, you find brain areas in humans which become selectively more interested in particular high-level properties of the sound. So as you move forward down what's called the temporal lobe, where we have auditory cortex in humans, on the right side of the brain, you start to find brain areas that are really interested in voices - things that could be someone speaking - and within that we find brain areas which seem to be very important for recognizing who is speaking. This seems to be the first point in the brain where speaker identity is processed.

FLATOW: But another thing that happens when we hear something is we sort of point our head, right? We locate where the sound is coming from.

Dr. SCOTT: Absolutely.

FLATOW: Does the brain process that the same way?

Dr. SCOTT: That seems to be a very basic process of orienting to things in your environment. Sound is very good for saying, hey, something's going on here, pay attention to me, and then you turn your head towards it. We actually use the direction our head is pointing to help us work out where sounds are, and we locate sounds best when we are oriented towards them because, literally, that's the direction in which our ears are pointing.

So that's actually helping us pay attention to the sound, and it's also helping us locate the sound in space.

FLATOW: The paper that you wrote, published this week in Nature Neuroscience, what did you - what information did you add to this whole idea of what happens to sounds in our brain?

Dr. SCOTT: Well, what that paper was trying to do was to bring together evidence that we have just of generally how primate brains deal with sound and trying to see if we can actually use that to help us understand how human brains deal with hearing other people talking, because kind of classically, people would say, well, we can't learn a great deal from looking at non-human primates about human language because they don't have language. We are the only species that has spoken language.

FLATOW: And what did you find out?

Dr. SCOTT: Well, what we found - and this was a review paper, looking across a number of different experimental studies over the past decade - is that there are actually quite a lot of commonalities: in terms of how sound gets into the brain, humans and non-human primates are very similar.

So although many properties of the human brain are very different from those of non-human primates, it seems that the basic wiring for how we do early perceptual processing of the things we hear is actually very similar.

FLATOW: And can we apply what we've learned to people?

Dr. SCOTT: It seems that we can. What we're finding in humans, as in non-human primates, is that there are brain areas which are very interested in communication calls - for humans that's things like speech and emotional vocalizations - and that this has a similar pattern of processing to what you'd find in the brain of a non-human primate listening to the communication calls made by other monkeys.

So monkeys are very vocal. They use lots of communication calls, and it seems that one of the things about primate brains in general is that they are cued up to do what we call hierarchical processing. They take the auditory information and refine it to get to a higher-level representation - the meaning of a call, if you like.

FLATOW: So would an ape, then - would the animal brain have the same basic wiring necessary to decode speech that we have?

Dr. SCOTT: Well, I wouldn't want to go as far as to say it would be exactly the same. I think the basic wiring system is the same - the way the information is getting in.

Now, of course, how it's being processed along the way could be different, and human speech is not like the communication calls of other animals. It is much, much more complex - comfortably the most complex sound we encounter in nature - so how it's processed is bound to be more complex, but the general route, if you like, is still the same. And then of course it's processed very differently once the words have got into the brain. At that point we can engage an entire language system to help us understand speech, and that's probably where the analogy starts to break down.

FLATOW: Well, how much overlap is there between speech perception and speech production?

Dr. SCOTT: Interestingly, quite a lot. So what we find in human adults is that the speech we learn, the language we learn to acquire as a child has an enormous influence on the sounds that we can hear and produce as adults.

So if we as adults encounter a language which uses a particular sound contrast that's not in our own language, it can be quite difficult for us to learn to hear. The classic example is the sounds ruh and luh. In English, those are two different sounds, so the words red and lead are two different words. In Japanese, ruh and luh are two versions of the same sound, so the word red and the word lead don't sound like two different words, and it can therefore be difficult for Japanese listeners to hear that difference - both to hear it and to produce it.

FLATOW: So then speech recognition and speech production are related. So in a way, I would guess, that as I'm speaking to you I'm also listening to myself speak at the same time.

Dr. SCOTT: Absolutely. And that actually has some quite interesting properties, because one of the things we're finding is that when you speak, your brain responds differently to the sound of your own voice than it does to the sound of other people talking. Activation in auditory areas is actually suppressed when you hear yourself talking.

So it's as if you're turning off the brain areas you would use to listen to somebody else's voice when it's your own voice coming through. This might be your brain telling itself that this is you talking - don't worry, this is me - or it could also assist in monitoring the sounds of your own speech. Some people argue this is how you check that what you're saying sounds right.

FLATOW: So does that mean I'm not listening to the other person when I'm speaking, if I'm turning that brain off?

Dr. SCOTT: It probably means you shouldn't talk at the same time as somebody else, but it certainly seems to be one of the reasons you never hear your own voice the way you hear other people. Everyone's always a bit horrified and surprised when they hear their own voice on a tape recorder, and one of the reasons is probably that our brains never process it that way when we normally hear it.

FLATOW: So is there any sort of similar thing when you learn to play an instrument? You know your fingers are doing something.

Dr. SCOTT: You would have to predict that that would be very similar, and certainly you find it for other things. If you touch your own arm, you get a different response in your brain - a much smaller response - than if somebody else were to touch your arm. It's the old thing: you can't tickle yourself. So generally, for actions on the world which are produced by you, you have a different neural response than if they're produced by somebody else. And of course it's very important to know whether things have been produced by you or by someone else.

FLATOW: You know, when you're at the piano and you fool around and you say, that's not the right note, and you fish around for the right note and find it - is your brain helping you do that by recognizing the sound?

Dr. SCOTT: Absolutely, and interestingly, what we seem to be finding, and I'm talking about this at a very general level, is that we get very parallel systems for processing musical properties and for processing sort of the linguistic properties of what we hear. So if you like, to put it very crudely, the left side of the brain is very good at spoken language, and the right side of the brain is very good at dealing with musical pitch information, changes in pitch information.

FLATOW: So how about if that is one of our senses, is there a similarity in how vision might be working too?

Dr. SCOTT: Well, in a way, a lot of what we're doing in looking at how the auditory system is instantiated in primates is tugging at the coat-sleeves of what's been going on with vision, because visual areas have been described in much more detail since the 1970s.

It's taken us a long time to catch up, but it does seem that at one level you can start to see the same patterns - very crudely, different streams of processing, different ways that visual information is handled. So when you see something in the world, part of your brain is processing what that thing is, and part of your brain is processing how you would interact with it: How can I pick it up? How big is it relative to my hand? How would I move towards it?

And it seems that, very crudely, what we're seeing is something quite similar in the auditory areas processing speech. When you hear somebody talking, part of what you're doing is processing the meaning of what they're saying, and part of what you're doing is processing a kind of sensory-motor representation, (unintelligible) sort of simulate how you yourself would say that. So you process speech both as a sound and as an action, and at one level that seems quite similar to the visual system.

FLATOW: You know, as you speak about sound, I'm reminded of Oliver Sacks' writings about musicophilia and people who are just terrific at playing musical instruments, and parts of the brain that might be dedicated to sound like that.

Dr. SCOTT: There are two basic things going on. You can describe brain areas which are very important for processing music. For example, patients who have had temporal-lobe lesions on the right can have great difficulty processing pitch change, and as a consequence can find it very difficult to enjoy music.

But you can also find studies that have looked at the expertise elements of that, and some studies have reported that the brains of really excellent musicians are actually different in terms of their overall anatomy. They have differences in the structure of their auditory areas, which may be because they've practiced a great deal at playing a musical instrument, but there's also a suggestion that there might be something that has predisposed them to be good at playing music.

FLATOW: Dr. Scott, I want to thank you for taking time to be with us today.

Dr. SCOTT: It's been a pleasure. Thank you very much.

FLATOW: Have a great weekend. Sophie Scott, professor at the Institute of Cognitive Neuroscience at University College, London. We're going to take a break and change gears, come back and talk about something totally different, about - well, it's bacteria on your skin, about how many there are. You'd be very surprised. Stay with us. We'll be right back after this break.

Copyright © 2009 NPR. All rights reserved. Visit our website terms of use and permissions pages at for further information.

NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.