The Future Of Computer-Generated Sound Effects

Despite advances in computer-generated image technology, most sound effects for films and virtual reality experiences are still recorded manually. Doug James, associate professor of computer science at Cornell University, is working on technology that would make sounds just as easy to render with computers.

Copyright © 2011 NPR. For personal, noncommercial use only. See Terms of Use. For other uses, prior permission required.

IRA FLATOW, host: You're listening to SCIENCE FRIDAY. I'm Ira Flatow.

Let's see - quick quiz - if you can recognize these sounds.

(SOUNDBITE OF VARIOUS SOUND EFFECTS)

FLATOW: All right. Did you get them? First one was a fire, right? Second one, a bubbling brook. And the third one is something breaking, right? Easy? None of these sounds, though, are real. None. They've all been created by a computer. You know, we're used to seeing computer-generated images in film and virtual reality. It's so good that a lot of the time you're not even aware it's computer-generated. But what about the sound effects that go with them? You might be surprised to learn that a lot of sound effects we hear in movies are still stuck in the age of the hand-drawn animation that we don't see anymore either.

My next guest created those sounds you've heard. He's working on computer models to take sound effects into the future for use in movies or videogames. Someday you might be able to animate any kind of sound from scratch right there on your laptop. No pre-recorded sounds required. That's what we'll be talking about.

Our number is 1-800-989-8255. Would you like some of these pre-recorded sounds? Would you like to make some of them your own? How do they do that? Well, that's what we're hoping that Doug James tells us. He's an associate professor in the department of computer science at Cornell University, and he joins us from the CBC studios in Vancouver. Welcome to SCIENCE FRIDAY.

DOUG JAMES: Hi, Ira. Good to be here.

FLATOW: How easy is it to make those sounds from scratch, or how hard is it?

(SOUNDBITE OF LAUGHTER)

JAMES: I guess it's harder for some people than others, perhaps. But I think a lot of the sounds are pretty tricky to make, especially one of the first things we often do when we're trying to simulate a new sound phenomenon is just try to understand where the sound even comes from. For example, where does sound come from when you pour a glass of water? Where is that sound coming from?

FLATOW: Mm-hmm. Let's - I'm going to play those sounds individually again one at a time, and let's talk about how they were made and how difficult they were to make. Let's go to sound number one.

JAMES: OK.

(SOUNDBITE OF BUBBLING WATER)

FLATOW: That is so real. Bubbling water, right?

JAMES: Yes.

FLATOW: So real-sounding. How do you craft - how do you tell a computer to make the sound of bubbling water?

JAMES: Well, the first thing you should probably do is simulate some water. So in that case, you know, my student and I, we simulated a flow of liquid. And then as the liquid starts moving along, it can trap air bubbles inside it. And so when these air bubbles are trapped, these suckers vibrate really fast. They're really stiff, and that's the reason that the water can actually make all those audible sounds. It's actually really hard for water to make much sound at all without bubbles. And so once it traps these bubbles, they vibrate, and these vibrations make the surface of the fluid vibrate, which makes the air vibrate, which produces the sound. So what we do...

FLATOW: Yeah, go ahead. Sorry.

JAMES: So what we do is we simulate this flow with the bubbles, and we figure out how all the bubbles - there can be thousands of them - vibrate the fluid and produce sound. And we have a fast algorithm to evaluate that in parallel.
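The idea James describes here, each trapped bubble ringing like a tiny oscillator and all of them being added together, can be sketched in a few lines of Python. This is only an illustration, not the Cornell group's algorithm: it assumes the textbook Minnaert formula for the resonant frequency of a spherical air bubble in water, and the decay rate, the function names and the bubble list format are hypothetical.

```python
import math

GAMMA = 1.4      # heat capacity ratio of air (assumed)
P0 = 101_325.0   # ambient pressure in pascals (assumed: 1 atmosphere)
RHO = 998.0      # density of water in kg/m^3 (assumed)

def minnaert_frequency(radius_m):
    """Resonant frequency in Hz of a spherical air bubble of the given radius."""
    return math.sqrt(3.0 * GAMMA * P0 / RHO) / (2.0 * math.pi * radius_m)

def mix_bubbles(bubbles, duration_s=0.5, sample_rate=44_100, decay_per_s=200.0):
    """Sum one decaying sine tone per bubble into a single audio buffer.

    bubbles is a list of (radius_m, start_s) pairs. decay_per_s is a
    made-up damping rate chosen only so that each tone fades out.
    """
    n = int(duration_s * sample_rate)
    out = [0.0] * n
    for radius_m, start_s in bubbles:
        freq = minnaert_frequency(radius_m)
        first = int(start_s * sample_rate)
        for i in range(first, n):
            t = (i - first) / sample_rate
            out[i] += math.exp(-decay_per_s * t) * math.sin(2.0 * math.pi * freq * t)
    return out

# Two hypothetical bubbles: a 1 mm one right away, a 0.5 mm one 0.1 s later.
audio = mix_bubbles([(0.001, 0.0), (0.0005, 0.1)])
```

Under these assumptions a 1-millimeter bubble rings at roughly 3.3 kHz, and smaller bubbles ring at higher pitch. Because each bubble's contribution is computed independently before being summed, the per-bubble work is the kind of thing that can be spread across processors.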

FLATOW: So you're telling me that, mathematically, you create formulas that mimic how water flows, or...

JAMES: That's right.

FLATOW: And then the - so you have to figure out what vibration these formulas will make to make it sound, as(ph) we're listening to it, to make it sound like water and then sound like bubbling water.

JAMES: That's right. It's a kind of computational realism where we're trying to paint everything, you know, just as it is.

FLATOW: So you have to know, then, a lot about water.

JAMES: Well, you have to know some things. Water, I mean, is very complicated. There's all kinds of obscure physics that it can have in different circumstances. But for sound, the most important thing is really how it traps the air bubbles.

FLATOW: Now, when you record sound of bubbling water, which I imagine you did a lot of in this experimentation...

JAMES: Yeah.

FLATOW: ...how close does it compare to the...

JAMES: Well, we - yeah.

FLATOW: ...to what you come up with?

JAMES: Well, there's always - the real sounds are much more complicated than we can currently do. I mean, this is essentially the first paper that tries to simulate those things. But for simple cases like a water drop falling into a steady pool, we can capture the frequency shift of the bubbles to some reasonable accuracy, where, you know, people have difficulty telling if it's real or not real, as you demonstrated. As the flows get more and more complicated, then it's difficult. And for very large simulations like, you know, pouring - you know, dumping a car into a swimming pool or something, it would be quite challenging to simulate those using our current methods, but we'll get there.

FLATOW: How much - you know, and that was my next point, is getting there. How much computing power does this require to make some of the more difficult ones?

JAMES: Well, I would say that the computing power is on par with the cost of doing most of the physical simulations in the first place without the sound. And often, the sound is more parallel. So for some things, you may find that - well, I should be more specific. But for some of the fluid simulations, it may take a few hours for a larger simulation. And then to add the sound to it, which can be done in parallel, can, again, take a few more hours. But there's all kinds of speed-accuracy tradeoffs that we can make to make things essentially real-time. The question is: What kind of quality are we willing to live with?

FLATOW: You know, my question is, as someone who studied sound for 40 years in the radio business, is why do this? Why can't you just take some bubbling noises out of my fish tank and record them and...

JAMES: You can do that.

FLATOW: ...instead of spending hours and hours of putting this together on a computer simulation?

JAMES: Well, just - every time there's water moving around, we'll just play that bubbling fish tank sound, and we're done.

(SOUNDBITE OF LAUGHTER)

JAMES: Right? There you go. Problem solved.

(SOUNDBITE OF LAUGHTER)

FLATOW: Is there anything to actually allow us to - you can massage these bubbles, I imagine, you know, massage the sound and make it sound like different bubbles. And now you can - you can insert them into computer games if you'd like, I imagine.

JAMES: That's right, yup. So, I mean, yeah. I mean, there's - it's a physical model. It's essentially like a physics-based symphony, where you have multiple tracks for every little bubble in town, and you have all these different sound sources. And so one of the challenges with sound synthesis is essentially, you're simulating a virtual world and you have all these different sound sources that you will need to resolve. And water bubbles are just one set of players.

FLATOW: Let's go to Jamie in Medford, Oregon. Hi, Jamie. Welcome to SCIENCE FRIDAY.

JAMIE: Hi, Ira. Well, first of all, this is not serious, but I'm going to pretend like it is. I am livid. This man is going to put a lot of Foley artists out of work. Foley artists are people that - that post-production is what they do. If someone doesn't like the way a certain car crash or a saloon fight works out, they do it all in real time, and they do it with the movie going. And they make up the sounds, so it sounds a lot better when you watch it in the theater than it would when they actually did it at the same time.

FLATOW: Yeah. They put the footsteps in, the hoof beats, all those kinds of things.

JAMIE: Yes. Sound effects people, yes.

FLATOW: So he's not really, Jamie...

JAMIE: So what is he going to say about that?

(SOUNDBITE OF LAUGHTER)

FLATOW: Unserious...

JAMIE: I'll take the answer off the air.

FLATOW: Thank you. Unseriously, he's asking that question. Do you have any pangs about that?

JAMES: Well, I think - I mean, I know a number of people in the sound design industry. I mean, these people are some of the most creative people I've ever met. I don't think you can put them out of business with this technique. I mean, I think, essentially, this is a method currently which is not nearly good enough to replace the kind of real sounds that you can generate in a lot of these offline, post-production scenarios, where you can take quite a bit of time to get it just right.

And in some sense, what the computer models do, you know, is basically give you an automatic way to have the computer generate sound for a computer-simulated world. Most of the sounds that people are Foleying, for example, are for live-action films. So, you know, like, if you look back to what Jack Foley did for, say, "Spartacus" or something, getting the, you know, swords and shields clanging, those were all live-action shots. There are no, you know, 3D models of anything, right?

So it's essentially only for computer-simulated virtual worlds and content in movies, which, of course, is pretty much every film in the top 10 these days, that this is really appropriate.

FLATOW: You know, using the visual model again in movies, our eyes are tricked by the camera, right? I mean, we're - there are a lot of frames going by, and we see them as a continuous picture. And so there are those blank spaces that we fill in. Do - does our ears - do our ears fill in that, too, with your audio? Do you - can you play shortcuts? Can you make a shortcut and trick our ears into thinking we're hearing something, because we expect to hear that kind of sound?

JAMES: Right. Yeah. There's the old adage that, you know, see-a-dog, hear-a-dog kind of a, you know, thing. But I think, you know, sound fills in the spaces between the frames. It gives us a sense of space off-frame. And there's, you know, people like - I saw a talk by Ben Burtt, who is this famous sound designer for "Star Wars" and many other films like "Wall-E." And he - they have such a deep sense of what sound can do to your sense of what you see, and so on.

So, I mean, it - the important thing for animation, for simulated content is that we basically only see, essentially, 30 frames per second or 60 frames per second. And there's a lot that can go on temporally between that. And sound tells you, you know, it tells you that things chatter when they fall. It tells you if an object is hollow or if it's solid. And all these little cues are really critical for filling in all this detail and giving you this richness, which current video games, you know, in between the, you know, gunfire and grunting (unintelligible) are essentially missing.
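The gap between picture rate and sound rate that James points to is easy to put in numbers. The sketch below assumes an audio rate of 44,100 samples per second (the CD standard, which is not stated in the interview) and a 3,000 Hz tone picked only as an example; the function names are hypothetical.

```python
def samples_per_frame(sample_rate_hz, frames_per_second):
    """Number of audio samples that pass while one video frame is shown."""
    return sample_rate_hz / frames_per_second

def cycles_per_frame(tone_hz, frames_per_second):
    """Number of full vibrations of a tone during one video frame."""
    return tone_hz / frames_per_second

AUDIO_RATE = 44_100  # assumed audio sample rate in Hz

print(samples_per_frame(AUDIO_RATE, 30))  # 1470.0
print(samples_per_frame(AUDIO_RATE, 60))  # 735.0
print(cycles_per_frame(3_000, 30))        # 100.0
```

In other words, under these assumptions a single frame of 30-frames-per-second animation spans well over a thousand audio samples, which is the temporal detail that the picture alone cannot show.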

(SOUNDBITE OF LAUGHTER)

FLATOW: Let's listen to one - well, let's listen to, again, to one of the sounds we heard before.

(SOUNDBITE OF SHATTERING)

FLATOW: That was a breaking of a - it sounded to me like ceramic, something - a cup or a saucer or something like that.

JAMES: Yeah. One of them was a plate, I believe, and probably a piggybank. They got clipped off a bit at the end. But those are ceramic and glass objects.

FLATOW: And as you say, you have to craft it to, not only the object being ceramic, but the surface that it falls onto, right?

JAMES: That's right, yeah.

FLATOW: If it's a hard or soft or wood or stone or something like that.

JAMES: Yeah. So when you have objects colliding, you know, you say you always hear both objects, essentially, right? If you dropped something, a penny on the ground - a penny may actually make very little sound, you know. You can take two pennies and bang them against each other. They're not very loud. But if you drop them on a piece of glass or something, then the glass makes sound. So every collision has two objects producing sound.
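The point that both objects in a collision contribute to what is heard can be illustrated with a small modal-synthesis sketch, in which each object is treated as a handful of decaying sine tones and the two results are added. This is a common simplification and not necessarily the method used in James's work; every frequency, decay rate and gain below is an invented placeholder, chosen only so that the penny is quiet and short-lived and the glass is loud and ringing.

```python
import math

def ring(modes, n_samples, sample_rate=44_100):
    """Sum of decaying sinusoids, one per (freq_hz, decay_per_s, gain) mode."""
    out = []
    for i in range(n_samples):
        t = i / sample_rate
        out.append(sum(gain * math.exp(-decay * t) * math.sin(2.0 * math.pi * freq * t)
                       for freq, decay, gain in modes))
    return out

# Hypothetical mode tables (not measured values).
PENNY_MODES = [(12_000.0, 900.0, 0.05)]                     # faint, dies quickly
GLASS_MODES = [(1_200.0, 8.0, 0.6), (2_750.0, 12.0, 0.3)]   # loud, rings on

def collision_sound(modes_a, modes_b, n_samples, sample_rate=44_100):
    """The impact excites both objects, so the result is the sum of both rings."""
    a = ring(modes_a, n_samples, sample_rate)
    b = ring(modes_b, n_samples, sample_rate)
    return [x + y for x, y in zip(a, b)]

penny_on_glass = collision_sound(PENNY_MODES, GLASS_MODES, 2_000)
```

With these placeholder values the glass dominates the mix, which matches the penny-on-glass example: the penny itself contributes little, and most of what is heard comes from the surface it hits.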

FLATOW: You know, being in the sound business my whole life, I cannot help but hear every sound around me. Are you the same way? Do you listen, unavoidably, to all the sounds? (unintelligible)...

JAMES: Yeah. It can be difficult, sometimes. Yeah. Yeah. Especially when, you know, when we're working on specific projects, if you're focused on, like, contact sounds or, you know, bubble sounds, I hear, like, you know, bubbles everywhere. And it's kind of, you know, I try to guess the pitch of them roughly or something. So...

FLATOW: Right. Yeah. Good luck to you. It's fascinating. We'll be listening for more of your sounds.

JAMES: That's great.

FLATOW: Thank you, Doug. Doug James is an associate professor of computer science at Cornell University and published a paper about all these sounds.

Copyright © 2011 NPR. All rights reserved. No quotes from the materials contained herein may be used in any media without attribution to NPR. This transcript is provided for personal, noncommercial use only, pursuant to our Terms of Use. Any other use requires NPR's prior permission. Visit our permissions page for further information.

NPR transcripts are created on a rush deadline by a contractor for NPR, and accuracy and availability may vary. This text may not be in its final form and may be updated or revised in the future. Please be aware that the authoritative record of NPR's programming is the audio.
