How The X-Box Kinect Tracks Your Moves
Kinect uses depth sensors, cameras and microphones to track the movements of players, and it's surprisingly good at weeding out distractions. Ira Flatow and guests discuss the development of the gaming technology — and how movement can influence players' moods.
Shannon Loftis, studio manager, Microsoft Game Studios, Good Science Studio, Redmond, Washington
Alex Kipman, director of incubation, Xbox, Microsoft, Redmond, Washington
Katherine Isbister, professor, computer science and digital media, New York University's Polytechnic Institute, Brooklyn, New York
IRA FLATOW, Host:
You're listening to SCIENCE FRIDAY. I'm Ira Flatow.
Up next, technology of a different kind. I don't know if you've had a chance yet to play with an Xbox Kinect, but it's pretty amazing. One of those moments where you think, you know, the future has arrived because you're just standing there in front of a big screen and moving your body to play the games and there's no controller. There's nothing for your thumb. There is nothing for you to hold onto in your hand. There's no hammering on those little plastic buttons.
In one of the games, you can learn new dance moves and Kinect tracks your hips and your hands and your feet and tells you if you're really hitting those moves or if you're not. And then, in another game, you speed through an obstacle course. You're jumping and ducking and sliding out of the way of walls and bars. But how does Kinect know where you are without holding onto any of these toys or stuff like that? How does it track your body motion fast enough to make that avatar on the screen do exactly what you're doing right there in your living room or your basement?
And not only that, could we use all this motion-sensing technology for other things? Think about it. Maybe it will improve the way we interact with our computers at work, so now maybe you're standing in front of your computer "Minority Report"-style. You know, your hands are, like, up in the air. And just by waving your arms, you're pulling documents, you're moving them around on your desktop, maybe you can point at them, like a little finger and they open up. All kinds of stuff. You read your e-mail that way. No need to touch your mouse anymore, right? How close are we to any of these ideas?
Maybe you've got some ideas you'd like to share with us for computing of the future. Our number, 1-800-989-8255. 1-800-989-TALK. You can tweet us @scifri. We're going to talk about this future and what the Kinect box - how it works. Let me introduce my guest. Katherine Isbister is a professor of computer science and digital media at New York University's Polytechnic Institute in Brooklyn. She's here in our New York studios. Welcome...
KATHERINE ISBISTER: Thank you.
FLATOW: ...to SCIENCE FRIDAY. You're welcome. Shannon Loftis is the studio manager at Microsoft Game Studios and Good Science Studio in Redmond, Washington. She joins us at Microsoft. Welcome to SCIENCE FRIDAY, Ms. Loftis.
SHANNON LOFTIS: Thank you.
FLATOW: You're welcome. Alex Kipman is director of incubation for Xbox at Microsoft in Redmond. He also joins us from that studio. Welcome to SCIENCE FRIDAY.
ALEX KIPMAN: Thank you for having us.
FLATOW: You're welcome. Let me ask you, Shannon. You game designers, you're thinking about joysticks, A-B-X-Y, how do you break free of all of that and design a game based on body movements? What do you have to give up and what do you gain?
LOFTIS: It's mostly what you gain. When you're designing a game for a controller-driven experience - think about it in terms of inputs and outputs. So when somebody presses a button, they expect to see a character reaction on screen. Now, what that does is it limits the experience to the number of character reactions that you can program into the game.
LOFTIS: Without a button input, it's limited only by what the human body can do. So what designing for Kinect really does is it creates an infinity where there was a set of limited states before.
FLATOW: Mm-hmm. And now you're unlimited in what you can do.
LOFTIS: Limited only to what the human body can do and what we can imagine with it.
FLATOW: Well, let's talk about the technology of how - how does it keep track of where all our body parts are without having something that we're holding on to? Alex?
KIPMAN: As you talk - yeah, absolutely. As you talked about, what we were trying to do is bring something new to the Xbox 360. Our objective was to really transform it from this old world where we had to understand technology and really usher forward us into this world where technology disappears and it fundamentally understands us. Right?
In our world, it's about stepping in front of the sensor. The sensor not only tracks your full body, head to toe, but it also knows who you are. It knows the difference between you, your family and your loved ones as well as it understands your human speech and understands how we'd like to say that if you can see it, you can say it.
For the human body tracking, we had to re-think how to approach development for something of this nature. There is a reason why this level of technology or this level of innovation hasn't really existed before. And that reason is, because if you think about it, traditional heuristic-based programming is a world of digital land. It's zeros and ones. It's yes and nos. It's trues and falses.
The real world, the world of humans, the world we live in is an analog world. It's a world where it's not about yes and no, it's about maybe. It's not about black and white. It's about gray. If you think about it, to understand and track humans - let's just think about joints on the human body. You have several different joints. Now multiply that number by the degrees of freedom that each one of those joints has. Multiply that by the different proportions of the human body, from the kid to the adult to the slim person to the not so slim person.
FLATOW: My hair is hurting.
KIPMAN: Multiply that - Yeah. Before you know it, you're going to get a search space that's going to be around 10 to the 23 in terms of space. That's 10 with 23 zeros next to it, which, as you can imagine, if I'm doing your traditional heuristic-based if-then-else programming, that's, you know, the population of Earth-worth of programmers working for the rest of their lives and you still can't do it, which is why this has remained in the space of science fiction. So...
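The multiplication Kipman describes can be sketched in a few lines. This is only a rough illustration, not Microsoft's calculation: it takes the 20 joints mentioned later in the interview and assumes, purely hypothetically, that each joint can be in one of 14 distinguishable poses independently of the others. Under those assumptions the count lands on the order of 10 to the 23 (written out, 10 to the 23 is a 1 followed by 23 zeros).

```python
import math

# Assumed figures for illustration only; the interview gives no breakdown.
JOINTS = 20             # joint count Kinect tracks, per the interview
POSES_PER_JOINT = 14    # hypothetical number of distinguishable poses per joint


def pose_count(joints: int, poses_per_joint: int) -> int:
    """Number of whole-body poses if every joint varies independently."""
    return poses_per_joint ** joints


total = pose_count(JOINTS, POSES_PER_JOINT)
print(f"{total:.2e} poses, about 10^{math.log10(total):.1f}")
```

Different body proportions would multiply the total further, which is why enumerating cases by hand does not scale.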
FLATOW: Right. Wow. I'm thinking about Avogadro for some reason, but go ahead. But how, in technical terms, how is - and let's get technical here. What is going on in that little Xbox device that's able to track all this stuff?
KIPMAN: Right. So we had to transition from this old world to the new world. The new world is a world of machine learning. It's a world where you are not writing in the sensor what it sees. You're teaching it how it can perceive the world. Kinect has a set of eyes and a set of ears. The set of eyes allow us to see the room, understand both visually and acoustically what's going on and that goes to the 360, to the Xbox 360 where there is the equivalent of the Kinect brain.
Now, for just the human tracking part, let's focus on that, it's a series of sophisticated algorithms that will range in nature from computer vision to machine learning, to imaging science, to a series of other ones. The key innovation is in the machine learning. And the way that you can think about that working, it works similarly to the human brain.
If you think about, you know, when a baby's born and you show this baby a lion and a person and you ask the baby, if it could speak, can you tell them apart? The baby would not be able to do it. Fast forward in time and ask that same baby that same question, you would have instantaneously the ability to discern the difference between a lion and a person. Why? Because it has historical data. It has learned. It has burnt the pathways in his or her brain about being able to discern that. Now, show that same baby a male and a female. It won't be able to do it. Fast forward in time, you'll have no problems.
FLATOW: It's learning. It's learned it.
KIPMAN: So it's learned it.
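The idea of learning from "historical data" rather than hand-written rules can be shown with a toy example. The sketch below is a generic nearest-centroid classifier, not the algorithm Kinect uses (the interview does not describe it), and the features and numbers are made up: it averages the labeled examples it has seen and then labels a new observation by whichever average is closest.

```python
from statistics import mean


def train(examples):
    """Average the feature vectors seen for each label (the 'historical data')."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: tuple(mean(col) for col in zip(*rows))
            for label, rows in by_label.items()}


def classify(centroids, features):
    """Pick the label whose learned average is closest to the new observation."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))


# Made-up features: (height in metres, number of legs on the ground)
examples = [
    ((1.7, 2), "person"), ((1.8, 2), "person"),
    ((1.1, 4), "lion"), ((1.2, 4), "lion"),
]
centroids = train(examples)
print(classify(centroids, (1.75, 2)))
print(classify(centroids, (1.0, 4)))
```

Before `train` has seen any examples there is nothing to compare against, which mirrors the newborn in the analogy; adding more labeled examples is what lets finer distinctions be drawn.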
FLATOW: Yeah, Katherine Isbister, you've been doing studies on one of the competitors to the Xbox and the Kinect, on Wii and how it makes people feel when you play. Tell us about that because I'm sure this is the same thing that's going to be going on with Kinect.
ISBISTER: Yeah, well, we got really interested in the fact that some of the Wii games looked like they were more fun than others for players. And we knew about this effect called the physical feedback hypothesis from social psychology and essentially...
FLATOW: Physical feedback hypothesis.
ISBISTER: Yeah. Essentially, the idea is, it's basically why your mom told you to stand up straight. Because if you stand up straight, you start to feel like you're confident and then you become more confident. So the idea was, well, hey, maybe these Wii games actually are putting people into these really jolly, silly, happy kinds of wonderfully upbeat states by forcing them to run around and jump around as if they feel happy, so they start to feel happy.
So in our lab, we've been building these research games that allow us to compare keyboard control versus Wii-mote control and we've actually found that this is the case. And not only that, but it brings people closer together so they feel a greater sense of social connectedness after they play around with these movement-based games.
FLATOW: Because they like the - so the movement-based makes them feel happier, moving around like that.
FLATOW: Huh. Shannon Loftis, does that go into your game planning when you're making games? I know you're making family-oriented games here.
LOFTIS: Absolutely. The games are - they're family-oriented, but they're not limited to families. They're really for anybody who wants to have fun.
One of the benefits of Kinect is you get up, you're off the couch, you use your body to play the games in completely intuitive ways. I think Dance Central is actually a great example of a super positive feedback loop. I'm standing up - most people feel a little bit embarrassed about dancing in front of others, at least I do.
FLATOW: Katherine is shaking her head also so we can all agree on that.
LOFTIS: And I stand up - yeah. I stand up and I start playing. And rather than showing me myself dancing, I'm looking at a very cool character moving beautifully on the screen and yet I'm getting feedback about whether or not I'm matching the movements of that character. And the better my feedback gets, the more confident I get and the better I become as a dancer.
FLATOW: Hmm. It's interesting. Because we all create avatars of ourselves on these things, but we don't want to - and tell me if this is true because I feel this way, Katherine and Shannon - we don't want to put a real picture of ourselves up there. We don't want to look at ourselves. We'd rather create something that's better looking in the avatar world.
ISBISTER: Well, if you think about it, we're always controlling our self-presentation anyway. We're always trying to put our best foot forward. And giving somebody a little control over how they craft their avatar allows them to customize the person they want to be for others and also for themselves. So it makes a lot of sense that, given the choice, we would want to augment ourselves with these virtual characters and make ourselves even cooler in our own minds than maybe we are in reality.
LOFTIS: And yet the Kinect gaming experience is also sort of the ultimate personalization because every game is played the way that you want to play it. You can play aggressively, you can play strategically and you can play interpersonally.
FLATOW: Mm-hmm. Alex, can the Kinect actually identify individual users and maybe modify the game to counter strategies, you know, between players? I mean, if you - let's say you have a game that you play, tennis, and you know that some player has a strategy of always charging the net, will the Kinect be able to, you know, someday maybe - maybe it does it now - understand that strategy and suggest other strategy?
KIPMAN: Yeah, absolutely. So if you think about what Kinect brings to the table, it really brings a new palette, a new set of paint colors and paintbrushes around being able to identify who you are and understand what your profile looks like, being able to track your full movement, your head-to-toe movement, and use your voice.
That palette, those paint colors and paintbrushes, get used by our creative game designers, both within Microsoft Game Studios as well as across the entire ecosystem. And in a way, the thing that gets the - everybody that I've spoken to really excited about this new palette is that it allows them to tell brand-new stories. At the end of the day, everybody here is a storyteller. And what Kinect allows you to do is tell brand-new stories that haven't been told before.
So if you think about the fusion of being able to fundamentally understand humans and how you couple that with voice and identity, knowing who you are, what that allows you to do is create extremely personalized experiences that become significantly more emotional and immersive.
FLATOW: Hmm. 1-800-989-8255. Zach(ph) in Tulsa. Hi, Zach. Welcome to SCIENCE FRIDAY.
ZACH: Hi, Ira. Thank you.
FLATOW: You're welcome. Go ahead.
ZACH: I was wondering if anyone was aware of research with perhaps something like Kinect as it would impact the hard-of-hearing community, given that we have the capacity for a computer to read someone giving sign language and then reinterpret back to, perhaps on a teleconference, voice?
FLATOW: Interesting use for that.
KIPMAN: Yeah. There's actually a lot of research in that space. It's not something we support with Kinect today. But in the journey, the way to think about it is we started the journey this November 4th with the launch of Kinect. But as we move forward, you'll start seeing a lot of these experiences start happening.
And what I said in the beginning, I meant, right? This is a shift, monumental shift, where we move the entire computer industry from this old world, where we have to understand technology, into this new world, where technology disappears and it starts more fundamentally understanding us. Now, that world starts with Xbox, with Kinect in the living room across gaming and entertainment, but it's something that over time, that journey is something that's a lot more pervasive than just that.
FLATOW: We're talking about new frontier in gaming this hour on SCIENCE FRIDAY from NPR. I'm Ira Flatow. Katherine, do you think this - is he selling this too hard? Or you think it's going to change the whole world that way? Or is he right about it?
ISBISTER: No, I think he's really right, actually. I think that it's long overdue, really. We've started to have these sensors and cameras everywhere, and it's more a matter of the technologist figuring out...
FLATOW: What, what...
ISBISTER: Sorry - figuring out how to design for these new technologies well, to create interesting, new experiences for people. I mean, I like to think about things like email. I mean, I spend a good portion of my day sorting through email.
ISBISTER: I feel like I need reading glasses to see everything that's on the screen. And I'd much rather have sorting email be like doing tai chi in the morning.
ISBISTER: And I don't see any reason why it can't be with the things like the Kinect. Once they're in the living room, I could do that, instead of sitting at my laptop first thing. It would be great.
FLATOW: Yeah. What about taking it to other - you know, just to your PC or your laptop, Katherine or Alex or anybody else who wants to answer it? But, Shannon, what about, as I said, doing this sort of "Minority Report"-type of thing, where you're now standing in front of your PC or you're sitting there and you don't have to touch the mouse anymore and you can move things around on your desk because it's watching you?
ISBISTER: Yeah, absolutely. I mean, I think we're going to see blended systems that use things like the camera-based system Kinect has, with multi-touch screens and so forth. And I actually think maybe even my daughter's generation will think it's just preposterous that we ever used such strange things as keyboards and mice.
(SOUNDBITE OF LAUGHTER)
FLATOW: Is that on the drawing board, Shannon?
LOFTIS: We're playing around with quite a number of different user interfaces. I mean, we're also absolutely correct. This is the very beginning.
FLATOW: Mm-hmm. And...
KIPMAN: And there are several babies that were born during the development of the Kinect product cycle. And it's been quite interesting, because we've been working really hard on this. So we spent lots of hours in the labs. The babies tend to come to the labs, at which point they become testers of the software. And it's quite entertaining to see a lot of these babies today that are now, you know, two-, three-year-olds. They get in front of the TV, and with or without Kinect, they tend to wave to it, which is the gesture that we do to initiate the system.
I tend to hearken back to, you know, my "Star Trek" days, where in one of their last movies, you know, they come back to Earth of the '90s, and Scotty sees a computer and he picks - and he speaks to it. He says, computer, do blah. And then it doesn't work. And he says, oh, I've seen one of these in a museum one day. And then he picks the mouse and speaks to the mouse, and says, computer, do this. And obviously, it doesn't work again.
But this is the world that we are starting to usher. This is what I mean when I say we're taking science fiction and making it into a science fact. I do think that generations to come, and Kinect begins that journey, is a world that is much more natural. It's a world that people interact with technology, as they should, in a more natural way. And technology, by disappearing, allows us to connect people and have, essentially, technology be the lubricant for conversation. But the point is inter-social behaviors.
FLATOW: Let me get a quick phone call, and before we leave. Doug(ph) in Louisville. Hi, Doug.
DOUG: Hi. Good afternoon.
FLATOW: Hi there.
DOUG: I see this as a technology that could be used for someone to work remotely, perhaps at home. Let's say that they're in front of a monitor with a camera at the workplace, maybe boxes are coming off an assembly line and a robot is there at the place. They could use their arm movements to grab these boxes, no matter how much they weigh, and move them where they need to be while at home or wherever else.
FLATOW: I've seen a movie like this. Thanks for the call, Doug. Possibilities for that, Alex, you think?
KIPMAN: Yeah. Absolutely. What Kinect does is it tracks 20 joints in your body, and it does this in real time, all right? We do these calculations for everybody that's in the scene at 30 frames per second. As soon as you do that, you're capable of putting this on any other body. That could be a real body, robot arms, or that could be a digital imaginary body, which is what we do with the avatars inside our games today.
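Putting tracked joints "on any other body" can be pictured as a mapping from one skeleton to another. The sketch below is a simplified illustration under stated assumptions, not Kinect's actual pipeline: the joint names, the coordinates, and the uniform `scale` factor are hypothetical. It keeps each joint's direction from a root joint and changes only the distance, which is one basic way to drive a larger or smaller body from the same motion.

```python
FRAME_RATE = 30                      # frames per second, per the interview
FRAME_BUDGET_MS = 1000 / FRAME_RATE  # time available to process one frame


def retarget(skeleton, root, scale):
    """Map tracked joints onto a body whose limbs are `scale` times as long.

    Each joint keeps its direction from the root joint; only the distance changes.
    """
    rx, ry, rz = skeleton[root]
    return {
        name: (rx + (x - rx) * scale, ry + (y - ry) * scale, rz + (z - rz) * scale)
        for name, (x, y, z) in skeleton.items()
    }


# One hypothetical frame with three of the 20 joints, in metres.
frame = {
    "hip": (0.0, 1.0, 2.0),
    "head": (0.0, 1.75, 2.0),
    "hand_right": (0.5, 1.25, 2.0),
}
avatar = retarget(frame, root="hip", scale=2.0)
print(avatar["head"])
print(f"{FRAME_BUDGET_MS:.1f} ms per frame")
```

At 30 frames per second, whatever does the tracking and the mapping has roughly 33 milliseconds per frame, for every person in the scene.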
FLATOW: All right. Stay with us. We're going to come back and talk lots more about the future of computing. Our number, 1-800-989-8255. The gaming world this hour in SCIENCE FRIDAY. Don't go away. We'll be right back.
(SOUNDBITE OF MUSIC)
FLATOW: You're listening to SCIENCE FRIDAY. I'm Ira Flatow.
We're talking this hour about the Xbox Kinect, the technology behind the future of video game design, with my guests, Shannon Loftis and Alex Kipman of Microsoft, and Katherine Isbister, professor of computer science and digital media at New York University's Polytechnic Institute in Brooklyn.
FLATOW: 1-800-989-8255. We're getting a lot of tweets in on this. Of course, folks are interested. But one of the tweets that we hear over and over again is about the guy - I'll read the tweet. Are you going to talk about the guy who hacked the Kinect, the prize that was offered for the hack and why Microsoft didn't release it for the PC version yet? Shannon, Alex (unintelligible)...
KIPMAN: Ira, can I answer that one?
FLATOW: Yeah, go ahead.
KIPMAN: The first thing to talk about is Kinect was not actually hacked. Hacking would mean that someone got to our algorithms that sit on the side of the Xbox and was able to actually use them, which hasn't happened. Or it means that you put a device between the sensor and the Xbox for means of cheating, which also has not happened. That's what we call hacking, and that's why we have put a ton of work and effort to make sure it doesn't actually occur.
FLATOW: So what would you - okay.
KIPMAN: What has happened is someone wrote an open-source driver for PCs that essentially opens the USB connection, which we didn't protect by design, and reads the inputs from the sensor. The sensor again, as I talked earlier, has eyes and ears and that's a whole bunch of, you know, noise that someone needs to take and turn into signal. People...
FLATOW: So you left it open by design then? So you knew people could get into it.
KIPMAN: Yeah. Correct.
ISBISTER: And I just want to throw in that this is the kind of thing that's a dream for researchers like myself. I mean, I still haven't got an actual developer's kit for the Wii, but we use the open-source shareable inputs to the Wii-motes. And that's how we work with the Wii technology, and - so I was very heartened to see that the Kinect's actual hardware was going to be available soon for researchers anyway to put stuff together and test in the lab.
FLATOW: So you have no problem...
ISBISTER: As an...
FLATOW: ...with Microsoft, with the people using the open-source drivers then?
LOFTIS: As an experienced creator, I'm very excited to see that people are so inspired that it was less than a week after the Kinect came out before they had started creating and thinking about what they could do.
FLATOW: So no one is going to get in trouble?
KIPMAN: Nope. Absolutely not.
FLATOW: You heard it right from the mouth of Microsoft.
(SOUNDBITE OF LAUGHTER)
KIPMAN: And by the way, I...
ISBISTER: I'm really relieved.
KIPMAN: And I do also want to address, semi-quickly, the academic point. We will, sooner rather than later - and we're already doing a lot of this - continue to partner with academic places to make sure that this innovation does make it into academic circles, right? So we started this already with places like USC and other universities some time ago.
And now that the product has volume, we will start increasing that academic program, which we have through Microsoft Research, where at the end of the day, we're excited about this technology. This technology really allows us to do new things, and we wanted this palette to be available to...
KIPMAN: ...academics so that they can use the palette to create brand-new pictures we haven't seen before.
FLATOW: Let me ask a couple of questions that people have been asking, I'm wondering myself. First, I want to know what radiation you're using to scan us with. Is it infrared or is it sound? Is it microwave? What is it that's taking all our joints under consideration?
KIPMAN: Yes. We do have a light source. The light source is in the near-infrared range.
KIPMAN: And what it does is it's not really scanning anything, it's a constant projector. What it does is it allows us to see the room in any ambient light conditions. So this is so the technology can disappear and you don't have to understand ambient light conditions. You can play in pitch dark rooms, total cave-like scenarios. You can play in brightly lit rooms.
Because we have our own light source, we see based on that light source so that our sensors can see in any ambient light condition. And so there's no scanning...
KIPMAN: ...it's more permeation of light so that we don't care about the ambient light to be able to perceive.
FLATOW: There you go. That's it. Another question I have, going in the opposite direction, how about a thought for using this technology to train people about things that they're doing incorrectly. Let's say, I want to learn yoga or I want to learn my tennis stroke better, is that - now your scanner is looking at my position, my yoga position and saying, you know, your back is arched a little too much or change that sort of thing.
Because you have real time, you know, why - you know, I do a little demonstration of my tennis service. Say, oh, your elbow is on the wrong spot. Could you make that kind of thing? And Katherine, do you think that's the kind of thing you might want to see going in the other direction, teaching us how to do something with our bodies?
ISBISTER: Absolutely. I think if you look at the self-help industry, they're really eager to make that work, doing things like biofeedback...
ISBISTER: ...and physical posture stuff.
ISBISTER: I think it would be great.
FLATOW: Shannon, anything in the works about something like this?
LOFTIS: So the great news, there are already products that are available on the market that do this. There's a launched game called "Your Shape: Fitness Evolved" that is 10 times better than going to a workout class, because if I go to a class, a yoga class, say, and there are 20 people there, I get maybe 1/20 of the instructor's attention, feedback on whether or not I'm doing things right. But because Kinect is continually tracking me and continually giving me feedback on my positions and my poses, I've become much better and much stronger at yoga.
The very first, earliest (unintelligible) demos that I did for Kinect, we had a professional volleyball player come in, and she was in rehab. And her immediate response was, ah, I wish I had this for physical therapy.
FLATOW: Are you going to encourage open source games to be made or other utilities to go with it?
KIPMAN: I would say the console market is a closed ecosystem.
KIPMAN: It's a closed ecosystem by design because we do need to control the experience. There's no installs. There's nothing ever that goes wrong. So it is a managed ecosystem. So on the console, that will remain this way. Although we do have, have had and will continue to have free tools that people can download - it's called XNA Studio - which allows you to create applications that you can deploy to the Xbox 360 that you can actually do. There's no Kinect support through XNA today, but that's something that we will support in the future.
FLATOW: Or now that the USB port is open in your PC to create open source games or tools that you can use.
KIPMAN: Yes, on...
KIPMAN: I would say on PC. People are already doing it today. I get YouTube videos on an hourly basis of people doing cool, neat, creative experiences based on using Kinect on PC.
FLATOW: What about - a lot of games - movement games today, they have us performing real world movements like bowling or throwing things like - as we've been talking. What about designing games with unusual movements that humans cannot or usually don't do? Is that too weird a kind of thing, you know, to think about?
LOFTIS: I think it's actually - it depends on the sort of experience that you want to provide to people. The combination of human body input and whatever you have coming out allows you to create a fantasy or an illusion or tell any story that you really want to tell.
A good example is "Kinect Adventures," which is the game that ships with the Kinect device and the Xbox 360. On "Kinect Adventures," in one of the activities, you're actually floating in outer space and popping bubbles in zero gravity. In another activity, you're standing up in a raft as you ride down a raging river. These are not real world scenarios, but they are sort of great stories to tell.
FLATOW: Katherine, any...
ISBISTER: Yeah. I think that also if you think about the emotional components of movement, I think about things like modern dance, or just how little children like to express themselves through how they move or skip around the room, there's a lot of leeway for what you might call abstracted movement games or activities where it's not necessarily mimicking a real world motion or action, but you're giving people an excuse to move in a way that changes how they feel that may be even more abstract than bubbles. You know what I mean? Just be little swishes of color and sound and...
ISBISTER: ...be all about the emotional experience that happens as a result.
FLATOW: And the game or whatever you're interacting detects your emotional state.
ISBISTER: Exactly. Yeah, we're actually working with a woman who's analyzing some video we did of people playing Wii games. And she feels like - she's a certified Laban movement analyst, which is a style of dance notation. And she's pretty sure that she can detect when people are frustrated with the movement they're doing from their posture. So we're trying to identify what those cues are because our thinking is, well, wouldn't that be great if we could feed that back into systems like this and detect when somebody's kind of had it with the game mechanic from their posture and how they actually execute the move.
FLATOW: Well, we'll see - there are all kinds of possibilities now. The game is out and available. Correct, Shannon?
LOFTIS: That's correct. It's launched all over the world by now, yeah.
FLATOW: And with the ports open on the PC, anything can happen. We'll keep track of what's happening. I want to thank both of you for taking time to be with us today. And also here in the studio, thank you for joining us.
KIPMAN: Thanks for having us.
FLATOW: You're welcome.
FLATOW: Katherine Isbister, professor of computer science and digital media at New York University's Polytechnic Institute in Brooklyn. Shannon Loftis, studio manager at Microsoft Game Studios and Good Science Studio in Redmond. Alex Kipman is director of incubation for Xbox at Microsoft in Redmond. And thank you again for taking time to be with us today.
NPR transcripts are created on a rush deadline by an NPR contractor. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.