ROBERT SIEGEL, host:
(Soundbite of music)
Question in the form of an answer: This music is from a hugely popular television game show. Answer, in the form of a question: What is "Jeopardy"? We play the music in honor of the news that IBM is designing an advanced question-answering system - I've interviewed people who fit that description - to compete with humans on "Jeopardy."
David Ferruci is the project director and an IBM artificial intelligence researcher in Yorktown Heights, New York. He joins us from his office.
Mr. DAVID FERRUCI (Project Director and Artificial Intelligence Researcher, IBM): Hi.
SIEGEL: How close are you to a computerized "Jeopardy" competitor?
Mr. FERRUCI: Well, I think we're pretty close. We've been working on it for a couple of years at this point. There's a lot more to do, but we think that it will be, you know, competitive against the grand champions.
SIEGEL: How does playing "Jeopardy" compare in difficulty with, say, playing chess, which would seem to be a much more predictable universe that you're trying to work within?
Mr. FERRUCI: Well, you're exactly right. I mean, chess is a lot more predictable, there's a finite number of moves, albeit, you know, very many. With something like "Jeopardy," it's just a huge, huge domain of possibilities. And, you know, it presents a very different sort of challenge for the computer.
SIEGEL: What kind of a database will your system - which I gather is named Watson for Thomas Watson of IBM - what kind of a database will it draw on for its answers?
Mr. FERRUCI: Well, it's a large database comprised mostly of natural language corpora. So, different sorts of books, reference material, many, many texts, news information, such a wide variety of documents. And the computer program is challenged to understand the question that's being asked, as well as the content that it's read and stored in its memories.
SIEGEL: I was thinking what is "Jeopardy" in the introduction. And I imagine the category in which that would be one of the clues. I made up the category: par for the course. And the other clues would be: unreasonable fear, board game based on one from India, topping for spaghetti and Athenian landmark.
Mr. FERRUCI: Wow, great clue. I don't know the answer. Maybe Watson might be able to get it.
(Soundbite of laughter)
SIEGEL: Well, you see, but there's a P-A-R in each of those answers, par.
Mr. FERRUCI: That's right.
SIEGEL: So, would Watson be able just to find words with - figure out that's where the joke is, or does he need Alex Trebek to say, as you can imagine, all of these answers will have P-A-R in them?
Mr. FERRUCI: In that particular case, it would not make the difference. But as it starts to see other answers revealed, it will look for patterns there as well. So, it may discover, you know, maybe starting out with not being confident enough to answer one of those questions. It may look at the answers coming out. And then, for example, say, okay, I'm starting to see a pattern in these other answers and then figure it out.
SIEGEL: So, it could take on board that it hears paranoia from one and parmesan cheese from the other and it can figure out Parcheesi based on that?
Mr. FERRUCI: Yeah, it can, correct. There's some expectation that there would be repeated patterns in the categories or the answers so that it would be able to figure it out. So, it will not rely on understanding new procedures and clues that Alex would explain. It would only rely on having trained over the existing data and over the language and its databases.
SIEGEL: I mean, the challenge here, given the idea that, you know, these are kind of trivia questions, but they're drawn from every kind of information we gather throughout life, is really to equip an, as you said, answering system with a phenomenal amount of information.
Mr. FERRUCI: Well, that's right. I mean, I think one of the classic challenges in this whole field of automatic question answering is this notion of open domain or broad knowledge where, you know, you're not told ahead of time, you know, it's just going to be about this topic. So, that's clearly one of the important aspects of the particular "Jeopardy" challenge.
I mean, that, the high levels of precision that are required, the speed and, of course, the confidence. And dealing with that breadth of domain is an interesting aspect of this. And it has, I think, driven us to generalize, essentially, our solution to this problem.
SIEGEL: Ken Jennings…
Mr. FERRUCI: Right.
SIEGEL: …biggest winner ever in "Jeopardy."
Mr. FERRUCI: Yes.
SIEGEL: When are you going to be ready to take him head to head?
Mr. FERRUCI: Well, we're not ready right now. There's work to do. But we think we'll get there.
(Soundbite of laughter)
SIEGEL: Were you thinking months or were you thinking years here?
Mr. FERRUCI: Oh, we - you know, I hope that we'll be ready to do something like that within a year, maybe a little bit longer, but hopefully within a year.
SIEGEL: Well, we look forward to it.
Mr. FERRUCI: Appreciate that.
SIEGEL: Thanks a lot for talking with us.
Mr. FERRUCI: Thank you.
SIEGEL: It's David Ferruci of IBM, who's the project director in developing Watson, an advanced question answering system that he says is perhaps a year away from competing with human contestants on "Jeopardy."