ROBERT SIEGEL, Host:

Let's imagine that we had a database consisting of the telephone records of tens of millions of Americans. What could we learn from it? How could we find some useful needle in such a colossal haystack of data? We're going to ask Steven Bellovin, who's a computer science professor at Columbia University.

Welcome to the program, professor.

STEVEN BELLOVIN: Thank you.

SIEGEL: Let's assume that we don't know the content of any of those tens of millions of phone calls, just which number is calling or being called by which other numbers. What can we do with all that information?

BELLOVIN: This is a branch of intelligence called traffic analysis. It's been historically extremely valuable dating back at least to World War II, but let's start with the obvious. If you have a suspected terrorist, any party that this person calls you might want to investigate to see if they are also possibly involved in terrorism. You could look at the patterns of calls.

SIEGEL: Is this something that is useful as a preventive measure, or is it more once you find something out otherwise that you would figure out who might be associated with either your suspect or somebody who actually committed a terrorist act?

BELLOVIN: You could go with this both ways. Suppose you've got two parties, in cryptography we usually talk about Alice and Bob, and Alice is calling Bob once a week for five minutes, and suddenly she's calling Bob three times a day for half an hour. The pattern has suddenly changed. You might wonder if something has come up. If there's some operational planning in the works. There's plenty of historical evidence for that to be militarily effective going back to World War II, which is the best documented cases of that.
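
The shift Bellovin describes, from one short call a week to several long calls a day, can be illustrated with a minimal Python sketch. The record layout `(caller, callee, week, minutes)`, the function names, and the `ratio` threshold are all assumptions made for illustration; nothing in the interview says how any real system works.

```python
from collections import defaultdict
from statistics import mean


def weekly_minutes(records, caller, callee):
    """Total call minutes per week for one caller -> callee pair.

    records: iterable of (caller, callee, week_index, minutes) tuples
    (a hypothetical layout chosen for this example).
    """
    totals = defaultdict(float)
    for src, dst, week, minutes in records:
        if src == caller and dst == callee:
            totals[week] += minutes
    return totals


def pattern_changed(totals, current_week, ratio=5.0):
    """Flag the current week if its minutes exceed `ratio` times
    the average of all earlier weeks. The threshold is arbitrary."""
    history = [m for week, m in totals.items() if week < current_week]
    if not history:
        return False  # no baseline to compare against
    return totals.get(current_week, 0.0) > ratio * mean(history)


# Alice calls Bob once a week for five minutes for four weeks,
# then three times a day for half an hour during week 4.
records = [("alice", "bob", week, 5) for week in range(4)]
records += [("alice", "bob", 4, 30)] * 21
# Alice's calls to Carol stay steady throughout.
records += [("alice", "carol", week, 10) for week in range(5)]

bob_flag = pattern_changed(weekly_minutes(records, "alice", "bob"), 4)
carol_flag = pattern_changed(weekly_minutes(records, "alice", "carol"), 4)
```

Here only the Alice-to-Bob pair is flagged. A fixed ratio against a simple average is the crudest possible baseline; it is used only to make the idea concrete.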

SIEGEL: But in these years long after World War II, Alice might very well have a throwaway cell phone, and Bob might have several different cell phones that he uses.

BELLOVIN: These are techniques to evade traffic analysis because the bad guys know all about this too. One of the things you can do is try to pick things out, pick out the patterns with data mining. Think of the calls you make from your cell phone, think if there's anybody else on the planet who calls the same set of numbers? Probably not. You call those people close to you, but you don't call yourself, they call you. There is technology out there to find, to identify a person through millions and billions of calling records even after they've changed their numbers.
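
The idea that a person's set of contacts is nearly unique, even after a number change, can be sketched as a set-overlap comparison. This example uses Jaccard similarity, which is a choice made here for illustration and not a technique named in the interview; the record layout `(caller, callee)`, the function names, and the sample numbers are likewise hypothetical.

```python
def contact_set(records, number):
    """Everyone `number` called or was called by.

    records: iterable of (caller, callee) pairs (hypothetical layout).
    Incoming calls count too, since close contacts also call you.
    """
    contacts = set()
    for src, dst in records:
        if src == number:
            contacts.add(dst)
        elif dst == number:
            contacts.add(src)
    return contacts


def jaccard(a, b):
    """Size of the overlap divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(old_records, new_records, old_number, candidates):
    """Return (score, number) for the candidate whose contacts most
    resemble those of `old_number`. Expects at least one candidate."""
    target = contact_set(old_records, old_number)
    scored = [(jaccard(target, contact_set(new_records, c)), c) for c in candidates]
    return max(scored)


# Records from before the number change.
old_records = [
    ("555-0100", "mom"),
    ("555-0100", "work"),
    ("friend", "555-0100"),
    ("555-0100", "pizza"),
]
# Records from after: the same person now uses 555-0199.
new_records = [
    ("555-0199", "mom"),
    ("555-0199", "work"),
    ("friend", "555-0199"),
    ("555-0142", "pizza"),
    ("555-0142", "dentist"),
]

score, number = best_match(old_records, new_records, "555-0100", ["555-0199", "555-0142"])
```

In this toy data the new number 555-0199 shares three of the four original contacts, so it scores far higher than the unrelated 555-0142.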

SIEGEL: Let's continue with Alice and Bob for a moment. Alice has been identified as someone who traveled many times to a country that we associate with terrorists, let's say. What can we actually find out by going to a database of the kind that we've now heard reported?

BELLOVIN: What you're finding is hints. Who is Alice calling? You start wondering when you find Alice calling somebody else who's on your suspect list for some reason, and you just start building up more and more associations that way. It's, you're finding very thin strands, you get enough of them, you can form a strong rope. And the question is, how many strands does it take to make a rope?
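
The "strands into a rope" picture can be made concrete with a small sketch that counts how many distinct suspect-list numbers a given number has exchanged calls with. The record layout, the names, and the `min_strands` threshold are assumptions for illustration only; the interview leaves open how many strands are enough.

```python
def strands(records, number, suspects):
    """Distinct suspect-list numbers that `number` has called or been called by.

    records: iterable of (caller, callee) pairs (hypothetical layout).
    """
    linked = set()
    for src, dst in records:
        if src == number and dst in suspects:
            linked.add(dst)
        elif dst == number and src in suspects:
            linked.add(src)
    return linked


def is_rope(records, number, suspects, min_strands=3):
    """True if the number has at least `min_strands` distinct links.
    The threshold is arbitrary and chosen only for this example."""
    return len(strands(records, number, suspects)) >= min_strands


suspects = {"s1", "s2", "s3"}
records = [
    ("alice", "s1"),
    ("s2", "alice"),
    ("alice", "s1"),      # repeat call, still a single strand
    ("alice", "grocer"),
    ("zebulon", "s3"),
]

alice_links = strands(records, "alice", suspects)
alice_rope = is_rope(records, "alice", suspects, min_strands=2)
zebulon_rope = is_rope(records, "zebulon", suspects, min_strands=2)
```

A count like this is only a hint: it says nothing about why the calls were made, and a widely called number on the list would link many unrelated people.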

SIEGEL: Is there a point at which chance and complexity simply gets the better of reason? That is that ultimately Alice and Zebulon are going to call the same person. They're both going to get harassed by the same catalog company that's calling them up every week, and you'll find connections that really don't mean anything--they just happen to be coincidental.

BELLOVIN: It certainly can happen. You know, this is not proof beyond a reasonable doubt; this is not courtroom evidence. It's the kind of thing that you've got to treat with great care. And quite apart from this chance connection you're talking about, you do have people actively trying to deceive you, making calls just for the purpose of creating false patterns, for example.

SIEGEL: Well, Professor Bellovin, thank you very much for talking with us today.

BELLOVIN: Quite welcome, glad to be here.

SIEGEL: That's Professor Steven Bellovin, who teaches computer science at Columbia University.

Copyright © 2006 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.

NPR transcripts are created on a rush deadline by a contractor for NPR, and accuracy and availability may vary. This text may not be in its final form and may be updated or revised in the future. Please be aware that the authoritative record of NPR’s programming is the audio.
