IRA FLATOW, host:
If you're assuming that your dealings on the World Wide Web are private, think again. Last week, 650,000 AOL users found out the hard way when the company posted a three-month span of their search results online. The company said their data release was a mistake and removed it. It also said individual users couldn't be identified because the searches weren't tagged with names or screen names, just anonymous ID numbers. But reporters at The New York Times were able to trace one user's search terms back to a woman in Georgia. So much for anonymity.
Well, if you're like me, you are now wondering just how much of your personal dealings on the Internet are open to the public. Of course we all assume, or should assume, that everything we do on the Internet should be treated like talking in an elevator: people are going to be listening. But there must be some way to keep some modicum of privacy, if nothing more than the privacy of our Web searches.
Joining us now is a privacy expert with some suggestions. If you'd like to talk to him, our number is 1-800-989-8255. 1-800-989-TALK. Kevin Bankston is a staff attorney at the Electronic Frontier Foundation in San Francisco. Welcome to SCIENCE FRIDAY.
Mr. KEVIN BANKSTON (Staff Attorney, Electronic Frontier Foundation): Thank you, Ira. And I just wanted to reassure your listeners that so long as they never search for anything even remotely embarrassing on the Internet, they've got nothing to worry about.
FLATOW: Well, I guess one person's embarrassing is another's treasure.
(Soundbite of laughter)
FLATOW: Tell us what you mean by that.
Mr. BANKSTON: Well, as you mentioned, a lot of people don't realize that search-engine companies do log your search queries. They have a record of what you searched for, even those most embarrassing late-night searches of whatever you want to, you know, look up on the Internet.
FLATOW: How do they know that? Give us a little bit of an insider's look on that.
Mr. BANKSTON: An insider's look at?
FLATOW: How they know where you're going, what you're doing, what you're searching for?
Mr. BANKSTON: Oh, okay. Well, all the search engines log your search queries tied to your Internet protocol address, which is sort of like your phone number on the Internet, as well as to cookies, which are little spy files that they plant on your computer so that they can recognize you each time you come back. And the cookies typically don't identify you directly but they do allow them to tie together your search history.
For example, in the AOL disclosure we have a bunch of logs of individual users and their history of search over a three-month period of time. And so even though the logs are not directly linked to their identity, like their screen name or their name or address or billing information, putting all the pieces together makes you identifiable - can make you identifiable. And it's worth noting that you mentioned - The New York Times had identified a woman - Ms. Thelma Arnold, 62-year-old widow in Lilburn, Georgia.
Washington Post in a story yesterday just identified another person, JoAnn Whitman, a 55-year-old retired grocery store worker in Colorado, who had accidentally cut and pasted a store receipt with her name and address into the search box. So sometimes, as in that example, you don't even need the full history of someone to figure out their identity. It may have been revealed in a single search, and we've actually seen quite a few of those in looking at that data - search strings that in one search identify a person and give their address and phone number and sometimes driver's license and Social Security number. Those are probably not the person who was searching, but they still give a lot of information about some individual.
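A minimal sketch of the linking described above, assuming a hypothetical log format of (anonymous ID, query) pairs. The IDs, queries, and function names are invented for illustration and are not taken from the AOL data. It shows how queries that say little on their own can point to one place or person once they are grouped under a shared ID.

```python
from collections import defaultdict

# Hypothetical log rows: (anonymous_id, query). All values are invented.
LOG = [
    (1001, "landscapers in springfield"),
    (2002, "cheap flights to denver"),
    (1001, "homes sold on maple street springfield"),
    (1001, "dr. example family dentist springfield"),
    (2002, "denver hotel deals"),
]


def histories_by_id(rows):
    """Group queries under the anonymous ID they were logged with."""
    histories = defaultdict(list)
    for anon_id, query in rows:
        histories[anon_id].append(query)
    return dict(histories)


def ids_mentioning(histories, term):
    """Return IDs whose history mentions the term in at least two queries."""
    return sorted(
        anon_id
        for anon_id, queries in histories.items()
        if sum(term in q for q in queries) >= 2
    )


histories = histories_by_id(LOG)
print(histories[1001])                           # one user's combined history
print(ids_mentioning(histories, "springfield"))  # IDs that keep naming one town
```

No single query here names the searcher, but the combined history for one ID narrows things to a town, a street, and a dentist, which is the puzzle-piece effect described in the interview.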
FLATOW: Let's talk about then ways that we can prevent, or at least try to prevent, our searches from being traced. What are some of the dos and don'ts?
Mr. BANKSTON: Well, one of the first big rules is don't use search services from someone else that you get other services through that you've registered for. So, for example, if you're an AOL user - they know who you are when you connect to the Internet, and if you're searching using their default search option - for example, their software that you download - that is easily linkable to your real identity, your AOL screen name, et cetera.
Similarly, if you use both Yahoo Mail and Yahoo Search, they could link your search history to your e-mail account. And so the first big piece of advice is: segregate your search activity from the other companies you get services from. And do not log in if you are going to use those companies' services because you are identifying yourself when you do so.
So, for example, at Google, you can log in, create a Google account to get Gmail or to personalize your Web site or whatever, but if you're logged in they can link that account to your searches.
FLATOW: Hmm. Interesting. Okay, rule number two.
Mr. BANKSTON: Okay. Well, rule number two is make yourself untrackable. Keep them from being able to put together a history of your searches. And the way they can do this is through the accounts, like I mentioned. But also through cookies and also through your IP address.
So most browsers allow you at this point very fine-grain control of your cookies. And so ideally you would delete the cookies from your search provider after every use. That can seem like a lot of work. One of the easiest ways to accomplish that is most browsers have a checkbox in their preferences or their tools where you can say, delete cookies every time I close the browser.
Mr. BANKSTON: And so that way all of your searches are keyed to a different cookie. They'll have to place a new cookie in your computer each time you visit and so they can't link that history with one cookie.
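The effect of clearing cookies when the browser closes can be sketched with a toy model. The `SearchSite` class and `run_sessions` helper are hypothetical and only simulate the idea; real browsers and search sites are far more involved, and this sketch ignores IP addresses entirely.

```python
import itertools


class SearchSite:
    """Toy search site that tags each visitor with a cookie ID and logs queries."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.log = []  # list of (cookie_id, query)

    def search(self, cookie_jar, query):
        # Issue a new cookie only if the browser did not send one back.
        if "id" not in cookie_jar:
            cookie_jar["id"] = next(self._next_id)
        self.log.append((cookie_jar["id"], query))


def run_sessions(site, sessions, clear_on_close):
    """Run several browser sessions; optionally wipe cookies when each one closes."""
    jar = {}
    for queries in sessions:
        for query in queries:
            site.search(jar, query)
        if clear_on_close:
            jar.clear()


sessions = [["knee pain"], ["debt advice"], ["local florist"]]

keep = SearchSite()
run_sessions(keep, sessions, clear_on_close=False)

wipe = SearchSite()
run_sessions(wipe, sessions, clear_on_close=True)

print(len({cid for cid, _ in keep.log}))  # distinct cookie IDs when cookies persist
print(len({cid for cid, _ in wipe.log}))  # distinct cookie IDs when cookies are wiped
```

With persistent cookies every query lands under one ID; with cookies wiped at close, each session gets a fresh ID, so the site's log cannot join the sessions by cookie alone.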
FLATOW: But you may lose the passwords and things you've stored.
Mr. BANKSTON: That is true. Although, you know - again, with most browsers you can do pretty fine-grain controls. So, for example, you could have it automatically delete your Google cookies but not your Amazon cookies or whatever.
Mr. BANKSTON: So as a general security matter we recommend you learn your passwords. And so the other way they can link you, even if you're managing your cookies really well, is by IP address. Again, this is sort of like your phone number on the Internet, identifies what network you're coming from. If you're using your ISP's search service, they certainly know what your IP address is and your real identity.
Regardless, your ISP always knows what your IP address is. So, for example, if someone had a search log with an IP address attached, they could then go to the ISP with a subpoena and say, tell us who had this IP address at what time. And that way you could be linked. But even if they don't link it to your real identity, using IP address they can link your history over time. And again, by allowing all those puzzle pieces to be put together, that can identify you. Even if you don't put a lot of PII into any particular one search term, all the terms could be pulled together and identify you, as The New York Times and The Washington Post have demonstrated.
FLATOW: So how do you disguise your IP address then?
Mr. BANKSTON: Well, we've got a tool for that. EFF helped to fund a software tool called Tor. And what this is - the software folks call this a proxy service. This means that when you make a request over the Internet it looks like your request is coming from a different computer than it actually is, and so it's not your IP address that AOL or Google or whoever sees.
It's fairly easy to use software. There is a bit of a slowdown in your Internet service because you are routing through more computers than you would have. But this will mask your IP address. There's other anonymizing services. One's called Anonymizer at anonymizer.com.
But the Tor software, which is easy to use and install, is available at our Web site at tor.eff.org.
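The basic proxy idea described above can be sketched as a small simulation. This is a single-hop toy with hypothetical function names and invented documentation-range addresses. It does not model how Tor itself works, which relays traffic through several computers and adds encryption.

```python
def search_server(source_ip, query, log):
    """Toy search server: records the address the request arrived from."""
    log.append((source_ip, query))
    return f"results for {query!r}"


def make_proxy(proxy_ip, upstream):
    """Return a function that forwards requests using the proxy's own address."""

    def forward(client_ip, query, log):
        # The client's address stops here; upstream only sees proxy_ip.
        return upstream(proxy_ip, query, log)

    return forward


log = []
CLIENT_IP = "203.0.113.7"    # invented address from a documentation range
PROXY_IP = "198.51.100.20"   # invented address from a documentation range

search_server(CLIENT_IP, "direct query", log)
via_proxy = make_proxy(PROXY_IP, search_server)
via_proxy(CLIENT_IP, "proxied query", log)

print(log)
```

The direct request is logged under the client's address, while the proxied one is logged under the proxy's. Note that the proxy still learns the client's address in this sketch, so the client is trusting whoever runs it.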
FLATOW: Mm-hmm. What are some of the mistakes that people put in their search window that they should never do?
Mr. BANKSTON: Well, that's a hard question to answer because in your introductory comments you said you should treat everything online as public. And that's actually a trend that we want to stop. We want people to be able to use search engines privately.
The one thing that's clear when you look at these logs - and anyone in your audience can do that, unfortunately, by going to a site like aolsearchdatabase.com, where some people have mirrored the data and made it searchable.
You can just click the randomizer a few times and look at a few logs, and you'll see people treat their search engine like a confidante. People are asking the search engines basically for advice on the most intimate parts of their lives, thereby revealing medical ailments, financial difficulties, sexual preferences. Pretty much anything and everything you could imagine. We don't want...
FLATOW: There were some searches even had...
Mr. BANKSTON: I'm sorry.
FLATOW: How to kill my spouse on there, there were some listings for.
Mr. BANKSTON: There was one like that, although that brings up a good point. I think that a lot of people could be misperceived or these search terms could be misconstrued. I saw another example where someone was searching for similar, you know, how do I kill someone, looking for pictures of dead people, stuff like that. But there were other searches along with those searches making very clear that this was someone researching for a crime novel they were trying to write.
You know, for example, just because you search for al-Qaida training camp, that doesn't mean you want to go to one. It may mean you're just trying to read up on the news.
FLATOW: Well, let me ask you that question. If you put something like that in there, are you going to show up in a database at Langley, at the CIA someplace?
Mr. BANKSTON: Well, you know, God only knows. I wouldn't rule out the possibility. I think that if the law were followed, the government would need a wiretap order, which is sort of like a super search warrant, in order to monitor your search activity. But it's much easier for them to subpoena these stored logs.
Which brings us to the lesson of if they didn't keep this stuff, there'd never be a worry of leaks or the government coming and knocking for the logs or whatever. They could simply say, we don't have these. Which raises the question: why are these companies keeping this data and should they?
They say, generally, platitudinously, that they store these logs to give better service to you. And certainly there is some legitimate need for at least some limited storage for a short period of time so they can debug their system, make sure their search engine gives good results and things like that.
But those kinds of rationales that, you know, this data is needed to maintain the service, kind of fall apart when you're talking about a decade of search logs, as is the case with Google, who kind of proudly keeps everything. They have logs of every search since they were running Google out of the dorm rooms at Stanford.
AOL, ironically, although their security is clearly not adequate, as demonstrated by this leak, is actually pretty good on privacy in terms of how long they store stuff. They typically don't store stuff more than 30 days. As far as MSN or Yahoo, they won't even say how long they store it.
Which brings us to another issue, which is we don't have any real clear idea of what these companies are doing with this data, how they're using it, how long they're storing it. Yet those are exactly the things that users need to know before they can decide which engine they want to use and whether they're willing to put sensitive stuff into that search box.
FLATOW: Are there no regulations that control the keeping of these records?
Mr. BANKSTON: No. There's no clear law applying to your search logs. There is a law that we think does apply, the Electronic Communications Privacy Act of 1986, which is the same law that protects the privacy of the e-mail you store with Yahoo or whomever. Which generally requires that if the government wants to secretly get at that e-mail, they need to get a search warrant. And the provider can't disclose that e-mail to any non government party without your clear consent.
We think this law applies to search terms as well, but the government in a brief earlier this year in a case about Google records made clear that it doesn't think the law applies, while the search engines themselves waffle on the question. Because if it does apply, that saves them a lot in compliance costs because it requires a search warrant and they don't have any civil litigants coming at them. But it also adds potential liabilities if they accidentally disclose stuff, as AOL did.
And so I think what's really needed - and I hope that this news will spur this action - is for Congress to clarify that this 20-year-old law does indeed give strong legal privacy protection to your search terms.
FLATOW: You're listening to TALK OF THE NATION: SCIENCE FRIDAY from NPR News. Talking with Kevin Bankston, who is at the Electronic Frontier Foundation in San Francisco. He's a staff attorney. Our number is 1-800-989-8255. Let's go to Cal(ph) in Portland. Hi, Cal.
CAL (Caller): Hello. Good afternoon - or good morning still, depending on where you're at. First, I'd just like to plug eff.org. Fabulous organization. I'm a member. I run a Tor server at home in my basement, big supporter of the program. I think everyone should get online and learn how their liberties are threatened.
But I have a question. With all the back doors being built into stand-alone devices, communication devices in particular, what can we do about that? And in particular, Voice over IP. Are there any encryption tools available so that we can protect our private conversations when using Voice over IP? Seems like every other method of phone communication really isn't private anymore either. So that's my question, and I'll get my answer online. Thanks.
FLATOW: In fact, Skype, which has an encryption, it was broken recently, was it not?
Mr. BANKSTON: I don't know that it was broken. I think it's possible that it was. The problem with Skype, it is encrypted, but their code is not open source. That is, it's not open to inspection by anyone else. So no one else can independently verify whether it's strong encryption or not.
Back to the caller, though. First off, thanks for the kind words about EFF. Second, in terms of back doors, that's a big question and problem in terms of, one, are vendors building their equipment secure enough? Two, are vendors building back doors into their equipment for use by governments?
As to question two, there is already a fight in Washington right now over a law called the Communications Assistance for Law Enforcement Act or CALEA, which required the phone companies in the '90s to build their networks to be tappable. But it didn't, by its terms, apply to the Internet. But now the FBI is pushing the FCC, and the FCC has issued regulations to the effect of the Internet is covered and now all these network vendors have to start building tappable equipment.
We're fighting that. But it does raise a big question about how secure is the equipment that my communications are going over. And as the caller kind of suggested, the only real protection against that possibility is the use of encryption tools. And not tools that are provided by your ISP or anyone else who might have the key, but using an encryption such that only you have the keys to your encrypted messages.
A very popular communications encryption tool is PGP - stands for Pretty Good Privacy - and you can check out their tools at pgp.com. Our proxy network software Tor - which I mentioned earlier and is available at tor.eff.org - also does provide encryption for most of your communications journey.
FLATOW: We're going to have to stop it there and pick it up again next time. I want to thank you for taking time to be with us.
Mr. BANKSTON: Oh, it was my pleasure.
FLATOW: You're welcome. Kevin Bankston, staff attorney at the Electronic Frontier Foundation in San Francisco.
Short break. And then up next, from being watched on the Internet to being watched at the airport. Screening airline passengers for suspicious behavior. We'll talk about it. Stay with us after this break.
I'm Ira Flatow. This is TALK OF THE NATION: SCIENCE FRIDAY from NPR News.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.