Search Engine Records and How They Can Be Used

Danny Sullivan, editor of SearchEngineWatch.com, discusses what kind of information search engines keep about users' searches.

Copyright © 2006 NPR. For personal, noncommercial use only. See Terms of Use. For other uses, prior permission required.

MICHELE NORRIS, Host:

So, what kind of information do search engines keep about all your searches? To find out, we turn to Danny Sullivan. He's the editor of SearchEngineWatch.com. And Danny, before we begin, could you tell us exactly what Search Engine Watch does?

DANNY SULLIVAN: We track the search engine industry. We write about how search engines operate to educate both those who want to use them to locate information as well as marketers who are trying to be found on them.

NORRIS: And so you'll be able to answer that question for us then. What kind of information do search engines keep about the people who use them?

SULLIVAN: To some degree, it depends on how you interact with them. If you simply go to Google, for example, each day and have never signed up for any of their services, they will know that you came from a particular computer, a particular IP address, sort of your Internet telephone number, and they may be able to, over time, know that you made certain kinds of searches. But they don't know who you are personally or anything like that.

If you sign up for one of their services (for example, they have a personalized search service), they'll start keeping a record of what you've searched for, and that would be associated with your Google account, which would be associated with your name. And if you actually decided that you wanted to buy things from them, like the video they sell, then that would potentially be associated with your credit card number and your actual address, and they have a better idea of who you are in that way.

NORRIS: So are there differences between different search engines? Between Google and Yahoo, or Alta Vista, or any of, you know, there's so many search engines now, do they all keep the same kind of data?

SULLIVAN: They all seem to. I've not seen a breakdown in terms of, say, how long they keep the data. So it may be that some of them are destroying some of the data after a certain period of time and other ones may be keeping it longer. Google's had the most attention in this area, and they simply haven't seemed to have said we should go destroy the data, and that may change.

It may be that people are going to start demanding that they do that; it may be that we'll have laws that will come in. But you do get some complications as well. The Google personalized search actually is a very nice service. It can learn the things that you've been to over time and it starts changing the kinds of search results you see, and I found that it actually can make your search even better. So you, as the searcher, may want them to retain that data.

NORRIS: So you say they use this to perhaps tailor your searches. How else might they use that data?

SULLIVAN: The other way would be to deliver ads to you. Take Yahoo, for example. Say you've done a search on Yahoo, and then you go off and do other things. You're over at Yahoo Movies reading about some movie that's playing, but you had searched a day before for a new car. They might start showing you car ads because they know you are interested in cars based on your searching habits. So that's an example where they're targeting your search behavior maybe well after the fact of when you actually did that original search.

NORRIS: Well, before people learned about this Justice Department subpoena, do you think most people knew that the search engines were compiling this kind of data?

SULLIVAN: No, I think people are still largely not aware of it, and I don't know how much of a concern it will be for some people as well. This type of thing raises exactly that kind of awareness, because you don't really get any visibility into the search behavior that you're doing, and that awareness needs to rise.

I mean people do need to be more aware of it because they have these conversations with search engines, very private conversations about things that they wouldn't tell a doctor, they might not tell a friend, they might not tell their spouses, but they are almost confessing to the search engine.

So you want to be aware of it and you want to be aware of it not just with what they are doing on the search engine, but even to the point that there's a record of what you're searching for on your own computer or even within your own browser. So, there are reasons why you may want to be protecting your privacy that way as well.

NORRIS: Danny Sullivan, thanks so much.

SULLIVAN: You're welcome.

BLOCK: Danny Sullivan is the editor of SearchEngineWatch.com.


NPR transcripts are created on a rush deadline by a contractor for NPR, and accuracy and availability may vary. This text may not be in its final form and may be updated or revised in the future. Please be aware that the authoritative record of NPR's programming is the audio.

Google Records Subpoena Raises Privacy Fears


The Justice Department has subpoenaed records on billions of Web queries made by users of Google and other Internet search engines, raising new privacy concerns.

Torsten Silz/AFP/Getty Images

The Justice Department has requested records for millions of searches made on Google, AOL and other popular search engines in an effort to bolster its case for an online pornography law. The subpoena is for broad data on search habits, not personal information. But the request has raised alarms among industry observers and civil libertarians who wonder what kind of data search engines have about their users — and what other, more sensitive data the government may seek next.

Search Engines on Privacy

In their online privacy policies, the Web's four biggest search engines say that they automatically receive and record information on user searches, including browser type and language, computer IP addresses, unique cookie information, and the URL of the page requested.
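To make that list concrete, here is a minimal sketch in Python of what a single logged search might look like. The record layout, the field names and the use of a "q" parameter for the search terms are all assumptions made for illustration. None of the four companies has published its log format, and this is not a description of any of them.

```python
from dataclasses import dataclass
from urllib.parse import urlparse, parse_qs


@dataclass
class SearchLogEntry:
    """Hypothetical log record; field names are illustrative only."""
    ip_address: str   # the computer's Internet address
    browser: str      # browser type reported by the user's software
    language: str     # browser language setting
    cookie_id: str    # unique identifier stored in the user's cookie
    url: str          # URL of the page requested

    def search_terms(self) -> str:
        # Assumption: the search terms travel in a "q" query parameter.
        params = parse_qs(urlparse(self.url).query)
        return params.get("q", [""])[0]


entry = SearchLogEntry(
    ip_address="192.0.2.10",       # address reserved for documentation
    browser="ExampleBrowser/1.0",  # made-up browser name
    language="en-US",
    cookie_id="abc123",            # made-up cookie value
    url="http://search.example.com/search?q=new+car",
)
print(entry.search_terms())
```

Even without a name attached, a record like this ties the search terms to an address and a cookie identifier, which is why the requested URL alone can reveal what a user was looking for.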

In foreign countries, technology companies say they must follow the laws, regulations, customs and norms of the country in which they are operating. For example, in China, Yahoo and Google routinely exclude sensitive political or religious information from searches. Microsoft's MSN has also been accused of blocking Web logs based in China that contained material sensitive to the Chinese government.

Below are excerpts from the search engines' privacy policies regarding the disclosure of information for legal purposes:

Google: "Google does comply with valid legal process, such as search warrants, court orders, or subpoenas seeking personal information. These same processes apply to all law-abiding companies. As has always been the case, the primary protections you have against intrusions by the government are the laws that apply to where you live."

Yahoo: "We respond to subpoenas, court orders, or legal process, or to establish or exercise our legal rights or defend against legal claims; We believe it is necessary to share information in order to investigate, prevent, or take action regarding illegal activities, suspected fraud, situations involving potential threats to the physical safety of any person, violations of Yahoo!'s terms of use, or as otherwise required by law."

MSN: "We may access and/or disclose your personal information if we believe such action is necessary to: (a) comply with the law or legal process served on Microsoft; (b) protect and defend the rights or property of Microsoft (including the enforcement of our agreements); or (c) act in urgent circumstances to protect the personal safety of users of Microsoft services or members of the public."

AOL: "The contents of your online communications, as well as other information about you as an AOL Network user, may be accessed and disclosed in response to legal process (for example, a court order, search warrant or subpoena); in other circumstances in which AOL believes the AOL Network is being used in the commission of a crime; when we have a good faith belief that there is an emergency that poses a threat to the safety of you or another person; or when necessary either to protect the rights or property of AOL, the AOL Network or its affiliated providers, or for us to render the service you have requested."

Google, Microsoft's MSN, Yahoo and AOL received subpoenas for a random sampling of millions of Internet addresses cataloged in their databases, as well as for records for potentially billions of searches made over a one-week period. Only Google refused to comply. The Justice Department wants to use the data to support its argument that Web-filtering software doesn't work.

"The reason they're asking for the data is that they want to be able to say, 'Look, this is how much porn is potentially reached online,'" says Danny Sullivan, editor of Search Engine Watch, an industry newsletter. "But next time, they might come in and ask for data that does contain personal information. That serves as a wake-up call for people."

A 'Honey Pot' of User Information

The request was part of the government's effort to uphold the Child Online Protection Act (COPA). The 1998 law requires online distributors of "material harmful to minors" to prevent minors from accessing their sites. Civil liberties groups argued that COPA also restricts protected free speech. Courts have blocked the law from taking effect.

Subpoenas for the search information were issued last year. Representatives from MSN, AOL and Yahoo said their companies chose to comply only after ensuring that the information did not violate the privacy of their users. Google opted to fight the request. This week, the Justice Department asked a federal judge to force Google to hand over the information.

In refusing to comply with the subpoena, Google cited concerns over the privacy of its users and the protection of its trade secrets. The Electronic Frontier Foundation, a civil liberties group, applauds Google's defiance, but notes that the government's request brings new focus on the type of information that search engines collect on their users.

"All the search engines have created a honey pot of information about people and what they search for," says Kurt Opsahl, a staff attorney for the group. "It's a window into their personalities — what they want, what they dream about. This information gets stored, and that becomes very tempting."

What Search Engines Know About You

Search Engine Watch's Sullivan says that people might be surprised to learn how much information search engines store about their users. At a basic level, search engines retain a record of the Web sites users visit and the search terms they use. "Cookies" (small text files that a Web server stores on a user's hard drive) help search engines keep a record of their customers' Web habits to personalize their searches and to deliver targeted advertising. Yahoo's cookie expires in June 2006. The cookie used by Google lasts until 2036.
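The expiration dates matter because a browser keeps sending a cookie back to the site until that date passes. The short Python sketch below shows the idea using the standard library. The cookie name, its value and the exact dates are hypothetical, with only the years taken from the figures above, and the check is a simplified illustration rather than how any particular browser or search engine works.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from http.cookies import SimpleCookie

# Build the header a server might send to set a long-lived cookie.
# "visitor_id", "abc123" and the exact date are made up for illustration.
cookie = SimpleCookie()
cookie["visitor_id"] = "abc123"
cookie["visitor_id"]["expires"] = "Tue, 01 Jan 2036 00:00:00 GMT"
header = cookie.output()


def cookie_is_live(expires_text: str, now: datetime) -> bool:
    """Return True while the expiration date is still in the future."""
    return now < parsedate_to_datetime(expires_text)


# A visit in January 2006 falls well inside the cookie's lifetime.
checked_at = datetime(2006, 1, 20, tzinfo=timezone.utc)
live = cookie_is_live(cookie["visitor_id"]["expires"], checked_at)
print(header)
print(live)
```

Under these assumptions, a cookie set to lapse in mid-2006 would stop identifying the browser within months, while one dated 2036 would keep doing so for decades unless the user deletes it.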

Search engines that offer e-mail services — such as Yahoo Mail or Google's Gmail — retain whatever personal information users are required to enter when opening an e-mail account, Sullivan notes. The same holds true for anyone who signs in when using a personalized homepage from Google or Yahoo. Whatever information you provide when signing in could be linked to your search history.

And customers who buy services from a search engine might also be leaving their credit card information behind. "Technically, they can use that to find out who you are," Sullivan says.

Anonymous Browsing

Technology to help Web users protect their privacy is available. Software such as Tor and Anonymizer hides a user's IP address (the string of numbers that identifies a user's computer) from search engines by routing search requests through a maze of servers.

Tor is a free service sponsored by the Electronic Frontier Foundation; it routes your Web browsing through a variety of servers, camouflaging where the traffic originated. Anonymizer offers limited free browsing and also sells software packages starting at $29.99. Technology experts say both services can result in slower surfing, and they note that some Web sites block anonymous browsing.

A Novel Request

Search engines receive requests for specific information on users several times a day, in both criminal probes and civil lawsuits. The Justice Department's request is different in that the information being sought is a large body of data, rather than information related to a specific individual.

While that may have allayed the privacy worries of search engines that chose to comply with the subpoenas, privacy advocates worry that personal information might be part of the search data itself. They note that people often perform vanity searches for their names, their home addresses or other personal information.

The Electronic Frontier Foundation's Opsahl says his group has urged search engines to limit the amount of information they gather about users and the length of time they store that data. "They have listened to us but there's a lot of resistance," he says.
