Privacy At Stake As Sites Track Online Preferences

Sites like Bit.ly, which provide a service to users by shortening URLs, also get something in return — users' browsing preferences. Bit.ly's Hilary Mason talks about the services sites hope to provide by collecting such data, and the trade-off of less privacy for a more customized online experience.

Copyright © 2011 NPR. For personal, noncommercial use only. See Terms of Use. For other uses, prior permission required.

IRA FLATOW, host:

You're listening to SCIENCE FRIDAY from NPR. I'm Ira Flatow.

When you're browsing the Web or maybe looking through your Twitter feed, do you ever see those shortened little links, like TinyURL or Bit.ly? And did you ever click on them? Well, I'm sure you have.

But did you know that, at least in the case of those Bit.ly links, the company is tracking what you click on? And if you shorten a link, they're tracking your shortening of that, also, to figure out what you like.

Do you go to the techie stuff, or are you an obsessive reader of the New Yorker? Maybe you're a foodie, always creating and clicking on links to restaurant reviews, because with those links, they're building this profile about you.

But if you look at the site, Bit.ly, B-I-T-.-L-Y, there's no advertising there. There's no marketing. So what is this tracking all about? Can you opt out of the tracking if you want to? Or if you're okay with someone else looking over your shoulder as you're browsing the Internet, could companies use that information to make a better, more personal Internet someday?

We've had people tell us: I like having the ads because if I'm going to have ads on my website, better to be the ads for stuff that I like, because they know where I'm going, than for stuff that's just junk I'm not going to be interested in.

How do you feel about it? Our number, 1-800-989-8255. Hilary Mason, who is the lead scientist at Bit.ly, is here in New York with us. She's in our New York studios. Welcome to SCIENCE FRIDAY, Hilary.

Ms. HILARY MASON (Lead Scientist, Bit.ly): Thank you, Ira.

FLATOW: How did Bit.ly get started? What was the idea originally behind that?

Ms. MASON: So Bit.ly was actually started out of frustration. One of the engineers at Betaworks was trying to build a product around social chat, and he needed to have a way for people to share content within that product, and so he sort of hacked up Bit.ly in a day or two.

But it turned out that the product he was attempting to build was weak. People liked it, but it wasn't really useful. But Bit.ly was great. People loved it. They got it...

FLATOW: It started out just as a shortening device, right from the beginning.

Ms. MASON: Yes, right from the beginning. You put a long URL in, that's the string in your browser bar, and out comes a very short URL that you can share through email, instant message, on Twitter, Facebook, et cetera.

FLATOW: So originally, there was no marketing idea behind it, just a useful little tool?

Ms. MASON: It's the kind of thing that markets itself because it's so useful that when I use it, and you see me using it, you're going to say what is that, and you're going to go try it, and you're going to love it.

FLATOW: So it wasn't created, originally, to track people, where they went and what they did.

Ms. MASON: Not at all.

FLATOW: How did it morph into that?

Ms. MASON: I wouldn't even say that that's what Bit.ly is.

FLATOW: Tell us what Bit.ly is.

Ms. MASON: Okay, Bit.ly is a piece of infrastructure that helps people share and see content that they're interested in on the Internet. And it's only through that aggregate data, when we're able to see what hundreds of millions, or even billions, of people are sharing, that we're able to draw these pictures of what's interesting.

And that's where that data tracking comes in, but on an individual basis, we're really there to provide value for those users who want to share things.

FLATOW: So what do you do with all the information that you gain? Can you give me an idea of the physical structure, for a lack of a better word? When I click, or I shorten a URL, what does Bit.ly do? What happens on the other side?

Ms. MASON: So what we do is we create something called a hash for that URL, where we come up with a short code for it, and we generate a link for you. And if you happen to have created a short link for one of our partners, you'll actually get a branded domain. So if you shorten a New York Times article, you'll get an nyti.ms, with a few letters and numbers after that. And if not, you'll see a Bit.ly link.

And that's something you can share. Whenever anybody clicks on that, we actually track the number of clicks. So you can take any Bit.ly link in the world, anyone can do this, you can put a plus sign on the end, load that up in your browser, and you'll be able to see, in real time, people clicking on that link. We have a pretty graph that shows you, second by second, how many people have clicked that link.
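The flow described above (hash the long URL into a short code, pick a branded or default domain, count clicks, and show the count when a plus sign is appended) can be sketched roughly as follows. This is a minimal illustration, not Bit.ly's actual implementation: the `Shortener` class, the `short_code` helper, the choice of SHA-256 with base-62 encoding, and the six-character code length are all assumptions made for the example, and hash collisions are ignored.

```python
import hashlib
import string
from urllib.parse import urlparse

# 62 URL-safe symbols: digits, lowercase, uppercase (an assumed alphabet).
ALPHABET = string.digits + string.ascii_letters


def short_code(long_url: str, length: int = 6) -> str:
    """Derive a short code by hashing the URL and base-62 encoding part of the digest."""
    digest = hashlib.sha256(long_url.encode("utf-8")).digest()
    number = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(length):
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(chars)


class Shortener:
    """Hypothetical in-memory shortener with branded domains and click counts."""

    def __init__(self, default_domain="bit.ly", branded=None):
        self.default_domain = default_domain
        self.branded = branded or {}  # e.g. {"nytimes.com": "nyti.ms"}
        self.links = {}   # short code -> long URL
        self.clicks = {}  # short code -> number of clicks

    def shorten(self, long_url: str) -> str:
        code = short_code(long_url)
        # A real service would detect and resolve collisions here.
        self.links[code] = long_url
        self.clicks.setdefault(code, 0)
        host = urlparse(long_url).hostname or ""
        domain = self.default_domain
        for site, brand in self.branded.items():
            if host == site or host.endswith("." + site):
                domain = brand
        return f"http://{domain}/{code}"

    def resolve(self, path: str):
        """A plain code counts a click and returns the target; 'code+' returns stats only."""
        if path.endswith("+"):
            code = path[:-1]
            return {"long_url": self.links[code], "clicks": self.clicks[code]}
        self.clicks[path] += 1
        return self.links[path]
```

Note that the stats view reports only how many clicks a link received, which matches the point that the plus-sign page shows how many people clicked and not who they are.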

FLATOW: So if I put the plus on the end of the Bit.ly link, it'll take me to a new page that shows all the people clicking on that link?

Ms. MASON: It shows you how many. It doesn't show you who they are.

FLATOW: Wow. So if I do, you know, if we had sciencefriday.com and took the Bit.ly link and put the plus on there, we'd see all the people going to our website?

Ms. MASON: Absolutely. You can try it right now.

FLATOW: And what do you get out of this, meaning Bit.ly? What does Bit.ly get out of all this?

Ms. MASON: So what we get out of this is a pretty fascinating perspective on what's being shared online. And I like to use the metaphor of a restaurant.

So if you imagine there are lots of conversations happening in a restaurant, the best service in a restaurant, is the kind of service you don't notice at all, the waiter comes in and refills your glasses, brings you food when you're ready for it.

So what we're trying to do is to gather data about what people need in that sort of environment and to provide it, invisibly, as a service.

FLATOW: And then you alert advertisers about these things so you can put advertising on the website or...?

Ms. MASON: No.

FLATOW: There's no advertising at all? I mean, New York Times is getting something. As you said, they're getting their little...

Ms. MASON: So what we're able to show our partners is not just how the content that they're sharing is doing online, but how other people are sharing their content. So we're able to show you, as a media brand, this is how people are sharing content about you online that you probably didn't even know about.

FLATOW: Do the partners pay you a fee to become partners to Bit.ly?

Ms. MASON: We have a free Bit.ly Pro product, and then we have one that does have a fee.

FLATOW: What kind of information, exactly, is interesting to the people who are collecting data? I know that you're a data wonk, and...

Ms. MASON: Everything is interesting.

FLATOW: I say that in the most positive sense.

Ms. MASON: Absolutely.

FLATOW: Right, right. What kind of data is interesting?

Ms. MASON: So this morning, we were sending links around our office, where we have some tools, internally, where we're able to see what the topics are that are trending through our dataset.

And we were looking at these stories about Egypt and what's happening in Egypt, sort of emerging and bubbling up. We could see these videos on YouTube that were trending upwards. We could see different news articles. We could see facts propagating from one news article, or even from Twitter to a news article, to a blog to another news article, and this is fascinating. We're getting this perspective on the way people share data and information that we really haven't had before.

FLATOW: Can we see that same data? Is there a spot for us to look at all the trends in what's happening?

Ms. MASON: Well, we're a young company. Bit.ly is just over two years old. And I see our greatest challenge right now is to build those products so that people can take advantage of that data.

FLATOW: Is there a way for me to use your link but to shut off the data part? I don't want to send the data to you, but I want to shorten it.

Ms. MASON: So you can shorten links anonymously. You don't have to log in. And then we also have the option where you can opt out of us keeping track of anything you click on, and that's on our website.

FLATOW: It's on your website.

Ms. MASON: Yes.

FLATOW: Because I know as an extension plug-in to some browsers, a little side panel comes up and says, you know, shows you the shortening, and you press the - you click on copy, and it copies the Bit.ly piece. Is there a way on that extension to say don't send anything, just shorten it for me?

Ms. MASON: Yes, that's all the same thing, through our website. So we have this really cool sidebar. It lets you shorten any link on the Web. So you just click on the sidebar. It loads up on top of that page, and then it's essentially a little version of our website utilities right there. I think it's pretty cool.

FLATOW: So even if I'm anonymous, you do get my IP address, at least, right?

Ms. MASON: Yes.

FLATOW: So you just know what my - what part of the country I'm in and that sort of thing? What does the IP address tell you?

Ms. MASON: We don't ever expose IP addresses, and you can resolve an IP address in general to a non-specific location. We only expose that at the country level.

So if you go to that info page, if you add a plus, you'll be able to see, you know, this link got 400 clicks from the United States and 200 from the U.K. and three clicks from Germany, et cetera.
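The country-level reporting described here can be pictured as an aggregation step that turns individual click IP addresses into per-country totals and discards the addresses themselves. The sketch below is only an illustration under stated assumptions: `clicks_by_country` and the `ip_to_country` lookup table are hypothetical names, the addresses come from reserved documentation ranges, and a real service would consult a geolocation database instead of a hand-written dictionary.

```python
from collections import Counter


def clicks_by_country(click_ips, ip_to_country):
    """Count clicks per country; the returned summary contains no IP addresses."""
    counts = Counter()
    for ip in click_ips:
        # Addresses that cannot be resolved are grouped under "unknown".
        counts[ip_to_country.get(ip, "unknown")] += 1
    return dict(counts)


# Hypothetical click log and lookup table, for illustration only.
click_log = ["192.0.2.1", "192.0.2.2", "198.51.100.7", "203.0.113.9"]
lookup = {"192.0.2.1": "US", "192.0.2.2": "US", "198.51.100.7": "GB"}
summary = clicks_by_country(click_log, lookup)
```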

FLATOW: If you create a way to always show me stuff you think I'll like, wouldn't you actually be reinforcing this tendency that people have on the Internet to become more insulated within their own group?

Ms. MASON: I think we are reinforcing that tendency only if we do it wrong. So we can, if we know enough about somebody, we're able to recommend content to you, and we're able to see it coming in in real time. So we may know about something that you're interested in before you do, and times like this, when there's a really exciting news story going on, are great examples of that.

And so we're able to sort of build this model of what you like and send you things that we think you'll want to know right away, but we can also turn that on its head and say here's something we have no data about. Can we explore that a bit and see if you like that, too?

So if we do our job well, we'll do it right.

FLATOW: Are you in competition with Google and people who want to keep sending you data, you know, on your Google gmail page and things like that? Are they your competitors?

Ms. MASON: Google does have their own URL shortener. So I believe strictly, we are competitors in the URL-shortening market. But I think both companies are really about a lot more than just shortening URLs.

FLATOW: What ever happened to TinyURL? Was that the first one?

Ms. MASON: TinyURL was the first one. I believe their usage has dropped off a lot. They don't provide some of the metrics features we provide, and I don't know about their reliability.

FLATOW: Besides being the instantaneous world news collector that you talked about, with Egypt and Tunisia, whatever, are there general trends where people go most of the time, general topics they're mostly interested in?

Ms. MASON: Absolutely. We see about 27 percent of the content is around technology, which makes sense because it's content being shared on social media. About 20 percent is world news.

And then we have another category we had to label ourselves, that I like to call silly things on the Internet.

FLATOW: Really?

(Soundbite of laughter)

FLATOW: People going to silly places, on the...

Ms. MASON: Pictures of kittens with comics and funny logos. We see a lot of that.

FLATOW: Mm-hmm. And what about pornography?

Ms. MASON: We see a very small amount of pornography. Our data tends to reflect the things that people are comfortable sharing socially, and so while we do have some of that, it's not a big part of the site.

FLATOW: Does that mean they're not shortening the pornography sites so you can track it? Or are they just not going there?

Ms. MASON: It says nothing about where people are actually going. It just says what they're comfortable sharing.

FLATOW: By what you read as the Bit.ly shortened URL?

Ms. MASON: Yes. The bias in our data is things people are comfortable sharing.

FLATOW: You yourself, as working at Bit.ly. Do you block the sort of data gathering about yourself? Do you have a way that you can put a little bit of code so we don't know where Hilary - what Hilary is up to?

(Soundbite of laughter)

Ms. MASON: So I'm pretty interested in this philosophically. I don't actually, as a general policy, block any sort of cookies. I keep them all turned on, and that's because I'm willing to make the tradeoff that I let companies gather this information about me in return for a better experience.

But philosophically, I'm interested in this problem, so I have done a few quick hacks to sort of explore ways to prevent yourself from being located and tracked.

FLATOW: Mm-hmm. And are they available for any of us?

Ms. MASON: Yeah. All of this code is available on my Github account.

FLATOW: And your what?

Ms. MASON: So Github is a website for sharing code, and it's up there at - it's github.com/hmason. Or you can find it on my website, hilarymason.com.

FLATOW: There you have it. The little codes of - to keep you - just like Hilary.

(Soundbite of laughter)

FLATOW: What are some of those projects?

Ms. MASON: So the one that I think is most applicable here is - I was taking a lot of photos with my cell phone around my house and around where I worked and where I was going out with my friends. And I realized that the latitude and longitude of where I took those photos was actually embedded in the EXIF data. And when I make those photos public, somebody could sort of build up a profile over time of where I am.

FLATOW: Right.

Ms. MASON: And I found that to be slightly disturbing, because I'm making the choice to make the photo public without really thinking through that someone can plot on a map and figure out where I live and I work. So I have one script that just goes through the data, and the first option is just to delete it, which is very easy to do, and tools like Flickr give you the ability to do that with one click.

FLATOW: Mm-hmm.

Ms. MASON: But I thought if someone were really trying to find me, they'd expect me to delete that data. So how could I just really mess with them? So I wrote something that figures out which city you're in. It calculates the distance of the lat-long of your photos from the center of that city, and then, it picks some other random point and says it was there. So if somebody really does try and track you via that mechanism, they just get noise.
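One way to read the approach described above is: measure how far the photo's coordinates are from the city center, then report a different point at that same distance in a random direction. The sketch below follows that reading and is not the script from the GitHub account mentioned in the interview. The function names, the spherical-Earth radius, and the decision to preserve the distance are all assumptions, and the code works only on coordinates: it does not read or write EXIF data.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def destination(lat, lon, bearing_deg, distance_km):
    """Point reached by travelling distance_km from (lat, lon) along bearing_deg."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    lon2 = (math.degrees(lam2) + 540) % 360 - 180  # normalize to [-180, 180)
    return math.degrees(phi2), lon2


def scramble_location(photo_lat, photo_lon, city_lat, city_lon, rng=random):
    """Replace a photo's position with a random point equally far from the city center."""
    distance = haversine_km(city_lat, city_lon, photo_lat, photo_lon)
    bearing = rng.uniform(0.0, 360.0)
    return destination(city_lat, city_lon, bearing, distance)
```

Keeping the original distance means the fake location still looks plausible for the city, while the random bearing means repeated photos scatter around the center instead of clustering at one address.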

FLATOW: Is there an ultimate piece of code you'd like to write that you haven't? Because I could see that you really enjoy this.

(Soundbite of laughter)

Ms. MASON: Oh, I love it. Absolutely.

(Soundbite of laughter)

Ms. MASON: I think we're at an incredibly exciting time for working on these kinds of data problems and using this data to sort of improve our lives. And so I like to work on a lot of projects that help us live better lives through taking advantage of our data.

FLATOW: We're talking about the Internet with Hilary Mason of Bit.ly on SCIENCE FRIDAY from NPR. I'm Ira Flatow.

Let's go to the phones. Some folks have some interesting - hi, Dave. Dave in Richmond. Welcome to SCIENCE FRIDAY.

DAVE (Caller): Thank you. I have a question about malicious content and how Bit.ly or if Bit.ly takes any sort of active steps to protect its users from any kind of malicious content on the Web when, you know, you shorten those URLs, and you cannot see the domain name and things like that anymore. You know, what kind of steps Bit.ly is taking to protect its users from that kind of thing?

FLATOW: All right. Dave, thanks for calling.

Ms. MASON: That's actually a great question. We spend a lot of time and energy on making sure that all Bit.ly links are safe to click through. We work pretty closely with several partners where we get the standard blacklists of known malicious content on the Internet. What's interesting about our data is that it's too fresh, like it's coming in as it's created.

So we have this window of time where we actually have links that could be malicious where these other services haven't blocked them yet. So we've developed our own in-house technology that can find the malicious link, figure out what makes that statistically interesting. And then, as links are shortened, they go on a queue where they're all checked for malicious content.

And if you attempt to click through to malicious content on Bit.ly, you'll get a page that says stop - this could be really bad. And that seems to do a very good job of convincing people not to go there.
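The queue-and-warn process described in this answer can be sketched as follows. This is a simplified illustration only: the `LinkChecker` class and its method names are hypothetical, the check is a plain hostname lookup against a blacklist, and the in-house statistical detection mentioned in the interview is not modeled because the interview gives no details about it.

```python
from collections import deque
from urllib.parse import urlparse


class LinkChecker:
    """Hypothetical checker: new links are queued, then tested against a blacklist."""

    def __init__(self, blacklist):
        self.blacklist = set(blacklist)  # known-bad hostnames (assumed format)
        self.queue = deque()             # links waiting to be checked
        self.flagged = set()             # short codes found to be malicious

    def on_shorten(self, code, long_url):
        """Every newly shortened link goes onto the queue."""
        self.queue.append((code, long_url))

    def process_queue(self):
        """Check each queued link and flag the ones pointing at blacklisted hosts."""
        while self.queue:
            code, long_url = self.queue.popleft()
            host = urlparse(long_url).hostname or ""
            if host in self.blacklist:
                self.flagged.add(code)

    def on_click(self, code, long_url):
        """Flagged links get a warning page in place of a redirect."""
        if code in self.flagged:
            return ("warning", "Stop - this could be really bad.")
        return ("redirect", long_url)
```

Until `process_queue` runs, a freshly shortened bad link would still redirect, which is the window of time the answer refers to.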

FLATOW: Interesting. Now, let's say that one of the link-shortening companies goes belly up. I'm not saying Bit.ly or whatever. What happens to all those links that you've created?

Ms. MASON: That's a great question. So we have created a redundant system such that if anything happened to our servers, we can point at that redundant flat file system and all of those links will still work. We're also working in collaboration with the Internet Archive on a project called 301Works - and it's 301Works.org - where we're backing this data up, because we really do not want to break the Internet.

(Soundbite of laughter)

FLATOW: You hate it when that happens.

(Soundbite of laughter)

Ms. MASON: Absolutely.

FLATOW: Do you see the Internet morphing into anything newer? I mean, do you see, as a coder or some other - something where you'd like to see it go or any kind of - sort of like the blank check to someone like, say, if you could do anything, you had all the money to do something, what would you do with it?

(Soundbite of laughter)

FLATOW: And what would you do with it in your kind of work? What would you like to know or like to create?

Ms. MASON: Well, I think, as I said, this is the most exciting time for working with this kind of technology and data, and I think we're going to a place where we have our computers with us all the time. The computer is acting as an agent on our behalf.

FLATOW: Mm-hmm.

Ms. MASON: And if you think about all the tasks that we take - that we take part in every day, a lot of them are cognitive drudgery. They're solving the same problems we've already solved, or they're solving trivial problems. Even like arranging where to meet a friend for dinner can take 15 minutes of conversation, and that's 15 minutes you're not actually talking to your friend. So I'd like to see us get to a place where this can be solved automatically, so we can go on being human.

FLATOW: So this sort of an automated process where you say where are we going - I want to know where we're going to dinner and something takes over and rounds you all up and then says here's the spot you all agreed to?

(Soundbite of laughter)

Ms. MASON: Rather than taking over, something can make a recommendation that you as a human can approve or not approve. I think that we will see technology in many different forms. And if you look at - like what Microsoft accomplished with the Kinect interface, we'll see that in mobile in the next few years, where you just sort of have to vaguely gesture at something and it understands what you want.

FLATOW: And you're excited about working (unintelligible).

Ms. MASON: I'm incredibly excited about it.

FLATOW: Yeah. And automating things, the sky is the limit.

(Soundbite of laughter)

Ms. MASON: Yes.

FLATOW: And how many people work at Bit.ly?

Ms. MASON: Bit.ly, we're up to about 20 people right now.

FLATOW: Twenty people. They all sit in there, typing in those little shorthand things or...

Ms. MASON: Yes. So we just upgraded to smaller desks to fit more people in.

(Soundbite of laughter)

FLATOW: All right. Thank you very much, Hilary, for taking - I learned a lot today. I'm sure our listeners have also.

Ms. MASON: Oh, thank you.

FLATOW: And good luck at Bit.ly.

Hilary Mason is the lead scientist at Bit.ly, Bit.ly. What does the ly stand for?

Ms. MASON: The ly comes from one of those country codes, so they were originally intended to be countries. So the ly comes from Libya. But you'll see a lot of Internet startups using things like .it from Italy or .us from the United States.

FLATOW: Just to get the name.

Ms. MASON: Because it's really cool.

FLATOW: And there you have it. The whole raison d'etre. It's really cool.

Hilary Mason, lead scientist at Bit.ly, thanks again for taking time to be with us today.

(Soundbite of laughter)

Ms. MASON: Thank you.

FLATOW: We're going to continue talking about the Internet after the break, talking about social media and the uses of it in Tunisia and Egypt and other places. So stay with us. We'll be right back after this break.

(Soundbite of music)

FLATOW: I'm Ira Flatow. This is SCIENCE FRIDAY from NPR.

Copyright © 2011 NPR. All rights reserved. No quotes from the materials contained herein may be used in any media without attribution to NPR. This transcript is provided for personal, noncommercial use only, pursuant to our Terms of Use. Any other use requires NPR's prior permission. Visit our permissions page for further information.

NPR transcripts are created on a rush deadline by a contractor for NPR, and accuracy and availability may vary. This text may not be in its final form and may be updated or revised in the future. Please be aware that the authoritative record of NPR's programming is the audio.
