The Virtue of Forgetting in the Digital Age

by Viktor Mayer-Schönberger

Hardcover, 237 pages, Princeton University Press, List Price: $24.95


Excerpt: Delete


The Virtue of Forgetting in the Digital Age

Princeton University Press

Copyright © 2009 Princeton University Press
All rights reserved.

ISBN: 978-0-691-13861-9


Acknowledgments
CHAPTER I: Failing to Forget the "Drunken Pirate"
CHAPTER II: The Role of Remembering and the Importance of Forgetting
CHAPTER III: The Demise of Forgetting—and Its Drivers
CHAPTER IV: Of Power and Time—Consequences of the Demise of Forgetting
CHAPTER V: Potential Responses
CHAPTER VI: Reintroducing Forgetting
CHAPTER VII: Conclusions
Notes
Bibliography
Index

Chapter One

Failing to Forget the "Drunken Pirate"

Stacy Snyder wanted to be a teacher. By spring of 2006, the 25-year-old single mother had completed her coursework and was looking forward to her future career. Then her dream died. Summoned by university officials, she was told she would not be a teacher, although she had earned all the credits, passed all the exams, completed her practical training—many with honors. She was denied her certificate, she was told, because her behavior was unbecoming of a teacher. Her behavior? An online photo showed her in costume wearing a pirate's hat and drinking from a plastic cup. Stacy Snyder had put this photo on her MySpace web page, and captioned it "drunken pirate," for her friends to see and perhaps chuckle over. The university administration, alerted by an overzealous teacher at the school where Stacy was interning, argued that the online photo was unprofessional since it might expose pupils to a photograph of a teacher drinking alcohol. Stacy considered taking the photo offline. But the damage was done. Her page had been catalogued by search engines, and her photo archived by web crawlers. The Internet remembered what Stacy wanted to have forgotten.

Stacy later unsuccessfully sued her university. She alleged that putting the photo online was not unprofessional behavior for a budding teacher. After all, the photo did not show the content of the plastic cup, and even if it did, Stacy, a single mother of two, was old enough to drink alcohol at a private party. This case, however, is not about the validity (or stupidity) of the university's decision to deny Stacy her certificate. It is about something much more important. It is about the importance of forgetting.

Since the beginning of time, for us humans, forgetting has been the norm and remembering the exception. Because of digital technology and global networks, however, this balance has shifted. Today, with the help of widespread technology, forgetting has become the exception, and remembering the default. How and why this happened, what the potential consequences are for us individually and for our society, and what—if anything—we can do about it, is the focus of this book.

For some, Stacy Snyder's case may sound exceptional, but it is not. Dozens of cases of profound embarrassment, and even legal action, have occurred since then—from the attorney who cannot get the Internet to forget an article published in a student newspaper more than a decade earlier, to a young British woman who lost her job because she mentioned on Facebook that her job was "boring." By 2008, more than 110 million people had individual web pages on MySpace, just like Stacy Snyder. And MySpace wasn't the only game in town. Facebook, MySpace's direct competitor, had created 175 million pages online for individual users by early 2009. Facebook and MySpace are primarily focused on the U.S. market (although this is changing), but the phenomenon is not a purely American one. Social networking site Orkut, owned by Google, has over 100 million users, mostly in Brazil and India. A good dozen other sites around the world account for at least another 200 million users. These numbers reflect a more general trend. The first years of the Internet surge, culminating in the dot-com bubble and its burst, were all about accessing information and interacting with others through the global network (call it Web 1.0, if you want). By 2001, users began realizing that the Internet wasn't just a network to receive information, but one where you could produce and share information with your peers (often termed Web 2.0). Young people especially have embraced these Web 2.0 capabilities. By late 2007, Pew Research, an American organization surveying trends, found that two out of three teens had "participated in one or more among a wide range of content-creating activities on the Internet," with more girls creating (and sharing) content than boys. On an average day, Facebook receives some 10 million web requests per second from users around the world.
As professors John Palfrey and Urs Gasser have eloquently detailed, disclosing one's information—whether Facebook entries, personal diaries and commentaries (often in the form of blogs), photos, friendships, and relationships (like "links" or "friends"), content preferences and identification (including online photos or "tags"), one's geographic location (through "geo-tagging" or sites like Dopplr), or just short text updates ("twitters")—has become deeply embedded in youth culture around the world. As these young people grow older, and more adults adopt similar habits, Stacy Snyder's case will become paradigmatic, not just for an entire generation, but for our society as a whole.

Web 2.0 has fueled this development, but conventional publishing—paired with the power of the Internet—has rendered surprisingly similar results. Take the case of Andrew Feldmar, a Canadian psychotherapist in his late sixties living in Vancouver. In 2006, on his way to pick up a friend from Seattle-Tacoma International Airport, he tried to cross the U.S./Canadian border as he had done over a hundred times before. This time, however, a border guard queried an Internet search engine for Feldmar. Out popped an article Feldmar had written for an interdisciplinary journal in 2001, in which he mentioned he had taken LSD in the 1960s. Feldmar was held for four hours, fingerprinted, and after signing a statement that he had taken drugs almost four decades ago, was barred from further entry into the United States.

Andrew Feldmar, an accomplished professional with no criminal record, knows he violated the law when he took LSD in the 1960s, but he maintains he has not taken drugs since 1974, more than thirty years before the border guard stopped him. For Feldmar, it was a time in his life that was long past, an offense that he thought had long been forgotten by society as irrelevant to the person he had become. But because of digital technology, society's ability to forget has become suspended, replaced by perfect memory.

Much of Stacy Snyder's pain, some say, is self-inflicted. She put her photo on her web page and added an ambiguous caption. Perhaps she did not realize that the whole world could find her web page, and that her photo might remain accessible through Internet archives long after she had taken it offline. As part of the Internet generation, though, maybe she could have been more judicious about what she disclosed on the Internet. This was different for Andrew Feldmar, however. Approaching seventy, he was no teenage Internet nerd, and likely never foresaw that his article in a relatively obscure journal would become so easily accessible on the worldwide Net. For him, falling victim to digital memory must have come as an utter, and shocking, surprise.

But even if Stacy and Andrew had known, should everyone who self-discloses information lose control over that information forever, and have no say about whether and when the Internet forgets this information? Do we want a future that is forever unforgiving because it is unforgetting? "Now a stupid adolescent mistake can take on major implications and go on their records for the rest of their lives," comments Catherine Davis, a PTA co-president. If we had to worry that any information about us would be remembered for longer than we live, would we still express our views on matters of trivial gossip, share personal experiences, make various political comments, or would we self-censor? The chilling effect of perfect memory alters our behavior. Both Snyder and Feldmar said that in hindsight they would have acted differently. "Be careful what you post online," said Snyder, and Feldmar added perceptively "I should warn people that the electronic footprint you leave on the Net will be used against you. It cannot be erased." But the demise of forgetting has consequences much wider and more troubling than a frontal onslaught on how humans have constructed and maintained their reputation over time. If all our past activities, transgressions or not, are always present, how can we disentangle ourselves from them in our thinking and decision-making? Might perfect remembering make us as unforgiving to ourselves as to others?

Still, Snyder and Feldmar voluntarily disclosed information about themselves. In that strict sense, they bear responsibility for the consequences of their disclosures. Often, however, we disclose without knowing.

Outside the German city of Eisenach lies MAD, a mega-disco with space for four thousand guests. When customers enter MAD, they have to show their passport or government-issued ID card; particulars are entered into a database, together with a digital mug shot. Guests are issued a special charge card, which they must use to pay for drinks and food at MAD's restaurant and many bars. Every such transaction is added to a guest's permanent digital record. By the end of 2007, according to a TV report, MAD's database contained information on more than 13,000 individuals and millions of transactions. Sixty digital video cameras continuously capture every part of the disco and its surroundings; the footage is recorded and stored in over 8,000 GB of hard disk space. Real-time information about guests, their transactional behavior, and their consumption preferences are shown on large screens in a special control room resembling something from a James Bond movie. Management proudly explains how, through the Internet, local police have 24/7 online access to customer information stored on MAD's hard disks. Few if any of the disco's guests realize their every move is being recorded, preserved for years, and made available to third parties—creating a comprehensive information shadow of thousands of unsuspecting guests.

For an even more pervasive example, take Internet search engines. Crawling web page by web page, Google, Yahoo!, Microsoft Search, and a number of others index the World Wide Web, making it accessible to all of us by simply typing a word or two into a search field. We know, and expect, that search engines are aware of a great deal of the information available through web pages on the global Internet. Over the years, such easy-to-use yet powerful searches have successfully uncovered information treasures around the globe for billions of users. However, search engines remember much more than just what is posted on web pages.

In the spring of 2007, Google conceded that until then it had stored every single search query ever entered by one of its users, and every single search result a user subsequently clicked on to access it. By keeping the massive amount of search terms—about 30 billion search queries reach Google every month—neatly organized, Google is able to link them to demographics. For example, Google can show search query trends, even years later. It can tell us how often "Iraq" was searched for in Indianapolis in the fall of 2006, or which terms the Atlanta middle class sought most in the 2007 Christmas season. More importantly, though, by cleverly combining login data, cookies, and IP addresses, Google is able to connect search queries to a particular individual across time—and with impressive precision.

The result is striking. Google knows for each one of us what we searched for and when, and which search results we found promising enough to click on. Google knows about the big changes in our lives—that you shopped for a house in 2000 after your wedding, had a health scare in 2003, and a new baby the year after. But Google also knows minute details about us. Details we have long forgotten, discarded from our minds as irrelevant, but which nevertheless shed light on our past: perhaps that we once searched for an employment attorney when we considered legal action against a former employer, researched a mental health issue, looked for a steamy novel, or booked ourselves into a secluded motel room to meet a date while still in another relationship. Each of these bits of information we have put out of our minds, but chances are Google hasn't. Quite literally, Google knows more about us than we can remember ourselves.

Google has announced that it will no longer keep individualized records forever, but will anonymize them after nine months, thereby erasing some of its comprehensive memory. Keeping individualized search records for many months still provides Google with a very valuable information treasure it can use as it sees fit. And once the end of the retention period has been reached, Google's pledge is only to erase the individual identifier of the search query, not the actual query, nor the contextual information it stores. So while Google will not be able to tell me what terms I searched for and what search results I clicked on five years ago, it may still be able to tell me what a relatively small demographic group—middle-aged men in my income group who own a house in my neighborhood—searched for on the evening of April 10 five years ago. Depending on the size of that group, this could still reveal a lot about me as an individual. And in contrast to Stacy Snyder and Andrew Feldmar, few of us know that Google keeps such a precise record of our searches.

Google is not the only search engine that remembers. Yahoo!, the second largest Internet search provider in the world with about ten billion search queries every month, is said to keep similar individual records of search queries, as does Microsoft.

Search engines are a powerful example of organizations that retain a near perfect memory of how each one of us has used them, and they are not shy about utilizing this informational power. But other organizations, too, collect and retain vast amounts of information about us. Large international travel reservation systems—used by online travel web sites like Expedia or Orbitz, as well as by hundreds of thousands of traditional travel agents around the world—are similarly remembering what we have long forgotten. Every itinerary queried through them is stored in their computers for many months, even if we never actually book the flight. Six months after we planned our last vacation, their records can tell what destination and flight options we pondered, or whom we wanted to come along (although that person may never have made it, and may never have known she was considered). They remember what we have long forgotten.

Credit bureaus store extensive information about hundreds of millions of U.S. citizens. The largest U.S. provider of marketing information offers up to 1,000 data points for each of the 215 million individuals in its database. We also see the combination of formerly disparate data sources. Privacy expert Daniel Solove describes a company that provides a consolidated view of an individual with information from 20,000 different sources across the globe. It retains the information, he writes, even if individuals dispute its accuracy. Doctors keep medical records, and are under economic and regulatory pressure to digitize and commit decades of highly personal information to digital memory. And it is not just the private sector that aims for perfect memory. Law enforcement agencies store biometric information about tens of millions of individuals even if they have never been charged with a crime, and most of these sensitive yet searchable records are never deleted.

Nor is the United States alone in creating a digital memory that vastly exceeds the capacity of our collective human mind. In the United Kingdom, 4.2 million video cameras survey public places and record our movements. So far, limits in storage capacity and face recognition capabilities have constrained accessibility, but new technology will soon be used to identify individuals in real time (technology that, according to the BBC, was reportedly pioneered by Las Vegas casinos).