Libraries Enter the Digital Age

Several organizations are working to digitize and make available online all the information that might be hiding on the shelves of libraries around the world. Leaders in the digital libraries community talk about how they intend to go about putting every published book online.

You're listening to TALK OF THE NATION: SCIENCE FRIDAY. I'm Ira Flatow.

Should all the knowledge, opinion, prose, music, film, video and published works, everything ever written, photographed, composed, conducted, performed, watched or read, should all of those works created by generations of writers, musicians, composers, artists, scientists, you know, whomever else I'm leaving out, should all of it be available anytime to anyone to read, watch, or listen? What do you think?

This is a question that's being asked and answered by people who are actually collecting and cataloging all that material, making them available in different ways in this new digital age we live in.

And there are many issues of privacy, property, selection and censorship that are at the heart of these efforts. What should be catalogued? Do you have the right legally to do so? Who should have those rights? What about censorship and the abridgement of the works of fact and fiction? Should you be able to edit those things the way you see fit?

So, for the rest of the hour, we'll be talking about projects to create digital libraries with three of the movers and shakers behind those efforts. And if you'd like to talk about it, our number is 1-800-989-8255, 1-800-989-TALK.

Brewster Kahle is digital librarian, director and co-founder of the Internet Archive. You may know it as in San Francisco. He joins us from the studios of KQED. Welcome to the program.

Mr. BREWSTER KAHLE (Digital Librarian; Director and Co-founder, The Internet Archive): Thank you very much.

FLATOW: You're welcome. Michael S. Hart is the founder of Project Gutenberg, a volunteer effort to produce electronic books. He joins us from the studios of WILL in Urbana, Illinois. Welcome to Science Friday.

Mr. MICHAEL S. HART (Founder, Project Gutenberg): Howdy from Urbana.

FLATOW: Howdy.

Michael Keller is a university librarian, director of academic information resources at Stanford University in Stanford. He joins us from a studio on the campus. Welcome to the program.

Mr. MICHAEL KELLER (University Librarian; Director of Academic Information Resources, Stanford University): Good morning, Ira. Thank you very much.

FLATOW: Let me start with the oldest, Michael. Would you say you're the oldest, Michael Hart, the cataloging person here at the Gutenberg project? You bill yourself as the original e-book person?

Mr. HART: I guess I am. Nobody else ever steps forward and says they did it before that.

FLATOW: Tell us about the project, how it started and what it's up to these days?

Mr. HART: Well, it was a lot of luck, being the right person at the right place at the right time. I just happened to be hanging around one of the early computers here just a couple blocks away from where I am now. And I kind of figured out how to use it by watching and came back from the Fourth of July fireworks one night and found out they had given me an account to play with. And I thought I should do something with it because they had put $100,000 or $100 million dollars in there, some ridiculous amount from a college student's perspective.

And as luck would have it, on the way over I had picked up something to eat. They had put a copy of the Declaration of Independence in my backpack with my munchies. And when that fell out, the light literally went on over my head and I knew what I had to do. I typed it in on this 50-year-old Teletype machine and it became the very first file in Project Gutenberg.

FLATOW: What year was that?

Mr. HART: That was 1971.

FLATOW: Wow. And so you be - you started hand typing everything?

Mr. HART: Oh, it was worse than hand typing. You can't type very fast on a Teletype machine.

(Soundbite of laughter)

FLATOW: Gee, little paper tape came out the other side of it?

Mr. HART: That's right. A six-bit ASCII tape.

FLATOW: I remember that - I remember those days.

Mr. HART: I still have one in my basement.

FLATOW: So what, what are you doing now? Where have you come now - what, almost 40 years later?

Mr. HART: Project Gutenberg now has over 100,000 e-books, almost 25,000 that we have prepared ourselves and 75,000 or 80,000 that have been donated by over 150 electronic libraries around the world.

FLATOW: Now, are these all available to anybody or are they copyright? What do you - how do you deal with the copyright issues on this?

Mr. HART: We have permission to do the copyrighted ones. We have several thousand copyrighted ones in addition to a great many public domain works.

FLATOW: Brewster Kahle, let's talk about your project. How does yours compare to the - to Michael's?

Mr. KAHLE: Oh, we see ourselves as dovetailing with Michael's, and actually we source a lot of materials to Michael's project. The Open Content Alliance and the Internet Archive are working with libraries around the world to take images of the pages of books and then put those through a set of processing steps so that you can run optical character recognition so that you can search those books and create searchable, downloadable, printable books that look like the original book.

And the Project Gutenberg volunteers take some of those books and correct the faulty optical character recognition that the machines do to make beautiful electronic books that could go on to handheld devices. We see ourselves as part of the whole ecology of bringing books from libraries, making open public copies of them available for many, many different types of uses.

FLATOW: And you use all copyrighted material too?

Mr. KAHLE: The Internet Archive is working mostly on out-of-copyright works now to work through some of the issues of what libraries do, how does all the technology work. We're scanning about 12,000 books per month in eight cities, in eight libraries, in three countries. So, we now have the infrastructure for doing very cost-effective mass digitization in the open. Our next step beyond going with out-of-copyright is to go for orphan works.

These are things that are under copyright but you can't find anybody to answer the phone or license it. They're kind of lost. They're orphans, they're sort of wandering the desolate wasteland of our intellectual property regime. And those - that's the next focus for us and we're working through the courts and through the legislature to make sure that libraries in the digital world will be allowed to have out of print works on their digital bookshelves.

FLATOW: This is a legal problem you're working through.

Mr. KAHLE: Yes, but it's working out pretty well in the orphan works and out-of-print works area. There was a call by the copyright office to get people's input as to what should we do with these works because in 1976, the United States Congress expanded copyright massively and it caused a problem especially in this digitization world where we don't want things just on physical bookshelves and books in libraries anymore, we want them online. So, at least the out of print works looks like that's in pretty good shape. The in-print works…

FLATOW: Now, Michael. I'm sorry, go ahead.

Mr. KAHLE: The in-print works are largely coming up from the publishers themselves because they're commercially viable. And so those are starting to be reformatted and you'll see announcements out of major bookstores in the future.

FLATOW: Now, Michael Keller, you're part of the Google effort, right? Google is looking to scan everything, copyrighted or not copyrighted.

Mr. KELLER: Well, it is true that we are a part of the Google Book Search project. But let me put it in a slightly different context and give you some more details on the thing.

The first is here at Stanford we've been digitizing materials for the last actually 20 or 30 years too. A couple of examples are digitizing the archive of the General Agreement on Tariffs and Trade, which is the predecessor organization to the World Trade Organization. Ultimately as those nations derestrict those documents, the history - the documentation for the history of the global trading scheme up to the formation of the WTO will be available for study.

The other thing, another interesting project we're working with is with our partners at Corpus Christi College Cambridge, digitizing manuscripts dating from the 6th century to the 15th century, illustrating the, essentially, the separate lineage of the Church of England from the Church of Rome. Those documents will be available on the Web, some of them shortly in good enough resolution for people to work on those manuscripts from wherever they happen to be in the world.

And then, of course, the big - the big project is Google Book Search project. They have digitized well over a million titles to date, a great many of them public domain titles and they will be providing indexing for millions of other titles as this project proceeds.

The contribution Stanford is making to that, presently, is supplying books from our collections, books that are either public domain or out of copyright. Anywhere from 1,500 to 3,000 books a day go over to Mountain View for scanning, and then they go through indexing and quality control processes, ultimately leading to both the indexing, which is the main thrust of the project, and, in the case of books that are in the public domain, Google displays the pages.

FLATOW: Is Google facing any copyright issues from scanning books that are still in copyright?

Mr. KELLER: Google has been sued by the Authors Guild and five publishers representing the Association of American Publishers. The scanning of copyrighted works in order to perform the indexing is the major issue and we have - the suit is in progress. Many of us have been asked to submit documents and are facing depositions and whatnot. It's a long complex process. I'm relatively sure that it's not going to be over soon.

FLATOW: What is the point of it?

Mr. KELLER: The point of the project is this and I illustrate it in two different ways. The point is that we have locked up in these books and in these big collections of books, as both Michael Hart and Brewster Kahle have demonstrated with their projects, a great deal of information and knowledge that is difficult to discover. It's difficult to discover because our ordinary cataloguing routines provide very brief descriptions and very brief numbers of subject catalog, subject entries to identify what's in the books.

So, when we digitized our card catalog at Stanford over the period '87 to '97, at the end of that period, we noticed that we were circulating a lot more books to the Stanford faculty and student body. That means that they were able to use the keyword indexing that we were providing just on those metadata, on those catalog entries to find more books of relevance.

So, we think that if they're able to search these books word by word, they'll be able to discover lots more books of relevance, either in our collections or other collections.

A side benefit of that project is that we are getting back from Google, digital files that we will preserve, we will preserve digitally to hedge against the factors of acid paper and overuse and so forth that ultimately destroys a great many books.

FLATOW: Google has been open to criticism in the past about censorship. I'm recalling that agreement with China, in which Google agreed to censor information the Chinese did not want their people to read. Would that hold true of this project also?


FLATOW: Have they said that they will no longer - in other words, let's say I search for a passage of a book about the Falun Gong group which China has been very vocal against and would Google allow people in China to see a passage of that book?

Mr. KELLER: Ira, I don't know the answer to the question. I do know that the Chinese have filtering services, un-services, perhaps and people, all working night and day to filter those kinds of inquiries anyway. It's probably not a Google function at all. It's probably a function of some larger state-run censoring operation in China.

FLATOW: But as far as you know there is no agreement by Google to keep those kinds of requests outside their domain?

Mr. KELLER: As far as I know there's no such agreement.

FLATOW: I've been asking Google for that and we haven't gotten a word back on that either.

Mr. KELLER: Well, don't take me as authoritative, of course. As far…

FLATOW: Well, actually, actually Google put you up here. We asked Google for reference.

Mr. KELLER: Yeah. Yeah. Yeah. But that doesn't mean that I read their stuff.

FLATOW: I asked somebody for somebody to comment for Google and your name came up on the list.

Mr. KELLER: But I am specifically saying that I'm not representing Google. I'm representing Stanford. I'm commenting but I'm not representing Google.

FLATOW: I understand. Well, when those Google partners want to come on, we'll be happy to have them.

We're talking about books this hour and digital archiving of them on TALK OF THE NATION: SCIENCE FRIDAY from NPR News. I'm Ira Flatow talking with Brewster Kahle, Michael Hart and Michael Keller.

Brewster, you have been quoted as abhorring the commercialization of these kinds of projects.

Mr. KAHLE: I am not - I'm not actually against the commercialization. I'm against the locking up of the public domain. The public domain is small enough as it stands. What we want is people to thrive on it, use it, cut it up, make new and different things. And there are commercial projects out there that are putting restrictions on the public domain and that I think is not something that we in the library world should really put ourselves behind.

FLATOW: Give me, give me an idea of some of those projects that you're talking about and how they're putting restrictions on it?

Mr. KAHLE: Well, I think Michael Keller can answer some more about what's in the agreement between Stanford and Google. But there are restrictions between the University of California and Google such that the works can pretty much not be downloaded in bulk for - even for research and educational use by the general public. That basically Google will turn you off, as they've told people, even if it's research and educational use of public domain materials from their servers, and they've restricted at least the University of California and University of Michigan from offering those works to the general public for bulk use.

FLATOW: Michael Keller, is that true?

Mr. KELLER: It is true. But let's be sure to understand that Brewster's version of the ideal, while it is clearly desirable, may be, so baldly spoken, the enemy of the good. The truth is that most citizens do not get the opportunity to read a great many public documents, indeed, including our own government documents. And these projects, all of them, but especially Google because of the indexing of the mass of materials, provide access so citizens can read. They can at least read the documents. And I believe there are some facilities for some of the documents to be downloaded one upon one for use.

I think we have to look at how these projects are benefiting. It's certainly appropriate to have - to raise the bar, to raise expectations to make requests but at the same time, we have to recognize that Google is making a fantastic investment, a gigantic investment in making these documents, which otherwise would not be seen at all - except by the few that could get to our libraries - visible to citizens all over the world.

FLATOW: They must think they're going to get a return on it. These are very smart guys.

Mr. KELLER: They probably will get a return on it in the form of additional traffic to their site but - and they may get a return on it by appending advertisements to those entries that have to do with books that are for sale, presumably under agreement with the publishers. That return, of course, is a small price to pay which most of us don't pay at all for the ability to read and the ability to use that great big index.

FLATOW: Brewster, any reply?

Mr. KAHLE: I think let's focus on what the dream is and there are going to be some pieces for the commercial guys to go and make indexes and distribute information in new and different ways and I'm all for that.

But the dream is universal access to all knowledge. It's building the great library and actually your conversation with the Encyclopedia of Life and their piece of that which is the Biodiversity Heritage Library, I think it's a very shining example of what we actually want to build here.

The Biodiversity Heritage Library is scanning works from about 10 different libraries around the world of all the information about species to make that available on the World Wide Web for free download.

What was demonstrated two days ago at the Smithsonian was the early works of this, about a million pages of information from works from all over the world, were then taken by a completely different organization, and the names of plants and animals were pulled out from those and the locations of those were cited in and made new programs, a new Web site was made available.

FLATOW: All right. First I have to tell you to hang on. We will be right back after this short break.

I'm Ira Flatow. This is TALK OF THE NATION: SCIENCE FRIDAY from NPR News.

(Soundbite of music)

FLATOW: Our health care system is ailing, but is it terminal? Can it be saved? I'm Ira Flatow. Join me on SCIENCE FRIDAY as the talk turns to solutions, including a plan backed by big business that would insure everyone. Plus a new bill offers protection against genetic discrimination. What could it mean for patients? That's on TALK OF THE NATION from NPR News.

(Soundbite of music)

FLATOW: We're talking this hour about electronic books and electronic libraries with Brewster Kahle of the Internet Archive, Michael Hart, founder of the Project Gutenberg, Michael Keller, university librarian at Stanford University.

Our number 1-800-989-8255. When I rudely interrupted Brewster, he was telling us about his vision, his dream for this project. Go ahead; you can continue, Brewster.

Mr. KAHLE: The idea of universal access to all knowledge is within our grasp. We just have to stay true to it. And the new and different uses of these materials, I don't think we want to lock up. So having public domain materials be locked up makes no sense. There are new and different ways that people are using the information on Wikipedia, on the Internet Archive, on the World Wide Web in general that I don't think we should put under corporate control. Certainly not yet.

FLATOW: (Unintelligible)

Mr. KELLER: So Brewster…

FLATOW: Go ahead.

Mr. KELLER: Would you rather not have those documents available at least so that people could read them online one by one and search many of them? Would you rather that we wait until the perfect scheme has been evolved?

Mr. KAHLE: I think we're getting there actually quite rapidly. One thing that Google did show is that it's very inexpensive to scan books. We found that with the libraries that we're working with that it's about 10 cents per page, so it's about $30 a book to digitize a book in such a way that it would be searchable and available on multiple sites around the world.

So, what would it cost to do a million books? If it's $30 a book, it's about $30 million. And the library system in the United States is about a $12 billion a year industry, if you will. So, $30 million is not that much to get a million books up. You say, well, a million books isn't enough. We need 10 million books. Then that's, with current technology, $300 million which is still a small percentage of the yearly fees that we as citizens spend on our libraries.
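The arithmetic Mr. Kahle runs through here can be checked with a short sketch. The dollar figures are the ones quoted in the conversation; the pages-per-book value is only what those figures imply, and the function names are illustrative assumptions.

```python
# Figures quoted in the conversation above.
COST_PER_PAGE_USD = 0.10                 # "about 10 cents per page"
COST_PER_BOOK_USD = 30                   # "about $30 a book"
LIBRARY_BUDGET_USD = 12_000_000_000      # "about a $12 billion a year industry"

# Implied average book length (an inference from the two costs, not a quoted figure).
pages_per_book = round(COST_PER_BOOK_USD / COST_PER_PAGE_USD)


def scanning_cost(num_books):
    """Total cost in dollars to digitize num_books at the quoted per-book rate."""
    return num_books * COST_PER_BOOK_USD


def share_of_budget(cost):
    """Fraction of one year's library spending that a given cost represents."""
    return cost / LIBRARY_BUDGET_USD


for books in (1_000_000, 10_000_000):
    cost = scanning_cost(books)
    print(f"{books:,} books: ${cost:,} ({share_of_budget(cost):.2%} of one year's budget)")
```

Under these figures, a million books comes to a quarter of one percent of a year's library spending, and ten million books to two and a half percent.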

I see a purpose for our library system going forward for digital services coming from our libraries themselves, as well as from search engines and new and different things that we haven't even dreamed of. That's the world that we're trying to help build.

Mr. HART: Okay, Ira, can I get a word in here?

FLATOW: Absolutely, go ahead, Michael.

Mr. HART: I haven't said anything for 18 minutes. I thought I'd like to…

FLATOW: You can. I was hoping you would push through the fog here. Go ahead.

Mr. HART: The issue of the public domain: a hundred years ago, half of everything that had ever been copyrighted had expired into the public domain. Under the current law, a hundred years from now, that will be half a percent, it will be virtually nothing of the entire corpus of the human work that's in the public domain.

Now in terms of Brewster's ideal, he and I agree on so much of almost everything that I almost hesitate to say, but the one difference we really have is that he doesn't - he talks about $300 million and Project Gutenberg has never spent a million dollars in doing everything because we have an army of 50,000 volunteers around the world that do all of our work.

Our goal is for people to be able to actually own their own personal library in the same way that they own their own personal computer. I'm wearing around my neck a little four gigabyte RAM stick - flash drive, thumb drive, whatever you want to call them - and it will hold about 10,000 books. Two days ago, I bought three of these at one of the local box stores for under $100. That's enough space to hold all the books in the average United States public library, 30,000 books and the total weight is 1 ounce. No batteries needed, you just carry it with you. You can stick it in your iPod or anything else that has a USB port, it's just ready to go.

Everything can be put on there, you can download the Human Genome if you want, we have three editions of that. You can download pi to a million places. You can download the complete works of Shakespeare. You could put a thousand copies of the complete works of Shakespeare on here if that was your goal.

And the same kind of thing for hardware, hardware. We bought a terabyte of outboard hard drive this week for $256. We bought a terabyte of inboard hard drive for $216.

FLATOW: That's a thousand billion bytes, right?

Mr. HART: Yeah, it's enough to hold a million books uncompressed - I was using compression on my little RAM stick. And if you use compression you could download pretty much every one of the two and half million books that are on the Internet.

And this counts only the ones that are freely available. We're not counting the ones that Google has scanned that they won't let anybody have.
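Mr. Hart's storage figures hang together, as a rough sketch shows. It assumes decimal units (a gigabyte as a billion bytes, a terabyte as "a thousand billion bytes") and takes the quoted book counts at face value; the per-book sizes are derived here, not stated in the conversation.

```python
GB = 10**9    # decimal gigabyte (assumption)
TB = 10**12   # decimal terabyte, "a thousand billion bytes"

# Figures quoted in the conversation above.
stick_bytes = 4 * GB
books_per_stick = 10_000          # with compression
drive_bytes = 1 * TB
books_per_drive = 1_000_000       # uncompressed

# Derived average sizes per book (inferences, not quoted figures).
compressed_book_bytes = stick_bytes // books_per_stick
uncompressed_book_bytes = drive_bytes // books_per_drive
compression_ratio = uncompressed_book_bytes / compressed_book_bytes

# How many compressed books would fit on the one-terabyte drive.
books_on_drive_compressed = drive_bytes // compressed_book_bytes

print(f"compressed book:   {compressed_book_bytes:,} bytes")
print(f"uncompressed book: {uncompressed_book_bytes:,} bytes")
print(f"compression ratio: {compression_ratio}")
print(f"compressed books per terabyte: {books_on_drive_compressed:,}")
print(f"books on three sticks: {3 * books_per_stick:,}")
```

On these assumptions, a compressed terabyte holds two and a half million books, which matches the number of freely available books Mr. Hart cites.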

FLATOW: So Michael, how would you sum up what you're saying? What's the point you're trying to make here?

Mr. HART: Well, I'm answering the question that you asked Brewster, which was what's the ideal. And to me, the ideal is for everyone, by the year 2021, to be able to own a library - not of a million books - a billion books. Now, let me run that by you slowly because everybody hates me when I use big numbers. But Brewster and Mr. Keller will tell you that there are 20-some-odd million books in the public domain.

When we do half of those - and the rate will increase until we get to half of them - then we'll have to search the attics and basements harder - but we will get to 10 million of those relatively quickly. It would be faster every year. And that's with or without Google, with or without anybody, it's going to happen. Just look at how fast it's been growing already.

Now, once we get to 10 million, all you have to do is translate every one of those books into a hundred different languages, and you have a billion-book library.

FLATOW: All right, let me go to - because we're running out of time. I wanted - there are a lot of people who have questions. Let's go to Steven(ph) in Raleigh, North Carolina. Hi, Steven.

STEVEN (Caller): Hi. I'm interested in your method of storage. You know, lots of effort is going into the digitizing and archiving and collecting the information. And - like how are you ensuring the long-term decades of data integrity for each bit of what's stored in your archive itself that everybody pulls from to have their little stick of information in front of them? Like, are you using multiple drives or tape or, you know, what is it that's going to stick around for decades or centuries?

Mr. HART: Well, I'd like to quote Linus Torvalds on that, the inventor of Linux. He said real men don't use backups. They put their stuff on FTP and let the whole world download it.

FLATOW: Brewster, do you agree with that?

Mr. KAHLE: We do believe in the idea of having multiple copies. If there's one lesson from the Library of Alexandria version one, it's don't just have one copy. So we've gone and made a copy of all that we have and put it in, actually, the new Library of Alexandria in Egypt, and also a partial copy in Amsterdam. So the idea is to not just have things in one fault zone, say, in San Francisco, but say in the flood zone of Amsterdam and then the Middle East.

And if we had multiple copies of the Library of Alexandria 2000 years ago, we would have the works of Aristotle now.

FLATOW: Let's go to Don(ph) in Tulsa. Hi, Don. Hi. Welcome to SCIENCE FRIDAY, Don. Don, are you there?

Mr. KELLER: This is Mike Keller. Let me tell you what we're doing in Stanford.

FLATOW: Go ahead, Mike.

Mr. KELLER: We are - we have built out two different methodologies. One is called LOCKSS, which is an acronym for Lots Of Copies Keep Stuff Safe. Essentially, that creates, at very little cost, multiple nodes that share content and then perfect that content when the mother files disappear from the network.

The other approach that we're using is what we call managed care for bits and bytes. We're very much focusing on preservation of individual bits, and we do expect this archive to migrate formats. And as operating systems and applications and data formats change, it involves spinning disks, it involves near-line tape, it involves offline tape. And ultimately, it will involve some copies at other locations around the world.
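The "lots of copies" idea Mr. Keller describes, where nodes holding the same content compare it and repair damaged copies, can be illustrated with a toy sketch. This is not the actual LOCKSS protocol: the majority vote over SHA-256 digests, the node names, and the function names are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def digest(data):
    """SHA-256 fingerprint of one replica's bytes."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas):
    """Overwrite replicas that disagree with the majority copy.

    replicas maps a node name to the bytes that node holds.
    Returns the names of the nodes that were repaired.
    """
    votes = Counter(digest(data) for data in replicas.values())
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(replicas) // 2:
        raise ValueError("no majority agreement; cannot repair safely")

    good_copy = next(d for d in replicas.values() if digest(d) == winning_digest)
    repaired = []
    for node, data in replicas.items():
        if digest(data) != winning_digest:
            replicas[node] = good_copy   # replace the damaged copy
            repaired.append(node)
    return repaired


# Three hypothetical nodes; one copy has suffered simulated bit rot.
replicas = {
    "node_a": b"Declaration of Independence",
    "node_b": b"Declaration of Independence",
    "node_c": b"Declaration of Indepxndence",
}
repaired = audit_and_repair(replicas)
print(repaired)
```

The sketch refuses to repair when no majority exists, since with too few agreeing copies there is no safe way to tell which one is correct.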

FLATOW: Let's go to…

Mr. KELLER: It's very capacious.

FLATOW: Interesting. Let's go to Ahmed(ph) in Chicago. Hi. Welcome to SCIENCE FRIDAY.

AHMED (Caller): Oh, yeah. Hi, Ira. Thanks to you and your guests. I just wanted to mention that I would urge the guests to focus on the countries that are vulnerable to war and looting, like, for example in Iraq, where the invasion, you know - the libraries were burned and books are still vulnerable in Iraq.

In Yemen, there's some conflict with the Zaidi-Shia community are saying that the government is burning their books. And I hope we don't attack Iran. But if we do, then, I mean, the books there will be vulnerable. So I think more important than discovering books that are hard to access, is discovering books that will otherwise be impossible to access because they would be destroyed. So I just wanted to urge that, and I thank you all for what you're all doing.

FLATOW: Thanks for the call.

Mr. KELLER: That's an excellent point. This is Mike Keller. And I - let me tell the caller that Google and Stanford and the Library of Alexandria are working on a project to digitize Arabic language books. It's a difficult problem set because of the problems with OCRing the various styles of Arabic fonts and scripts. But we are working on that, and I very much agree with his, with his theme.

FLATOW: Brewster, how are you going on digitizing all the Web pages on the Internet?

Mr. KAHLE: So we've used the same technology as the search engines do, of going and contacting Web sites and downloading a page and then finding the links that those pages linked to and following, following and following those, until we get a snapshot of the World Wide Web. We've been doing this for 10 years, every two months. So a full public snapshot of every publicly available Web site and all the pages on them.

And it's getting, actually, very, very large. When we started in 1996, it was about 30 million pages. And now it's over 4 billion pages in each snapshot. The amount of information comes in petabytes, which is the next thing after terabytes. And we have made a Web site available called the Wayback Machine on so that end users - anybody - can go and browse the Web as it was, surf the Web as it was.

And this has been very, very popular, much more popular than we thought, and it's mostly people going back and seeing their old stuff that they didn't actually keep their own backup of.
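The crawling process Mr. Kahle outlines, downloading a page, finding its links, and following them in turn, is a breadth-first traversal. The sketch below runs over a small in-memory stand-in for the Web rather than the network; the page names and the `fetch_links` callback are hypothetical, and this is not the Internet Archive's actual crawler.

```python
from collections import deque

# Hypothetical miniature "web": each page maps to the pages it links to.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
    "unlinked.example/": [],   # nothing links here, so a crawl never finds it
}


def snapshot(seed_pages, fetch_links):
    """Visit every page reachable from the seeds, breadth first.

    fetch_links(page) stands in for downloading a page and extracting
    its links. Returns the pages in the order they were visited.
    """
    seen = set(seed_pages)
    queue = deque(seed_pages)
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)
        for link in fetch_links(page):
            if link not in seen:      # follow each link only once
                seen.add(link)
                queue.append(link)
    return visited


pages = snapshot(["a.example/"], lambda page: TOY_WEB.get(page, []))
print(pages)
```

A page that no other page links to never enters the queue, which is one reason a link-following snapshot covers only the publicly reachable part of the Web.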

FLATOW: Do you have any copyright issues with this?

Mr. KAHLE: Well, all of the works - or almost all the works on the World Wide Web are copyrighted. And so we are making copies of these works and make them for the library of the publicly accessible Web sites and making them available again. The way that we view the issue is very much like a library, a physical library. We're just putting them on the shelves. We're not representing that they're any different than they ever were.

And if people don't want their things in the archive, then they just ask us to take it out, or they put a robot exclusion on their Web site and we will retroactively remove them from the Wayback Machine. And this has worked out for, basically, almost everybody. And it seems to be a balance that works. We're respectful. We represent them not as our own, but as where they came from and when they came from. And it seems to be working.
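The "robot exclusion" Mr. Kahle mentions is a robots.txt file that a site publishes to tell crawlers which paths to stay out of. A minimal sketch of how a crawler might honor one, using Python's standard `urllib.robotparser` module, follows. The user-agent name `example-archive-bot` and the site URLs are hypothetical, and the sketch covers only the permission check, not the retroactive removal he describes.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that keeps one crawler out of /private/.
ROBOTS_TXT = """\
User-agent: example-archive-bot
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse rules from text, no network access


def may_archive(url, user_agent="example-archive-bot"):
    """True if the site's exclusion rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_archive("http://site.example/index.html"))
print(may_archive("http://site.example/private/diary.html"))
```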

FLATOW: Talking…

Mr. HART: So, Ira…

FLATOW: Well, let me - I have to pay a bill here for a second. Stay with us, we're talking about SCIENCE FRIDAY. We're talking about archiving on SCIENCE FRIDAY, from TALK OF THE NATION: SCIENCE FRIDAY from NPR News. Talking about archiving everything. Michael, do you want to jump in there?

Mr. HART: Yeah, so, I…

Mr. KELLER: I assume you mean me. No? This is Mike Keller.

FLATOW: Well, go, Mike Keller.

Mr. KELLER: You never asked me anything.

(Soundbite of laughter)

Mr. KELLER: You asked me one question. It was the first one.

FLATOW: You are free to jump in anytime. You've got to be, you know, these are tough guys out here.

Mr. KELLER: Oh, I know, Brewster. So let me draw in another set of players to this conversation. Brewster referred to the Internet Archive's practice of collecting the publicly accessible Web and providing access to it. I think that's a terrific service and a great project.

However, the publicly accessible citation is of great interest, because the publicly accessible part of the Web is really only something like 10 percent of everything that's available. We have to deal with and recognize the rights and interests of the creators and the publishers who put out information, having massaged it and marketed and generated it in some cases.

And despite the fact that the current way that the copyright laws in America are written, that this material is going to be locked up for a very long time, something that I agree entirely with both Michael Hart and Brewster Kahle about, we have to find a way to bring in these other players. And the Google Book Search project is precisely trying to do that in order to provide deep access to works in which there are still copyright interests active.

It's good to be able to see 10 percent. It's good to be able to see the public domain works. It's wildly important to be able to index and understand what's in works that are available for some fee or another, or available through a library in order to advance knowledge and advance study and teaching, and to serve small schools and small libraries and individuals in distant places that are distant from libraries.

FLATOW: Oh, Michael Hart, you got the last word.

Mr. HART: Okay. I'm going to sneak in here for 30 seconds.

FLATOW: Yes. Go ahead.

Mr. HART: In terms of the numbers of copies and keeping everything backed up, if you do a search for Project Gutenberg and Hamlet, you'll find over 10,000 copies of Hamlet out there. That's the best backup you can have. No matter how many gateways are down, you're still going to find it. Now, in terms of this whole thing about copyright being locked away forever, I'm afraid Michael Keller was all too right about that.

Every time a copyright is going to expire, especially for the Mouse, copyright becomes extended all of a sudden. So they tell us, if you leave this alone for 26 years, 28 years, 14 years, whatever the period was at the time, we'll let you have it after that period. But that period never comes. And I see we're out of time.

FLATOW: But Michael Keller was saying that there has to be - you have to find a way of dealing with this, and he thinks the Google people are doing it. How do you react to that, Michael?

Mr. HART: I'm not sure that there is a way to deal with a wolf that's got a hold of your foot. The copyright gets longer and longer. And unless something seriously changes, the trend of the last 300 years - you started with a 14-year copyright, is going to end up with a permanent copyright. Nothing will - I'm not quite sure that any copyright will ever expire in the United States ever again.

FLATOW: Michael Keller?

Mr. KELLER: I think there is that prospect. But at the same time, I know that the publishers and the Congress would like to do the right thing. And I would love to see us harmonize our copyright laws with our present patent laws. I think that the copyright industry would continue to make its money, and I think there are some possibilities. But at the same time, I think we have to expect that as a social good, all those who create content should allow that content to be thoroughly indexed.

And the act of indexing that content should not be a criminal act. Making information accessible - that's not making the pages necessarily accessible for free, but making the information about those pages and from those pages free, freely accessible, is an important public good.

FLATOW: All right. We've run out of time. I'd like to get back to this. This is a very interesting and developing issue. And we've run out of time. I'd like to thank all of my guests, Brewster Kahle, digital librarian and director and co-founder of the Internet Archive; Michael Hart, founder of Project Gutenberg; and Michael Keller, university librarian, director of academic information resources at Stanford University in Stanford. Thank you, gentlemen, for taking time to be with us today.

Mr. KELLER: Thank you very much, Ira.

Mr. KAHLE: Thank you.

(Soundbite of music)

FLATOW: I'm Ira Flatow in New York.

Copyright © 2007 NPR. All rights reserved. Visit our website terms of use and permissions pages at for further information.

NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.