How DNA could provide data storage for more than our genetics
MANOUSH ZOMORODI, HOST:
Today on the show, preserving our past, present and future for all eternity, which begs the question, where are we going to keep all that information? And how will we keep it safe?
DINA ZIELINSKI: Absolutely. One of the biggest challenges is the sheer amount of data that we're generating. And I don't think we really realize, most of us, where all of that data is being stored. OK, maybe I'll pay a few more bucks a month so that I can store my data in the cloud. But what is that cloud, you know? It's this huge server facility.
ZOMORODI: This is molecular biologist Dina Zielinski.
ZIELINSKI: So it's projected that by the year 2025, the global datasphere, as in, like, all of the digital data that is out there, is projected to exceed more than 175 zettabytes. And one zettabyte is equal to a trillion gigabytes. So it's really become an urgent problem. You know, we've come a long way in digital data storage and computing. But we're running out of storage devices quite literally.
ZOMORODI: Dina says there is a possible solution to our data storage problem, a microscopic solution that's been around for billions of years - DNA.
ZIELINSKI: I actually have some in my fridge here. You can...
ZIELINSKI: ...Store - yeah, you know? That was a collaboration I had with a local artist. But we started a digital museum on DNA. And so it's currently in my refrigerator.
ZOMORODI: Wait. So you have a museum on DNA in your refrigerator right now.
ZOMORODI: I love the image of you, like, going to get a, like, scoop of strawberry ice cream and seeing this little vial of DNA holding all this data right next to it.
ZIELINSKI: Yeah. I forget it's there sometimes, and then I move...
ZIELINSKI: ...My mustard jar over, and there it is.
ZOMORODI: So great.
ZIELINSKI: The idea of DNA data storage is actually not new. And we can actually say that DNA is the original storage device. It stores all of the data that makes up living beings. And so it's been optimized over literally billions of years.
ZOMORODI: Dina says you can use DNA just like a hard drive or a floppy disk, but it does an even better job.
ZIELINSKI: It's true. I mean, it's absolutely true. It can beat any man-made device in terms of longevity, density. It has a very small ecological footprint. I mean, it's biocompatible. It's something that we'll always be interested in. And we can think of DNA as just another storage device. So instead of a compact disc or a floppy disk or a hard drive, we simply use the molecule of DNA.
ZOMORODI: Yeah. And when it comes to DNA's durability, I have heard you use the example of Otzi the iceman. This is a 5,000-year-old man whose body was found frozen by hikers in the Alps in the '90s.
ZIELINSKI: Oh, yes, Otzi the iceman. So he was up in the Alps, and it was nice and cold and dry. And these are sort of the ideal storage conditions. And all that is to show that, you know, even after thousands of years, we're still able to extract meaningful information from the DNA. So DNA in itself is very fragile, actually. If I just have DNA in a tube, for example, it's very susceptible to UV radiation, and it is really only stable at room temperature for a relatively short period of time. And then it starts to sort of degrade and accumulate errors. But if you store it very cold and very dry, we could theoretically store data that's critical to humanity in such a place, in a naturally existing cold and dry place, like Otzi the iceman.
ZOMORODI: Here's Dina Zielinski on the TED stage.
(SOUNDBITE OF TED TALK)
ZIELINSKI: Storing data on DNA is not new. Nature's been doing it for several billion years. In fact, every living thing is a DNA storage device. But how do we store data on DNA? We first learned to sequence or read DNA and, very soon after, how to write it or synthesize it. This is much like how we learn a new language. And now we have the ability to read, write and copy DNA. We do it in the lab all the time. So anything - really, anything that can be stored as zeros and ones can be stored in DNA. To store something digitally, we convert it to bits or binary digits. Each pixel in a black-and-white photo is simply a zero or a one. And we can write DNA much like an inkjet printer can print letters on a page. We just have to convert our data - all of those zeros and ones - to As, Ts, Cs and Gs. And then we send this to a synthesis company. So we write it. We can store it. And when we want to recover our data, we just sequence it.
ZOMORODI: OK, so is there a machine that does this, that says, right, there's a zero and a one, and so I'm going to turn that into an A or - and a C or whatever else it would be and transpose it or write it onto this synthetic material?
ZIELINSKI: Right. So that is actually the easiest part of this whole process. So we can theoretically store up to two bits, or two zeros or ones, of data per DNA letter, right? So you have a possibility of a zero or one stored in A, T, C or G. And so the encoding scheme is actually very simple. So if you have a 00, that becomes an A, 01 becomes C, 10 becomes G, and 11 becomes T. And that's it. It's just a text file, right? So you have a text file at the end of this whole process with just sentences of As, Ts, Cs, and Gs, and we send that text file to a synthesis company. And in a matter of days, depending on the complexity of your DNA sequences, they'll send you back a tube with a sort of dried powder containing your DNA.
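(Editor's illustration, not part of the broadcast.) The two-bits-per-letter mapping described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `encode_bytes` and the table name `BITS_TO_BASE` are hypothetical, and the code uses only the naive mapping quoted in the interview. It does nothing about the errors that, as mentioned earlier in the conversation, accumulate as DNA degrades.

```python
# Mapping quoted in the interview: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
# The names below are illustrative, not taken from any published tool.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_bytes(data: bytes) -> str:
    """Turn raw bytes into a string of DNA letters, two bits per letter."""
    # Write every byte as eight binary digits, most significant bit first.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Eight bits per byte means the bit string always splits evenly into pairs.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    # The letters printed here are what would go into the text file
    # that is sent to a synthesis company.
    print(encode_bytes(b"Hi"))  # prints CAGACGGC
```

Each byte becomes four letters, so a file of N bytes yields a sequence of 4 x N letters under this simple scheme.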
ZOMORODI: With a dried powder.
ZIELINSKI: Yes. So the DNA is quite stable if you remove the moisture from it. And so usually it comes in a tube with just sort of this thin film on the bottom.
ZOMORODI: OK, so you've got the tube with the dried powder. How long can you leave it in there for?
ZIELINSKI: So if you store it, for example, in your refrigerator, in your kitchen, I mean, it can easily last years, even tens of years. Ideally, you would freeze it and that would further extend the life of the DNA.
ZOMORODI: Even for thousands of years, like Otzi?
ZIELINSKI: Yes. So, I mean, when we synthesize DNA, it's still fragile on its own. But if you keep it protected - so we can store DNA in silica beads - so glass beads - and some researchers at ETH in Zurich, Robert Grass' team, have shown that the DNA is stable for thousands of years. And they've replicated sort of a European climate. And they showed that by inducing temperature and other stress, they were still able to recover the data from the DNA.
ZOMORODI: And let's say you were, like, today's the day. I'm going to get the data out of the DNA. How would you go about doing that?
ZIELINSKI: So just add water. So you take your...
ZIELINSKI: ...You just take your tube of DNA, you add water to it, and then you load it onto the sequencer and then you simply convert your As, Ts, Cs and Gs back to zeros and ones and decode the data.
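(Editor's illustration, not part of the broadcast.) The last step described here, converting sequenced letters back to zeros and ones, could look roughly like the sketch below. It assumes the same letter assignment quoted elsewhere in the interview (A = 00, C = 01, G = 10, T = 11) and assumes the sequencer returned the letters without errors, which real sequencing does not guarantee. The names `BASE_TO_BITS` and `decode_bases` are hypothetical.

```python
# Reverse of the mapping quoted in the interview: A = 00, C = 01, G = 10, T = 11.
# The names below are illustrative, not taken from any published tool.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_bases(sequence: str) -> bytes:
    """Turn a string of DNA letters back into the bytes it encodes."""
    # Each letter contributes two bits; an unknown letter raises KeyError.
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    if len(bits) % 8 != 0:
        raise ValueError("sequence length must be a multiple of 4 letters")
    # Regroup the bits into bytes, eight at a time.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    print(decode_bases("CAGACGGC"))  # prints b'Hi'
```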
ZOMORODI: OK. So this is something that you need highly specialized computers and machines to do right now, right?
ZIELINSKI: Yes. And even though I have DNA in my fridge, I cannot do much with it.
ZOMORODI: OK. Got it. So just being at home, it's not like you're going to store your data in the fridge like you store data on your desktop, on your laptop, or something. What we're talking about is really putting critical pieces of humanity's collective knowledge in a safe place.
ZIELINSKI: Yes. So DNA storage is mostly useful for archival or cold data. One of my favorite examples is here in France, they stored the Rights of Man in DNA. And they actually keep it in the French National Archives here.
ZIELINSKI: And you can make copies very easily and very cheaply, but it is currently stored safely in a vacuum tube in the archives here.
ZOMORODI: So why aren't more people doing this now?
ZIELINSKI: I would say in the next decade or so, the synthesis costs will drop down sufficiently in order to make this a scalable and affordable alternative. But for the time being, it's simply too expensive. But if we can store critical archival data, that's really the ideal solution for DNA, at least in the next, like, decade or two.
ZOMORODI: How expensive is it right now?
ZIELINSKI: I mean, it's insanely expensive. It's kind of a moving target. But it - we're talking on the order of, you know, thousands to millions or billions of dollars just to store even text documents. To give you a concrete example - so in our study back in 2017, we stored two megabytes of data. So we basically compressed a few files, including an Amazon gift card, one of the first movies ever made and an operating system. And so all that totaled to two megabytes. And it cost about $7,000 just to synthesize that. So it's still way too expensive.
But this is where scientists are, once again, turning to nature. So instead of using this very costly chemical synthesis process, there are actually enzymes that exist in nature that synthesize DNA. And so there are quite a few groups and companies that are specifically working on improving that method of synthesis.
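(Editor's illustration, not part of the broadcast.) To put the 2017 figures quoted above in perspective, the short calculation below scales them up. It is a rough sketch only: it uses the rounded numbers from the interview (about $7,000 for two megabytes), assumes 1,000 megabytes per gigabyte, and assumes cost grows in direct proportion to size, which may not hold in practice. The function name `cost_per_megabyte` is hypothetical.

```python
# Rounded figures quoted in the interview for the 2017 study.
SYNTHESIS_COST_USD = 7_000
DATA_SIZE_MB = 2


def cost_per_megabyte(total_cost_usd: float, size_mb: float) -> float:
    """Average synthesis cost per megabyte, assuming cost scales linearly."""
    return total_cost_usd / size_mb


if __name__ == "__main__":
    per_mb = cost_per_megabyte(SYNTHESIS_COST_USD, DATA_SIZE_MB)
    per_gb = per_mb * 1_000  # assumes 1 GB = 1,000 MB
    print(f"about ${per_mb:,.0f} per megabyte")  # about $3,500 per megabyte
    print(f"about ${per_gb:,.0f} per gigabyte")  # about $3,500,000 per gigabyte
```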
ZOMORODI: It's interesting, though, thinking about the scenario where we would need to retrieve the Rights of Man from the French archive. I mean, if we are at a point where that's the only copy we have left, the one that's on the DNA, don't we have far bigger problems as the human race? I can't help but go to a very sort of nihilist point in these apocalyptic scenarios, where I'm, like, who cares, right?
ZIELINSKI: Yeah. Especially after the pandemic, with global warming, I think, yeah, we're kind of preparing for the worst here.
ZOMORODI: But do you feel like - is that sort of a defeatist - like, are you thinking about these scenarios all the time as someone who's considering how we might save data for all eternity?
ZIELINSKI: I try not to, but inevitably, yes, I do, especially just when it comes to thinking about all of those documents that are critical to humanity. But I think it's something that we all need to come together on, on a global level, at least to some extent, to decide what is important and how and where we might back these documents up.
(SOUNDBITE OF MUSIC)
ZIELINSKI: The cool thing is, it sounds like science fiction. It sounds crazy and complicated, and it really isn't. It's actually very straightforward. It's an elegant, ideal solution with shortcomings that are being addressed, you know, by scientists. So I'm very optimistic that it will become a viable solution, at least, as I mentioned, for critical, archival data within the next decade or so.
ZOMORODI: Dina Zielinski is a molecular biologist and a bioinformatician. You can see her full talk at ted.com. On the show today, for all eternity. I'm Manoush Zomorodi, and you're listening to the TED Radio Hour from NPR. We'll be right back.