DNA storage could preserve data for millions of years

In the search for ways to store data permanently, ETH researchers have been inspired by fossils (Photo: Philipp Stössel/ETH Zurich)

Taking inspiration from the way fossilized bones can preserve genetic material for hundreds of thousands of years, researchers at ETH Zurich have developed a "synthetic fossil" by writing digital information on DNA and then encapsulating it in a protective layer of glass.

Most of our digital data is stored with technology that is designed to work in the short term, but which can’t really stand the test of time. Standard hard disk drives won’t last more than a few decades and are subject to damage from high temperatures, moisture, magnetic fields and mechanical failures. Even solid state drives, which perform better and are less susceptible to mechanical issues, will lose their data if they go unpowered for more than a few months.

One interesting solution could be to store digital data using strands of DNA. As far-fetched as this may sound, there are a couple of very good reasons why this is an attractive proposition. Firstly, DNA stores information at a density that is hard to fathom: a single living cell can contain millions of nucleobases, each of which can represent at least one bit of information, for a data density approaching one petabyte (a million gigabytes) per cubic millimeter. Add to this the fact that, under the right conditions, fossils can preserve genetic material for extremely long periods, and you have the perfect candidate for long-term data storage. This is exactly what Dr. Robert Grass and his team at ETH Zurich are trying to achieve.

As you’ll remember from high school biology, DNA is built from four nucleobases, meaning that each of them can, in theory, represent up to two bits of information. Once the technical constraints of synthesizing and sequencing nucleotides are accounted for, and redundant bits (which make up 35 percent of total data) are added to protect against data corruption, the final rate is an impressive 1.2 bits of useful data per nucleotide.
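
To make the two-bits-per-nucleobase ceiling concrete, here is a minimal Python sketch that packs each byte into four nucleotides. The bit-to-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration. The article does not say how the ETH team maps bits to bases, and their actual scheme adds constraints and redundancy that lower the rate to the 1.2 bits per nucleotide quoted above.

```python
# Hypothetical mapping: index 0..3 (two bits) -> nucleobase letter.
# The real encoding used in the study is not described in the article.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, two bits per nucleotide."""
    strand = []
    for byte in data:
        # Read the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Reverse encode(): every four nucleotides become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"ETH"
strand = encode(message)
print(strand, decode(strand) == message)
```

At this theoretical maximum, every byte costs exactly four nucleotides. Any error-correction data or synthesis constraint adds nucleotides on top of that.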

Dr. Grass and his team began their experiment by storing 83 kilobytes of information (Switzerland’s Federal Charter of 1291 and Archimedes’ The Method of Mechanical Theorems) inside 4,991 DNA segments, each 158 nucleotides long. Then, to protect the DNA from degrading over time, the researchers created a de facto "synthetic fossil" by encapsulating it in 150-nanometer silica spheres, which prevent the genetic material from chemically reacting with the environment. To read the data back, the nanospheres are exposed to a fluoride solution, which dissolves the silica but leaves the DNA intact.

Digital systems designed to store data for the very long term (from high-density crystals to rugged tungsten discs) usually aim for very high levels of heat resistance. The reason for this is that the generally accepted way to estimate long-term durability and data retention in the lab is to subject the storage medium to high levels of heat. Encapsulating DNA in silica (glass) is specifically meant to provide that level of protection.

In this case, the researchers simulated the degradation of the DNA by exposing it to temperatures between 60 and 70 degrees Celsius (140–160 °F) for up to a month, which replicated the chemical degradation that would have taken place over thousands of years.
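
Accelerated aging of this kind is commonly reasoned about with the Arrhenius equation, in which chemical reaction rates rise exponentially with temperature. The Python sketch below estimates how many times faster decay would run at the test temperature than at the storage temperature. The article does not say which model the researchers used. The activation energy in the example is a placeholder rather than a figure from the study, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def acceleration_factor(hot_c: float, cold_c: float,
                        activation_energy: float) -> float:
    """Ratio of Arrhenius rates k(hot) / k(cold).

    Temperatures are in degrees Celsius, activation energy in J/mol.
    """
    hot_k = hot_c + 273.15
    cold_k = cold_c + 273.15
    exponent = activation_energy / GAS_CONSTANT * (1.0 / cold_k - 1.0 / hot_k)
    return math.exp(exponent)


# ASSUMPTION: placeholder activation energy, not a value from the study.
ASSUMED_EA = 150_000.0  # J/mol

factor = acceleration_factor(70.0, 4.0, ASSUMED_EA)
print(f"One week at 70 C is roughly {factor / 52:.0f} years at 4 C "
      f"(under the assumed activation energy)")
```

Because the result depends exponentially on the activation energy, a small change in that assumption shifts the simulated time span by orders of magnitude, which is why the measured decay kinetics matter so much for estimates like these.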

Current technology introduces many errors both when writing data to DNA and when reading it back, but the redundant bits written alongside the original data proved their worth here.
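
The study relied on Reed-Solomon coding for its redundancy. As a much simpler stand-in, the Python sketch below uses a single XOR parity segment to show the basic idea of rebuilding a lost sequence from the surviving ones. This is not Reed-Solomon and can recover only one missing segment. The function names and sample data are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length segments."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(segments: List[bytes]) -> List[bytes]:
    """Append one parity segment: the XOR of all data segments."""
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def recover(stored: List[Optional[bytes]]) -> List[bytes]:
    """Return the data segments, rebuilding at most one lost entry (None)."""
    missing = [i for i, seg in enumerate(stored) if seg is None]
    if len(missing) > 1:
        raise ValueError("simple parity can rebuild only one lost segment")
    repaired = list(stored)
    if missing:
        survivors = [seg for seg in stored if seg is not None]
        # XOR of all survivors (data + parity) equals the missing segment.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired[:-1]  # drop the parity segment


segments = [b"ETHZ", b"1291", b"DATA"]  # equal-length sample segments
stored = add_parity(segments)
stored[1] = None  # simulate a sequence that was completely lost
print(recover(stored) == segments)
```

Reed-Solomon codes generalize this idea: they add several redundant symbols so that multiple lost or corrupted symbols can be corrected, not just one.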

"After storing the DNA for a simulated 10,000 years in the fridge at 4 °C [40 °F], about 80 percent of the sequences contain at least one error and about 8 percent of the sequences are completely lost," Grass told Gizmag. "Still, due to the smart redundancy we have added by the Reed-Solomon coding, we are able to decode the data without final error."

The scientists calculated that if the same data had been stored at even lower temperatures, such as the -18 °C (0 °F) found inside the Svalbard Global Seed Vault, it would have survived for over a million years.

Although the cost of manipulating DNA currently makes this impractical for everyday use, advances in DNA sequencing are lowering the cost of reading stored data, and more research is going into reducing the cost of writing digital information onto genetic material.

"We are currently looking into decreasing the cost of writing information into DNA (currently at scientific level at 500 USD/MB) and into first commercial applications in storing highly valuable information," said Grass. "We’re also gathering more data on the thermal stability to gain a more precise understanding on how DNA decays chemically and how this can be further avoided."

The research is described in the latest issue of the journal Angewandte Chemie.

Source: ETH Zurich

1 comment
Vincent Singleton
Nice but will they have a way to read the data after millions of years?