
Harvard system could store data in organic molecules for millennia

Harvard researchers have developed a new system for reading and writing data to organic molecules

They don't call this the Information Age for nothing: nowadays we can access the entirety of humanity's collective knowledge from small computers in our pockets. But all that data has to be stored somewhere, and servers take up heaps of physical space and consume huge amounts of energy. Now, researchers at Harvard have developed a new system for reading and writing information with organic molecules, which could potentially remain stable and secure for thousands of years.

DNA is the medium of choice for information storage in the natural world, for good reason: it can store huge amounts of data in a tiny space, and is extremely stable, surviving for millennia under the right conditions. Recent studies have explored DNA as a data storage medium, cramming DNA data onto the tips of pencils and into cans of spray paint, and even encoding it in living bacteria.

But DNA has its own hurdles. As far as molecules go it's relatively large, and reading and writing it can be a fiddly and time-intensive process.

"We set out to explore a strategy that does not borrow directly from biology," says Brian Cafferty, first author of the new study. "We instead relied on techniques common in organic and analytical chemistry, and developed an approach that uses small, low molecular weight molecules to encode information."

Instead of DNA, the researchers used oligopeptides, small molecules made up of varying numbers of amino acids. The base of the process is a microplate, a metal plate containing 384 tiny wells. Different combinations of oligopeptides are placed into each well to represent one byte of information.

It's built on the binary system: if a particular oligopeptide is present, it reads as a 1, and if it's absent, that's a 0. Using that, the code in each well can represent a single letter, or one pixel of an image.
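As a rough illustration of the presence/absence scheme, the sketch below maps one byte to a set of oligopeptides and back. It assumes eight distinct oligopeptides per well, one per bit position; the labels `P0` to `P7` and the function names are hypothetical and are not taken from the study.

```python
# Hypothetical labels for eight distinct oligopeptides, one per bit position.
OLIGOPEPTIDES = ["P0", "P1", "P2", "P3", "P4", "P5", "P6", "P7"]


def encode_byte(value):
    """Return the set of oligopeptides to place in a well for one byte."""
    if not 0 <= value <= 255:
        raise ValueError("a well holds exactly one byte (0-255)")
    # An oligopeptide is present when its bit is 1, absent when it is 0.
    return {name for bit, name in enumerate(OLIGOPEPTIDES) if (value >> bit) & 1}


def decode_well(present):
    """Rebuild the byte from the set of oligopeptides found in a well."""
    return sum(1 << bit for bit, name in enumerate(OLIGOPEPTIDES) if name in present)


def encode_text(text):
    """Encode ASCII text as a list of wells, one character per well."""
    return [encode_byte(b) for b in text.encode("ascii")]


wells = encode_text("Hi")
decoded = bytes(decode_well(w) for w in wells).decode("ascii")
print(decoded)
```

Here the letter "A" (byte value 65, binary 01000001) would be written by adding only `P0` and `P6` to the well and leaving the other six out.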

The key to recognizing which oligopeptides are present and which aren't is their mass, which can be read using a mass spectrometer. Ultimately, that's how the information can be retrieved again.
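The readout step can be sketched in the same spirit: each oligopeptide has a known mass, so a well is decoded by checking whether a peak appears near each expected mass. The masses, the tolerance and the peak-list format below are made-up placeholders for illustration, not values from the study.

```python
# Hypothetical (label, expected mass in daltons) pairs, one per bit position.
EXPECTED_MASSES = [(f"P{i}", 500.0 + 50.0 * i) for i in range(8)]

# Hypothetical matching window: how far a peak may sit from the expected mass.
TOLERANCE = 0.5


def read_well(peaks):
    """Turn a list of detected peak masses into the byte stored in a well."""
    value = 0
    for bit, (_label, mass) in enumerate(EXPECTED_MASSES):
        # A peak close to the expected mass means "present", which reads as 1.
        if any(abs(peak - mass) <= TOLERANCE for peak in peaks):
            value |= 1 << bit
    return value


# Peaks near the masses assigned to P0 and P6 decode to byte 65, the letter "A".
print(chr(read_well([500.2, 800.1])))
```

A real instrument would also have to deal with noise, overlapping peaks and signal intensity, which this sketch ignores.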

A diagram demonstrating how the new system works

In their tests, the researchers managed to write, store and read back about 400 kilobits of data, including a written transcript of a lecture, a photo and a painting. According to the team, data is written at an average of eight bits per second and read at 20 bits per second, with an accuracy of 99.9 percent.
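To put those rates in perspective, the short calculation below estimates how long one fully used 384-well plate would take to write and read at one byte per well. It naively assumes the quoted average rates apply uniformly from start to finish, which the article does not state; the function name is a hypothetical helper.

```python
WELLS_PER_PLATE = 384      # from the article: 384 wells per microplate
BITS_PER_WELL = 8          # from the article: one byte per well
WRITE_BITS_PER_SEC = 8     # quoted average writing rate
READ_BITS_PER_SEC = 20     # quoted reading rate


def transfer_seconds(num_bytes, bits_per_second):
    """Time to move num_bytes at a constant rate, in seconds."""
    return num_bytes * 8 / bits_per_second


plate_bytes = WELLS_PER_PLATE * BITS_PER_WELL // 8
write_s = transfer_seconds(plate_bytes, WRITE_BITS_PER_SEC)
read_s = transfer_seconds(plate_bytes, READ_BITS_PER_SEC)
print(f"write: {write_s / 60:.1f} min, read: {read_s / 60:.1f} min")
```

Under these assumptions a single plate holds 384 bytes and takes a little over six minutes to write and about two and a half minutes to read.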

The team says there are several advantages to the new system. Oligopeptides can be stable for hundreds or thousands of years, which would make them ideal for long-term archival data storage. The system can also cram more data into a smaller physical space, potentially even smaller than DNA requires. The team says that the entire contents of the New York Public Library, for example, could be stored in a teaspoonful of protein.

The system can work with a wide range of molecules as well. It can also write data faster than DNA-based methods, although the researchers do admit that it can be a little slow to read. Either way, both writing and reading speeds could be improved in future with better technology, such as inkjet printers to write data and better mass spectrometers to read it.

The research was published in the journal ACS Central Science.

Source: Harvard University

2 comments
usugo
The statement that "Oligopeptides can be stable for hundreds or thousands of years" is trivially flawed. There is a reason why, in nature, information is stored in DNA and not in proteins or oligopeptides.
noteugene
I think your reasoning is flawed. Not only that, you don't state what these reasons are. Neither do you explain the longevity that you dispute; you just say it is so.