February 14, 2022

Harvard’s New Data Storage Is to Dye For, Avoids DNA Storage Pitfalls

The explosion in data collection has made storing enormous amounts of data a challenge. This is particularly true for archival data, since many popular storage media, such as optical disks, have relatively short lifespans in the grand scheme of things. Researchers are exploring myriad ways to resolve this problem, ranging from DNA-based data storage to Microsoft’s quartz-based Project Silica. Now, a team of Harvard researchers is introducing a new contender for long-term data storage: dye.

How It Works

Essentially, the method works as follows: a specialized inkjet is used to deposit a mixture of differently colored, commercially available dyes onto an epoxy base. These dye colors and combinations are thus able to serve as code for characters, where each dye’s presence constitutes a “1” (as opposed to its absence, a “0”). The deposited dyes can then be read by a fluorescence microscope. 

An illustration of how the presence and absence of different dyes can be used to encode digital information. Image courtesy of the researchers.
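To make the presence/absence scheme concrete, here is a minimal Python sketch of how a character could map to a mixture of dyes and back. It is an illustration under stated assumptions, not the researchers' actual code: the article does not say how many dyes are used or what they are called, so the seven labels in `DYES` (one per bit of a 7-bit ASCII code) and the helper names `encode_char` and `decode_spot` are hypothetical.

```python
# Hypothetical dye labels: one dye per bit of a 7-bit ASCII code.
# The article does not name the dyes or state how many there are.
DYES = ("dye_0", "dye_1", "dye_2", "dye_3", "dye_4", "dye_5", "dye_6")


def encode_char(ch):
    """Return the set of dyes to deposit in one spot for one character."""
    code = ord(ch)
    if code >= 2 ** len(DYES):
        raise ValueError(f"{ch!r} does not fit in {len(DYES)} bits")
    # A dye is present in the spot when its bit is 1, absent when it is 0.
    return frozenset(dye for bit, dye in enumerate(DYES) if (code >> bit) & 1)


def decode_spot(spot):
    """Recover the character from the set of dyes detected in one spot."""
    code = sum(1 << bit for bit, dye in enumerate(DYES) if dye in spot)
    return chr(code)


def encode_text(text):
    """Encode a string as a list of spots, one spot per character."""
    return [encode_char(ch) for ch in text]


def decode_text(spots):
    """Decode a list of spots back into a string."""
    return "".join(decode_spot(spot) for spot in spots)


if __name__ == "__main__":
    spots = encode_text("Faraday")
    print(sorted(spots[0]))    # dyes deposited for "F"
    print(decode_text(spots))  # round trip back to the original text
```

In this sketch, reading a spot only requires knowing which dyes are present, which mirrors the article's point that decoding needs a fluorescence measurement rather than sequencing.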

The dye-based storage is a form of molecular storage (like DNA storage), which offers stability over thousands of years and exceptional information density without any associated power draw. But unlike DNA storage or similar molecular storage methods, this dye-based data storage does not require any complicated molecular synthesis to encode—and does not require any complicated sequencing to decode. 

Of course, the dye-based storage is much denser than, say, depositing drops of dye with an eyedropper. The researchers were able to write around 14KB of information on a 7.2mm-by-7.2mm area (a little smaller than a pea), a density of 271.5 bytes per square millimeter. They wrote that information at a rate of 58KB per second and, perhaps more importantly, were able to read it quickly as well. Moreover, the data was read more than 1,000 times without a significant loss in signal intensity.
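The density figure can be sanity-checked with a few lines of Python. This sketch assumes the written region is a square measuring 7.2 mm on each side; the function names are illustrative only. Under that assumption, the quoted 271.5 bytes per square millimeter implies a payload of roughly 14 KB, which matches the article. Reading the area as 7.2 square millimeters instead would imply a far higher density.

```python
SIDE_MM = 7.2                 # assumed side length of the square written area
DENSITY_B_PER_MM2 = 271.5     # density quoted in the article
PAYLOAD_BYTES = 14_000        # "around 14KB", taken here as 14,000 bytes


def square_area_mm2(side_mm):
    """Area of a square region, in square millimeters."""
    return side_mm ** 2


def implied_payload_bytes(density, side_mm):
    """Bytes implied by a given areal density over a square region."""
    return density * square_area_mm2(side_mm)


if __name__ == "__main__":
    area = square_area_mm2(SIDE_MM)
    print(f"area: {area:.2f} mm^2")
    print(f"implied payload: {implied_payload_bytes(DENSITY_B_PER_MM2, SIDE_MM):.0f} bytes")
    # Alternative reading: the whole area is only 7.2 mm^2.
    print(f"density if area were 7.2 mm^2: {PAYLOAD_BYTES / 7.2:.0f} bytes/mm^2")
```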

“This approach enables information storage with high density, fast read/write speeds, and multiple reads of a single set of molecules without loss of information, all at an acceptable cost,” the researchers wrote.

“The beauty of it is its simplicity,” said Robert Grass, a chemical engineer at ETH Zürich, in an interview with Chemical & Engineering News. “Our world needs a lot of data. It is important that we keep searching for new technologies with unique data-carrying abilities, as there is no one-size-fits-all solution to data storage.”

The researchers further developed this technology to store non-ASCII data, successfully converting a 3KB .jpg image of Michael Faraday into a string and printing that string via dye. However, the researchers said, “the quality of recovered data is much more sensitive to errors than when it is in a lossless image encoding format.” 

The researchers are also commercializing this technology through a startup called Datacule, which, according to the Harvard Crimson, is working on developing an end-to-end prototype capable of both printing and reading dye-encoded data. 

“We’ve passed the first hurdle, which is developing a technology that works, and there’s no question it works, that it has certain advantages,” said George Whitesides, the Harvard chemist whose lab developed the method, in an interview with the Crimson. “The second hurdle is, does anybody care? We still have to answer that, and the company will do that.”

What’s New with DNA Data Storage

DNA data storage, of course, has a much longer history, stretching back through decades of research and (so far unsuccessful) attempts to scale it for commercialization. It’s been a particularly busy few years for the technology, though: in 2020, researchers at the University of Texas at Austin encoded a book in DNA and recovered it successfully despite the errors common to DNA storage; last April, Los Alamos National Laboratory developed a binary-to-DNA translator; and just a few months ago, a team at the Georgia Tech Research Institute announced the development of a microchip that could quickly and cheaply grow DNA strands for high-density data storage.

The DNA Data Storage Alliance, meanwhile, has been working since 2020 to advance DNA-based storage, with powerful members like Microsoft, Western Digital, Illumina, and Twist Bioscience. Just last month, the alliance admitted a new member: eureKARE, an investment company focused on next-generation biotech companies in synthetic biology and microbiome sciences.

“It is clear to us that the storing of digital data is a major challenge for our generation and one that we hope to address by investing in DNA data storage approaches,” said Kristin Thompson, chief business officer of eureKARE, when the company joined the alliance. “DNA is a wonderful, eco-friendly solution to this problem due to its extremely dense nature. The market demand for a sustainable, low-cost approach, such as DNA data storage is anticipated to grow exponentially in the next few years and this technology truly has the ability to revolutionize our lives.”
