Research Overcomes Key Obstacles to Scaling Up DNA Data Storage


Image credit: DataBase Center for Life Science. Shared under a Creative Commons license.

Researchers from North Carolina State University have developed new techniques for labeling and retrieving data files in DNA-based information storage systems, addressing two of the key obstacles to widespread adoption of DNA data storage technologies.

“DNA systems are attractive because of their potential information storage density; they could theoretically store a billion times the amount of data stored in a conventional electronic device of comparable size,” says James Tuck, co-corresponding author of a paper on the work and an associate professor of electrical and computer engineering at NC State.

“But two of the big challenges here are, how do you identify the strands of DNA that contain the file you are looking for? And once you identify those strands, how do you remove them so that they can be read – and do so without destroying the strands?”

“Previous work had come up with a system that appends short, 20-monomer long sequences of DNA called primer-binding sequences to the ends of DNA strands that are storing information,” says Albert Keung, co-corresponding author of the paper and an assistant professor of chemical and biomolecular engineering at NC State. “You could use a small DNA primer that matches the corresponding primer-binding sequence to identify the appropriate strands that comprise your desired file. However, there are only an estimated 30,000 of these binding sequences available, which is insufficient for practical use. We wanted to find a way to overcome this limitation.”

To address these challenges, the researchers developed two techniques that, taken together, they call DNA Enrichment and Nested Separation, or DENSe.

The researchers tackled the file identification challenge by using two nested primer-binding sequences. The system first identifies all of the strands containing the initial binding sequence. It then conducts a second “search” of that subset of strands to single out those strands that contain the second binding sequence.

“This increases the number of estimated file names from approximately 30,000 to approximately 900 million,” Tuck says.
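
The arithmetic behind that figure is the square of the single-sequence address space, and the two-stage search can be sketched in a few lines of Python. This is a minimal illustration rather than the researchers' implementation: the `Strand` record, the `nested_lookup` function, and the short sequence strings are hypothetical names and toy data.

```python
from dataclasses import dataclass

# Estimated number of usable primer-binding sequences, as quoted in the article.
SINGLE_PRIMER_ADDRESSES = 30_000

# With two nested sequences, each outer address can pair with each inner one.
NESTED_ADDRESSES = SINGLE_PRIMER_ADDRESSES ** 2  # 900,000,000


@dataclass(frozen=True)
class Strand:
    """Hypothetical model of one stored DNA strand."""
    outer: str    # first (outer) primer-binding sequence
    inner: str    # second (nested) primer-binding sequence
    payload: str  # data-carrying region


def nested_lookup(pool, outer, inner):
    """Select strands in two passes, mirroring the two-stage search."""
    # Pass 1: keep every strand carrying the outer binding sequence.
    first_pass = [s for s in pool if s.outer == outer]
    # Pass 2: search only that subset for the inner binding sequence.
    return [s for s in first_pass if s.inner == inner]


# Toy pool: real binding sequences are about 20 monomers long.
pool = [
    Strand("ACGT", "TTAA", "file-A chunk 0"),
    Strand("ACGT", "GGCC", "file-B chunk 0"),
    Strand("CATG", "TTAA", "file-C chunk 0"),
    Strand("ACGT", "TTAA", "file-A chunk 1"),
]

file_a = nested_lookup(pool, "ACGT", "TTAA")
print(NESTED_ADDRESSES)
print([s.payload for s in file_a])
```

Strands that share only one of the two sequences with the requested file are filtered out in one pass or the other, which is what lets the same 30,000 sequences be reused across many more file names.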

Once identified, the file still needs to be extracted. Existing techniques use polymerase chain reaction (PCR) to make lots (and lots) of copies of the relevant DNA strands, then sequence the entire sample. Because there are so many copies of the targeted DNA strands, their signal overwhelms the rest of the strands in the sample, making it possible to identify the targeted DNA sequence and read the file.
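
To see why amplification makes the target stand out, and why that gets harder as the pool grows, consider an idealized model in which every PCR cycle exactly doubles the targeted strands and leaves everything else untouched. Real reactions are less efficient than this, and the function name and strand counts below are illustrative assumptions, not figures from the paper.

```python
def target_fraction(target, background, cycles):
    """Share of the sample that is target DNA after idealized PCR.

    Assumes perfect doubling of target strands on every cycle and no
    amplification of background strands (a deliberate simplification).
    """
    amplified = target * 2 ** cycles
    return amplified / (amplified + background)


# Hypothetical pools: the same 1,000 target strands hidden in a small
# database and in one a million times larger.
for background in (10**6, 10**12):
    for cycles in (0, 10, 20, 30):
        share = target_fraction(1_000, background, cycles)
        print(f"background={background:.0e} cycles={cycles:2d} share={share:.4f}")
```

In this toy model, ten cycles are enough for the target to make up about half of the smaller pool, while the larger pool needs roughly thirty cycles to reach the same share.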

“That technique is not efficient, and it doesn’t work if you are trying to retrieve data from a high-capacity database – there’s just too much other DNA in the system,” says Kyle Tomek, a Ph.D. student at NC State and co-lead author of the paper.

So the researchers took a different approach to data retrieval, attaching any of several small molecular tags to the primers being used to identify targeted DNA strands. When the primer finds the targeted DNA, PCR is used to make a copy of the relevant DNA, and that copy is attached to the molecular tag.

The researchers also made use of magnetic microbeads coated with molecules that bind specifically to a given tag. These functionalized microbeads “grab” the tags of targeted DNA strands. The microbeads can then be retrieved with a magnet, bringing the targeted DNA with them.

“This system allows us to retrieve the DNA strands associated with a specific file without having to make many copies of each strand, while also preserving the original DNA strands in the database,” Keung says.

“We’ve implemented the DENSe system experimentally using sample files, and have demonstrated that it can be used to store and retrieve text and image files,” Keung adds.

“These techniques, when used in tandem, open the door to developing DNA-based data storage systems with modern capacities and file-access capabilities,” Tomek says.

“Next steps include scaling this up and testing the DENSe approach with larger databases,” Tuck says. “A big challenge there is cost.”

The paper, “Driving the Scalability of DNA-Based Information Storage Systems,” is published in the journal ACS Synthetic Biology. Co-lead author of the paper is Kevin Volkel, a Ph.D. student at NC State. The paper was co-authored by Alexander Simpson, a former graduate student at NC State; and Austin Hass and Elaine Indermaur, both undergraduates at NC State.

The work was done with support from the National Science Foundation under grant number 1650148.

About the Author: livescience