Data Storage Is Becoming A Massive Problem, DNA May Be The Answer

2018 October 17

Data, as many have noted, has become the new oil, meaning that we no longer regard the information we store as merely a cost of doing business, but a valuable asset and a potential source of competitive advantage. It has become the fuel that powers advanced technologies such as machine learning.

A problem that’s emerging, however, is that our ability to produce data is outstripping our ability to store it. In fact, an article in the journal Nature predicts that by 2040, data storage would consume 10–100 times the expected supply of microchip-grade silicon, using current technology. Clearly, we need a data storage breakthrough.

One potential solution is DNA, which is a million times more information dense than today’s flash drives. It is also more stable, more secure and uses minimal energy. The problem is that it is currently prohibitively expensive. However, a startup that has emerged out of MIT, called Catalog, may have found the breakthrough we’re looking for: low-cost DNA storage.

The Makings Of A Scientist-Entrepreneur

Growing up in his native Korea, Hyunjun Park never planned on a career in business, much less the technology business, but expected to become a biologist. He graduated with honors from Seoul National University and then went on to earn a PhD from the University of Wisconsin. Later he joined Tim Lu’s lab at MIT, which specializes in synthetic biology.

In an earlier time, he would have followed an established career path, from PhD to post-doc to assistant professor to tenure. These days, however, there is a growing trend for graduate students to get an entrepreneurial education in parallel with the traditional scientific curriculum. Park, for example, participated in both the Wisconsin Entrepreneurial Bootcamp and Start MIT.

He also met a kindred spirit in Nate Roquet, a PhD candidate who, about to finish his thesis, had started thinking about what to do next. Inspired by a talk given by the Chief Science Officer at a seed fund, IndieBio, the two began to talk in earnest about starting a company together based on their work in synthetic biology.

As they batted around ideas, the subject of DNA storage came up. By this time, the advantages of the technology were well known, but it was not considered practical, costing hundreds of thousands of dollars to store just a few hundred megabytes of data. However, the two did some back-of-the-envelope calculations and became convinced they could do it far more cheaply.

Moving From Idea To Product

The basic concept of DNA storage is simple. Essentially, you just encode the ones and zeros of digital code into the T, G, A and C’s of genetic code. However, stringing those genetic molecules together is tedious and expensive. The idea that Park and Roquet came up with was to use enzymes to alter strands of DNA, rather than building them up piece by piece.
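To make the encoding idea concrete, here is a minimal sketch in Python. It assumes the simplest textbook mapping of two bits per base (00→A, 01→C, 10→G, 11→T). That mapping and the function names are illustrative assumptions, not Catalog's scheme, and real systems add error correction and avoid sequences that are hard to synthesize or read.

```python
# Illustrative two-bits-per-base mapping (an assumption, not Catalog's method).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round-trips to the original bytes
```

Under this mapping every byte becomes four bases, which is why the expense lies in physically assembling the strand rather than in the arithmetic of the encoding.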

Contrary to popular opinion, most traditional venture capital firms, such as those that populate Sand Hill Road in Silicon Valley, don’t invest in ideas. They invest in products. IndieBio, however, isn’t your typical investor. They give only a small amount of seed capital, but offer other services, such as wet labs, entrepreneurial training and scientific mentorship. Park and Roquet reached out to them and found some interest.

“We invest in problems, not necessarily solutions,” Arvind Gupta, founder of IndieBio, told me. “Here the problem is massive. How do you keep the world’s knowledge safe? We know DNA can last thousands of years and can be replicated very inexpensively. That’s a really big deal and Hyunjun and Nate’s approach was incredibly exciting.”

Once the pair entered IndieBio’s four-month program, they found both promise and disappointment. Their approach could dramatically reduce the cost of storing information in DNA, but not nearly quickly enough to build a commercially viable product. They would need to pivot if they were going to turn their idea into an actual business.

Scaling To Market

One flaw in Catalog’s approach was that the process was too complex to scale. Yet they found that by starting with just a few different DNA strands and attaching them together, much like a printing press pre-arranges words in a book, they could come up with something that was not only scalable, but commercially viable from a cost perspective.

The second problem was thornier. Working with enzymes is incredibly labor intensive and, being biologists, Park and Roquet didn’t have the mechanical engineering expertise to make their process feasible. Fortunately, an advisor, Darren Link, connected the pair to Cambridge Consultants, an innovation consultancy that could help them.

“We started looking at the problem and it seemed that, on paper at least, we could make it work,” Richard Hammond, Technology Director and Head of Synthetic Biology at Cambridge Consultants, told me. “Now we’re about halfway through making the first prototype and we believe we can make it work and scale it significantly. We’re increasingly confident that we can solve the core technical challenges.”

The first prototype machine, which is on schedule to be completed by early 2019, will be able to encode a full terabit per day. By 2022, the company expects to increase that to a petabit per day, a 1,000-fold improvement. That should make the process competitive for archival storage, such as medical and legal records, as well as film databases at movie studios.
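To put those throughput figures in more familiar units, here is a rough calculation in Python. It assumes decimal (SI) prefixes, so a terabit is 10^12 bits and a petabit is 10^15 bits. The helper names and the 1 TB archive are illustrative assumptions, not figures from Catalog.

```python
# Decimal (SI) prefixes are assumed here: 1 terabit = 10**12 bits.
TERABIT = 10**12
PETABIT = 10**15


def bytes_per_day(bits_per_day: int) -> int:
    """Convert a daily throughput in bits to bytes (8 bits per byte)."""
    return bits_per_day // 8


def days_to_encode(total_bytes: int, bits_per_day: int) -> float:
    """Days needed to write an archive of total_bytes at a given daily rate."""
    return total_bytes * 8 / bits_per_day


if __name__ == "__main__":
    one_terabyte = 10**12  # hypothetical archive size, in bytes
    print(PETABIT // TERABIT)                     # ratio between the two targets
    print(bytes_per_day(TERABIT) / 10**9, "GB per day at the prototype rate")
    print(days_to_encode(one_terabyte, TERABIT), "days for 1 TB at a terabit per day")
```

At a terabit per day the prototype would write about 125 gigabytes daily, so a single terabyte would take about eight days. At a petabit per day the same terabyte would take under 12 minutes.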

“I think the fact that we’re inventing a completely new medium for data storage is really exciting,” Park told me. “I don’t think that we know yet what the true potential is because the biggest use cases probably don’t exist yet. What I do know is that our demand for data storage will soon outstrip our supply and we are thrilled about the possibility of solving that problem.”

Going Beyond Digital

A generation ago, the task of improving data storage would have been seen as solely a computer science problem. Yet today, the digital era is ending and we’re going to have to look further and wider for solutions to the problems we face. With the vast improvement in genomics, which is far outpacing Moore’s law these days, we can expect biology to increasingly play a role.

“Traditionally, information technology has been strictly the realm of electrical engineers, physicists and coders,” Gupta of IndieBio told me. “What we’re increasingly finding is that biology, which has been honed for millions of years by evolution, can often point the way to solutions that are more robust and, potentially, much cheaper and more efficient.”

Yet this phenomenon goes far beyond biology. We’re also seeing similar accelerations in other fields, such as materials science and space-related technologies. We’re also seeing a new breed of investors, like IndieBio, that focus specifically on scientist entrepreneurs. “I consider myself a product of the growing ecosystem for scientific entrepreneurs at universities and in the investor community,” Park told me.

Make no mistake. We are entering a new era of innovation and the traditional Silicon Valley approach will not get us where we need to go. Instead, we need to forge greater collaboration between the scientific community, the investor community and government agencies to solve problems that are increasingly complex and interdisciplinary.

– Greg

An earlier version of this article first appeared in Inc.com

Image: Pixabay
