Charting The Blockchain of DNA


David Koepsell has been examining the future of genomic data for nearly a decade. An academic philosopher, entrepreneur, and retired attorney, Koepsell keeps his fingers in the world of science, technology, ethics, and public policy.

He is also the author of Who Owns You, now in its second edition, a widely acclaimed look at the philosophical and legal problems associated with patenting human genes.

Feature Interview With David Koepsell, CEO of EncrypGen

Koepsell and his long-time collaborator and partner, Dr. Vanessa Gonzalez, launched a blockchain-centric startup called EncrypGen, which provides next-generation software for genomic data. In the interview below, Koepsell elaborates on the advancements taking place in the genomic world and how blockchain will likely play a part in them.

David Koepsell

Tell us a little about your path from philosophy into the genomics world and how you discovered blockchain?

I was actually a lawyer, finishing my Ph.D. in Philosophy while practicing law. The combination of philosophy and law got me engaged and interested in two main subfields: ontology and ethics. I also became interested in patents and copyrights, and when I met my wife, who is a genomic scientist, I started reading about genomics. We were interested in solving issues relating to privacy and genetic data, as well as making it more available for science. This led us to the idea of combining genomics with blockchain tech. I found seed money and founded EncrypGen, and hired a developer to get a prototype made. Two years later, we launched the Gene-Chain.

How do you see the worlds of blockchain and genomics intersecting?

Genomics holds great promise in the field of healthcare. The future of medicine, we believe, has tremendous potential for greater efficiencies through targeted treatments, much of which will depend on knowing more about a patient’s genes. Blockchain allows us to do a number of things that are helping this become a reality: share and secure data through encryption, track the use of genetic data, and even subsidize or recompense subjects in studies for their contributions to science while bringing them into the process as curators of their own data.

With respect to this, where do you think blockchain is in its evolution?

Blockchains have not been sufficiently tapped for their great potential as ledgers in fields other than finance, although many have noted this, and some are trying. We think one obvious first application is in science, where the provenance of data is critical, and in genomics and health, where that provenance and curation by patients could help bring about better health care as well.

Are there other value propositions that blockchain can deliver?

While blockchains are noted for their transparency, they are also a means by which additional privacy and security can be achieved while allowing for transfers of data over networks. Because they create immutable ledgers, and because the “bookkeeping” is kept by way of “hashes” of transactions (alphanumeric records), they can help anonymize users and verify the validity of transactions, as well as provide auditable records of transactions. Given increasing concerns about the privacy of our data, being able to help anonymize users and still retain traceable records can provide greater assurance and confidence in the transacting of business over networks.
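The idea of keeping the "bookkeeping" as hashes can be illustrated with a minimal sketch. This is not EncrypGen's implementation; the function names (`pseudonym`, `append_entry`, `verify`), the salted-hash pseudonym scheme, and the record fields are all assumptions made for illustration. It shows how chaining each record to the hash of the previous one makes tampering detectable, while users appear only as hashed identifiers.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pseudonym(user_id: str, salt: str) -> str:
    # Hypothetical scheme: a salted hash stands in for the real identity.
    return sha256_hex(salt + ":" + user_id)[:16]


def append_entry(ledger: list, payload: dict) -> dict:
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = json.dumps(payload, sort_keys=True)
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "hash": sha256_hex(prev_hash + body),
    }
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = "0" * 64
    for entry in ledger:
        body = json.dumps(entry["payload"], sort_keys=True)
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != sha256_hex(prev_hash + body):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    seller = pseudonym("alice@example.org", "demo-salt")
    buyer = pseudonym("research-lab", "demo-salt")
    append_entry(ledger, {"action": "data_request", "from": buyer, "to": seller})
    append_entry(ledger, {"action": "payment", "from": buyer, "to": seller, "amount": 5})
    print("chain valid:", verify(ledger))
    ledger[0]["payload"]["action"] = "edited"  # simulate tampering
    print("chain valid after edit:", verify(ledger))
```

A real blockchain adds distributed consensus, digital signatures, and much more; the sketch only covers the auditability and pseudonymity points made in the answer above.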

What sort of ethical and social issues do we need to remain mindful of amid this body of blockchain innovation?

We should be careful not to be overly enthralled by the technology without also addressing the need for education. Blockchain is a tool that, when properly applied, can give patients and subjects power and maybe even wealth, while lubricating the flow of scientific data. But at the heart of the application we are creating, for instance, is the value of freedom and property. Similar applications could ignore those values, fail to bring the customer in, fail to educate them about their rights and responsibilities, and simply focus on the money. That would probably be the path of least resistance right now for any of our competitors.

How do we better manage the conundrum between efforts to foster data sharing and the need for consumer privacy?

One thing we are learning is that without some way to track the use of our data, we are at the mercy of the companies that gather that data. The transparency that blockchains afford us is the ability to track data use over time (in our case, genomic metadata which will reside on our blockchain) and yet keep that data encrypted. With private keys, we can be in control and aware of the use of our data, and yet make it available in a market, where we can be the curators, custodians, and salespersons instead of some intermediary. The system is trust-less. The ethic of privacy is actually cooked right into the technology, so we need not rely on some company’s goodwill to reveal their failures.

And what can the Equifax data breach of a few years ago teach us about security?

Equifax revealed its vulnerability; how many others have not? If we have a permanent, immutable record of transactions of our data, keep the metadata about that on a blockchain, and only trade our data with those who make the appropriate transactions with us, we as everyday citizens become more deeply involved with our own data, and have a better vantage point to judge our preferences about its uses, as well as guard against its misuse. When the technology makes this possible, an ecology and economy of sharing will grow without some of the risks we currently face through the use of intermediaries.

Can you share a bit about the software company you and Dr. Vanessa Gonzalez created to address the issue of genomic data and privacy?

EncrypGen was our response to the urgency of making genomic data more available for scientists and putting individuals in charge of their data and its use. We sought to develop the world’s first genomic blockchain for science and raised seed money to begin to build a real product. That seed money funded a couple of developers and a prototype, and in November 2018 a full version 1.0 of our product launched. We also sold half of the native tokens for the blockchain to customers, which helps ensure that we are liquid for the next year or two while we raise capital, build our team, and sell nodes of the Gene-Chain.

And just to back up a bit, why did you decide to launch this company?

Our decision to start a company was based on the fact that the problem of gene patenting went without fixing for decades because policy was often slow to catch up. We chose to make an artifact over arguments, with the idea that a good, working product could quickly bring to fruition the policy aims we thought could help protect privacy while enhancing science. The market will judge our success, not legislative bodies or courts.

You started Gene-Chain in 2017, the world’s first genomic data marketplace. How has that progressed and what are you learning over the course of its evolution?

We have met all our milestones and released the full version of our marketplace on November 6, 2018. Since then, we have been actively seeking and gaining users, as well as seeing traffic in the way of data sales on the platform. We now have 1,000 users, the bulk of whom have uploaded their data (from direct-to-consumer genetic tests) and filled out profiles, and many of whom are already earning $DNA for their data. We project, based on the transactions so far, that by the time we have between 30,000 and 50,000 users we will be profitable.

Those sound like pretty high aspirations?

We know we can get that many users, and we know how much it will cost, and we expect to be able to do this in the next two years. Aiding us in meeting this goal is the invitation we recently received to take part in a project that will bring us 100,000 users, basically for free, just by being part of this project over 30 months. We have learned, given the obvious pain points of 1) lack of affordable data for science, and 2) consumers not having been cut into the profit flow created by selling their data, that if we build it (as we have), they will come (as they are).

And are there any new developments on your front with Microsoft for Startups, the $500 million initiative to help startups create, develop, and market enterprise-level software?

We became a Microsoft Startup recently, which affords us a range of benefits starting with Azure cloud credits. Our only real capital expense is cloud services, so those credits are helpful at this stage. Moreover, it has helped us to focus our product on a new, viable customer-type since MS startups includes even more benefits for a B2B product capable of listing on the MS Store. That product is coming this summer and will be yet another mechanism by which we will be able to hit our goals of user adoption.

Any updates from your partnerships with Murrieta Genomics and Viazoi?

Our partnerships with both Viazoi and Murrieta Genomics are additional means of allowing users to find us, as each is cross-branding and encouraging its users to also use the Gene-Chain. We continue to seek to work with any organization that shares our values of user-enablement, ownership, and profit, and that wishes to encourage its clientele to control and monetize their genetic data using the Gene-Chain. Because we remain the only company doing this with a token that can be converted to other currencies, we are receiving more such requests for partnerships all the time.

How is your DNA token progressing? What sort of lessons can you share around the tokenization of your project?

The biggest lesson we can tie to tokenization and blockchain is that people like money, and tokens are valued as much as money with little hesitation. We have watched and waited for competitors to mimic our model, and have seen many eschew it over a variety of fears. Perhaps due to regulatory uncertainty, or due to concerns about user adoption, other “blockchain genomics” companies have essentially tossed aside the blockchain part in an effort to create new models for genomic data sharing. We think this is a mistake, and have even offered to let their users use our token to get paid for their data.

So it’s working for you?

Yes. It is working for us, and it works for our customers who are making money they can actually spend from data they already own. That offer remains open. The DNA token is valued, and can represent well the value of the data. In the end, we believe that blockchains without tokens or browseable public explorers aren’t abiding by the best features of blockchains.

What sorts of problems does a “genomic blockchain” solve for this evolving landscape?

It gives us a record, a ledger to track transactions, including data requests and payments. It enables us to reveal data so that it can be searched, and to request genomic data. It gives us a secure way to transmit data, through data streaming on a blockchain. It solves problems related to the reluctance to share data due to privacy concerns. It helps create a marketplace and maintain security, privacy, and transparency, and it serves as a means of payment via a native token. It can even bring subjects into their own health and data curation, utilizing the consent procedures and tracking use of their own data over time for better self-knowledge.
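The functions listed in this answer (searchable metadata, data requests, consent, token payments, and a usage record) can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not the Gene-Chain's design: the `Marketplace` class, its method names, and the event tuples are hypothetical, and a plain Python list stands in for the blockchain ledger.

```python
from dataclasses import dataclass, field


@dataclass
class Marketplace:
    balances: dict = field(default_factory=dict)  # token balance per account
    listings: dict = field(default_factory=dict)  # owner -> (metadata, price)
    ledger: list = field(default_factory=list)    # append-only event record

    def list_data(self, owner, metadata, price):
        """Expose searchable metadata; the underlying data is not stored here."""
        self.listings[owner] = (metadata, price)
        self.ledger.append(("listing", owner, price))

    def search(self, key, value):
        """Return owners whose exposed metadata matches the query."""
        return [o for o, (meta, _) in self.listings.items() if meta.get(key) == value]

    def purchase(self, buyer, owner, owner_consents):
        """Record a request; transfer tokens only if the owner consents."""
        _, price = self.listings[owner]
        self.ledger.append(("request", buyer, owner))
        if not owner_consents or self.balances.get(buyer, 0) < price:
            self.ledger.append(("declined", buyer, owner))
            return False
        self.balances[buyer] -= price
        self.balances[owner] = self.balances.get(owner, 0) + price
        self.ledger.append(("payment", buyer, owner, price))
        return True

    def usage_history(self, owner):
        """Every ledger event that involves this owner, in order."""
        return [event for event in self.ledger if owner in event[1:3]]


if __name__ == "__main__":
    market = Marketplace(balances={"lab": 10})
    market.list_data("alice", {"test_type": "direct-to-consumer"}, price=4)
    print(market.search("test_type", "direct-to-consumer"))
    print(market.purchase("lab", "alice", owner_consents=True))
    print(market.usage_history("alice"))
```

The point of the design is that every request, refusal, and payment lands in the same record, so the data owner can review how their data has been used over time.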

Has there been any progress on enterprise licensing in this space, particularly in higher education?

We do not sell licenses. We have a free market, so no one needs to buy a license from us to use the market. We do know of a couple of research groups actively using or intending to use the Gene-Chain for their studies, and our interactions with them have actually helped us to refine the user experience.

Health record blockchains, which are in various stages of development and fraught with a number of HIPAA issues, will begin integrating with our genomic blockchain. This is because the data is so easily de-identified, meaning that its use will tend to be less difficult to deal with, legally speaking. Additionally, genomic blockchains for animals, plants, the microbiome, and blockchains for metabolomics and proteomics will start to be developed. The standard currency for genomic data transactions as well as transactions for other types of health data will be our DNA token. The network effect will make the first successful genomic blockchain essentially the Westlaw of genomic data.

What is your long-term vision and hope for blockchain applications in the genomics/bioscience industry?

We hope that our Gene-Chain innovation will become basically the TCP/IP of genomic science. Most people won’t realize its role as the central connecting thread for scientists and individuals, storing and sharing genomic data for health, science, and commerce, but it will be the vital marketplace to which researchers, companies, and individuals go to transmit and monetize their data. Individuals will use it for free, as we have built it, to store and use their data for their own health, while choosing to expose metadata so that they could have their data used for science, and even get paid for it. It will be a quiet, but complete revolution in health and science.