Cloud computing offers efficiency and enables deep, predictive learning. Both could play a substantial role in reducing administrative burden and increasing the personalization of healthcare. But what inherent characteristics of cloud computing must we overcome before we can get there?
Efficiency in the Cloud
When analyzing characteristics of cloud computing, efficiency is oft-mentioned. The “cloud” is the only economically practical place to perform intermittent yet intensive workloads like genomic analysis. The good news is that “Big Data” tools have emerged over the last decade to support such computing tasks. For example, ADAM, an open-source genomics analysis platform based on Apache Spark, is available from multiple public cloud providers.
This support includes not only hosting the tools themselves but also providing auto-scaling computational nodes that start up as needed to handle the tasks you give them and shut down when idle. In effect, you pay no more for compute than the work actually requires, at current commodity rates.
With Amazon Web Services (AWS), you can reduce those costs further by bidding on spare compute capacity through the spot market. At last year’s AWS re:Invent conference, a couple of different organizations explored the dramatic cost reductions that can be gained by using AWS for genomic processing rather than dedicated in-house data centers. Calendar time shrinks as well: you can flex up to a larger machine count during peak demand and flex back down without machines sitting idle during periods of transition.
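The economics of the spot market can be sketched with simple arithmetic. All the prices and node-hour figures below are hypothetical placeholders for illustration, not actual AWS rates:

```python
# Illustrative on-demand vs. spot cost comparison for a burst genomics
# workload. Every number here is a made-up example, not a real AWS price.

NODE_HOURS = 500          # total compute needed for one analysis run
ON_DEMAND_RATE = 0.40     # $/node-hour, hypothetical on-demand price
SPOT_RATE = 0.12          # $/node-hour, hypothetical spot price

on_demand_cost = NODE_HOURS * ON_DEMAND_RATE
spot_cost = NODE_HOURS * SPOT_RATE
savings = 1 - spot_cost / on_demand_cost

print(f"On-demand: ${on_demand_cost:.2f}")   # On-demand: $200.00
print(f"Spot:      ${spot_cost:.2f}")        # Spot:      $60.00
print(f"Savings:   {savings:.0%}")           # Savings:   70%
```

The trade-off is that spot capacity can be reclaimed by the provider, so workloads must tolerate interruption, which batch-oriented genomics pipelines generally can.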
Privacy in the Cloud
Privacy can be among the tougher characteristics of cloud computing to reconcile from a technical point of view, but we are not without a few options to consider.
What Approach to Data Privacy Is Most Secure?
De-Identification. Let’s start by dismissing the idea that we can use de-identification as a solution to the challenge of privacy in healthcare cloud data. The typical approach is to reduce the amount of information shared until the data could apply to any individual in a sufficiently large group. Unfortunately, there is no more unique and specific way to describe a person than their full genome. The only real exception is the case of identical twins, and even then you are down to one of two people the data could belong to.
Encryption. Encrypting the information does not help on its own, because any system that wants to process the data must first decrypt it, and for that to happen, the system must hold the key. Once decrypted, the data is vulnerable to being copied, stored, or shared by whatever system decrypted it. That kind of transitive trust at large scale is unprecedented.
Bitcoin. We can imagine a couple of ways technology can help. First, new technologies like blockchains (used by Bitcoin to enable financial transactions) may be able to secure information in a way that allows uses of shared information to be traced. While this does not by itself guarantee the security of the information, it would, in principle, let you trace what happened to your information up to the point it was leaked, and therefore who leaked it. That accountability could be key to building the consumer confidence organizations need.
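The core idea that gives blockchains their traceability is a tamper-evident hash chain: each entry commits to the one before it, so altering any past record breaks every later link. Here is a minimal sketch under stated assumptions; the function names and record fields are illustrative, not from any real system:

```python
# Minimal tamper-evident audit log built as a hash chain -- the core
# mechanism behind blockchain-style traceability. All names and record
# fields are hypothetical examples.
import hashlib
import json

def append_entry(chain, record):
    """Append a record, linking it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    chain.append({"record": record, "prev": prev_hash,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return chain

def verify(chain):
    """Recompute every hash; any altered record breaks the chain."""
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps({"record": entry["record"], "prev": prev_hash},
                             sort_keys=True)
        if (entry["prev"] != prev_hash or
                entry["hash"] != hashlib.sha256(payload.encode()).hexdigest()):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"actor": "lab-a", "action": "read", "dataset": "genome-123"})
append_entry(log, {"actor": "clinic-b", "action": "share", "dataset": "genome-123"})
print(verify(log))                           # True: log is intact
log[0]["record"]["action"] = "delete"        # tamper with history
print(verify(log))                           # False: tampering detected
```

A real deployment would add distributed replication and consensus so no single party can rewrite the chain, but the hash linkage above is what makes each access to shared data traceable after the fact.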
Distributed Computing. The more practical solution that delivers true privacy is a distributed computing model able to process, yet not disclose, the data. In this model, the data would live in a single system controlled by a trusted party, and processing algorithms would be run against that data to answer questions. This is tricky to execute: any processing code that will operate on the data must itself be proven secure, and coordinating computation across many such systems carries significant performance costs. Still, it makes it possible to protect each contributor’s data while allowing that data to be included in larger cohorts for processing.
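The pattern described above can be sketched in a few lines: each site runs the analysis locally and releases only an aggregate summary, so raw records never leave the data holder’s system. The site data, variant names, and function names below are hypothetical illustrations, not a real protocol:

```python
# Sketch of privacy-preserving distributed analysis: each site computes
# a local aggregate; only summaries (never raw records) are pooled.
# Site data and variant names are hypothetical.

def local_summary(records, variant):
    """Runs at the data holder's site; only counts leave the system."""
    carriers = sum(1 for r in records if variant in r["variants"])
    return {"n": len(records), "carriers": carriers}

def pooled_frequency(summaries):
    """Combine per-site aggregates into a cohort-level estimate."""
    total_n = sum(s["n"] for s in summaries)
    total_carriers = sum(s["carriers"] for s in summaries)
    return total_carriers / total_n

# Each site keeps its raw records private and shares only its summary.
site_a = [{"variants": {"BRCA1"}}, {"variants": set()}]
site_b = [{"variants": {"BRCA1"}}, {"variants": {"TP53"}}, {"variants": set()}]

summaries = [local_summary(site_a, "BRCA1"), local_summary(site_b, "BRCA1")]
print(pooled_frequency(summaries))  # 2 carriers out of 5 records -> 0.4
```

A production system would also need to vet the query code itself and guard against aggregates so small they identify individuals, which is exactly the coordination overhead noted above.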
Alternatively, we could hope that cultural norms change and the sensitivity around disclosing this kind of information fades, though we can’t count on that happening fast enough to eliminate the problem.
Knowledge Management in the Cloud
Since we all use the Internet, we know there is already more information available to us than we could ever possibly consume. Tools like search engines make it possible to get relevant information on demand and to build learning systems that get better over time at surfacing relevant results.
Deep Learning in Healthcare
A recent joint announcement by IBM and Illumina suggests at least one creative approach that could go well beyond simply locating information relevant to a patient’s condition. What if your sequencing system or lab could not only provide the raw genomic data but also combine it with the patient’s clinical history to go beyond merely locating all recent or relevant publications that might apply? What if it could also summarize them for you in a report available within minutes?
Watson for Genomics
The Watson for Genomics software adds data from 10,000 scientific articles and 100 new clinical trials every month to improve the odds that an oncologist can have at their fingertips the most precise and current information related to the genomic alterations observed in the patient’s tumor. While there will never be a substitute for a skilled human mind in making a diagnostic decision, we can at least make sure physicians have the help they need to stay on top of the ever-increasing volume of precise knowledge required to make informed decisions.
Loose ends remain before the use of cloud systems and AI in healthcare can be more fully implemented. How many of the advertised capabilities are as fully-baked as they appear? How safe are public clouds? How has our ability to process and store data improved? As the healthcare and tech worlds continue to co-mingle, the opportunities to shave costs and increase precision will only grow.