AI-Powered Drug Development in a Post-COVID World
The developed world is on the cusp of turning the corner in the fight against COVID-19 thanks to the unprecedented effort to rapidly develop and distribute effective vaccines. Now technologists are hoping to take drug development to the next level, and AI will play a big role.
One of the companies at the forefront of using machine learning and AI to develop drugs is CytoReason. The company helps pharmaceutical firms like Pfizer accelerate drug development by providing high-resolution models of the human body as it is affected by the disease that the drug companies are targeting.
“If I told you that in 200 years, drugs would be developed in a computer, you would not be real surprised,” said CytoReason CEO and founder David Harel. “But in order to do that, you need to build a model of the human body. You need some kind of an android where you can try your drugs at a very high resolution. And that’s basically what we’re doing – tissue by tissue, disease by disease, we’re modeling the human body to an accuracy level that allows the development of drugs.”
There are two key breakthroughs enabling CytoReason to build such detailed models: the vast amounts of medical data that have been accumulated, and the AI technology that enables drug makers to find useful information hidden in that data, Harel says.
On the data front, improvements in data privacy and security, including privacy-preserving AI, have enabled CytoReason to accumulate the large amounts of medical data that are critical to building accurate models. Not too long ago, it would have been unethical or illegal to amass such large amounts of sensitive data. Now companies like CytoReason are able to tap into that data without compromising privacy or security.
On the AI front, CytoReason is able to use data collected from clinical trials to build models that mimic the behavior of diseased and non-diseased organs, down to the cellular level. These models allow pharmaceutical companies to determine how to attack a disease in two ways, Harel says.
“They’re looking into what targets they might want to push in order to stop the disease and treat the patient. That’s one,” he says. “The second one is, they have our model–they don’t need any data–they just look at it and say, what if I push here or there? What would happen? Would the disease get better or worse?”
The Israeli company builds models using a homegrown statistical platform that runs atop a CPU-based infrastructure in the cloud. While deep learning approaches could help, the lack of traceability in a black-box environment precludes the use of neural networks.
“We’re developing a lot of statistical learning frameworks ourselves just to put all the data together into one model,” he says. “The people who find new drugs and treatments are biologists. They’re not computer scientists. They’re not data scientists. So we’re trying to model the disease in a way that’s useful to use.”
In the future, the marriage of the “deep phenotyping” approaches that companies like CytoReason are taking with the mining of real-world evidence from electronic medical records (EMRs) could dramatically reduce the need to conduct massive clinical trials to prove the safety and efficacy of novel drugs. The clinical trials would still be needed, but the number of participants could be reduced by 40% to 50%, Harel says, which would be a huge savings.
“That’s going to be the future,” he says. “If you’re asking yourself, what is everybody thinking about the future of drug development post COVID, that’s basically what everybody is looking at….Once it’s done properly, it speeds things up greatly.”
AI to Manage Clinical Trial Data
The mRNA-based COVID-19 vaccines created by Pfizer-BioNTech and Moderna show how much progress has been made on the science side of the pharmaceutical house. And novel modeling approaches, such as those from CytoReason, are on the cusp of speeding things up even more.
But not as much progress has been made on the underlying data systems that pharmaceutical companies and biotech firms use to manage the actual clinical trials that are necessary to get a drug approved for use, says Raj Indupuri, the CEO of eClinical Solutions.
“You’d be surprised how companies are still depending on Excel to look at the data,” Indupuri says. “There’s a lot of progress that has been made with the science, but unfortunately, in terms of clinical development, I don’t think over the last two decades much has happened.”
Many drug companies rely on legacy IT systems to manage their clinical trials. They hire armies of folks to enter data into the system, to cleanse the data, and to generate the reports that the U.S. Food and Drug Administration (FDA) requires. This takes a lot of time and money, which is why bringing a single drug or therapy to market can take 10 to 15 years and cost $2.5 billion to $4 billion, Indupuri says.
“It’s an industry where we lag because we’re very conservative,” he says. “We deal with clinical patients and data related to that, so it’s very controlled. But that shouldn’t be an excuse in terms of adopting modern technologies. I do believe that technology is a key enabler to help with this transformation and to deal with some of the inefficiencies.”
eClinical Solutions develops a cloud-based data and analytics platform that automates several aspects of running a clinical trial. For starters, it enables drug companies to ingest raw data from multiple sources, and transform and cleanse the data to make it available for analysis. This could be demographic data from clinical trial participants, biomarker data, laboratory data, and data from EMR and insurance claims systems.
Once the data is ingested into the platform, the Massachusetts-based company uses data visualization and machine learning techniques to help biostatisticians and scientists spot anomalies and outliers in the data. These could indicate that different cohorts are responding to a novel treatment in different ways, which could signal that a change in the treatment is necessary. Or they could help to spot issues occurring in a specific hospital that’s participating in the trial, which could lead to changes in how the trial is conducted.
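To make the idea of spotting outliers by trial site concrete, here is a minimal sketch. It is not eClinical Solutions' method, which the article does not describe; it assumes a common robust statistic, the modified z-score based on the median absolute deviation. The site names, readings, and the flag_outliers helper are all hypothetical.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 scales the MAD so the score is comparable to a standard z-score
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]

# Hypothetical lab readings grouped by trial site (made-up numbers)
readings = {
    "site_a": [5.1, 4.9, 5.0, 5.2, 5.0],
    "site_b": [5.0, 5.1, 4.8, 9.7, 5.0],
}
flagged = {site: flag_outliers(vals) for site, vals in readings.items()}
print(flagged)
```

In this made-up data only the fourth reading at site_b is flagged. A median-based score is used rather than a mean-based one because a single extreme reading would otherwise inflate the spread and hide itself.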
Finally, the company’s software helps generate the reports that its clients will submit to the FDA to gain approval for their new drug or treatment. All told, the automation can help increase the efficiency of a phase 1-3 clinical trial by 40% to 50%, Indupuri says.
Machine learning and AI are needed to achieve the scale required to automate some of the most difficult aspects of managing data in a clinical trial, Indupuri says. “We have this enormous stream of data from patients, and there’s no way you can [manage that efficiently] without having modern data infrastructure or data pipeline, or taking advantage of advanced data science techniques, like ML models or AI,” he says.
The company is rolling out a new machine learning system that will automatically transform data based on how human curators have manually cleansed raw data in the past.
“What we have done is we’ve used that information to train the ML model,” he says. “It can detect outliers and can detect data issues and provide outputs so that a reviewer can quickly confirm and take action. It eliminates the need to manually look at so many data points. Without an ML model, it wouldn’t be scalable. As the amount of data increases, this would not be feasible.”
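The workflow Indupuri describes, learning from past manual cleansing and then handing a reviewer a suggestion to confirm, can be illustrated with a deliberately simple stand-in. The sketch below is not the company's ML system; it assumes a plain frequency lookup over hypothetical past corrections, and every name and data value in it is invented for illustration.

```python
from collections import Counter, defaultdict

def learn_corrections(history):
    """Count how curators mapped each (normalized) raw value to a cleaned value."""
    counts = defaultdict(Counter)
    for raw, cleaned in history:
        counts[raw.strip().lower()][cleaned] += 1
    return counts

def suggest(counts, raw):
    """Return (suggested value, share of past corrections), or None if unseen."""
    seen = counts.get(raw.strip().lower())
    if not seen:
        return None  # nothing learned for this value; leave it to the reviewer
    value, votes = seen.most_common(1)[0]
    return value, votes / sum(seen.values())

# Hypothetical history of (raw entry, curator's cleaned entry) pairs
history = [
    ("mg/dl", "mg/dL"),
    ("MG/DL", "mg/dL"),
    ("mg/dl ", "mg/dL"),
    ("mmol", "mmol/L"),
]
model = learn_corrections(history)
suggestion = suggest(model, " Mg/dL")
print(suggestion)
```

The reviewer still makes the final call; the lookup only narrows what needs a human look. A production system would presumably use a trained model that generalizes to values it has never seen, which this lookup cannot do.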
Data is the critical element that will enable us to design and test better drugs at a faster cadence. As the volume and variety of data about diseases and drug reactions grows, machine learning technology will be absolutely essential to making sense of it in a timely manner.