With many diseases, doctors have the benefit of a blood test that more or less definitively proves the presence of the disease. For other conditions, such as sepsis–a bacterial infection state that kills millions of people each year–there is no single clear-cut test. But thanks to new big data techniques that can continuously monitor and analyze the interplay of more than 100 signs and potential symptoms of sepsis, hospitals are detecting the condition earlier, and saving both lives and money.
Sepsis is an inflammatory disease state that occurs when the human body initiates an overwhelming immune response to an initial infection. It takes a surprisingly high toll on people in this country and around the world. Conservative estimates have the disease affecting 1 to 2 percent of all hospital patients in the United States, or about 750,000 Americans per year, and killing up to half of those diagnosed. However, new research suggests that sepsis is actually much more prevalent than initially thought, and that it could be killing 3 million Americans per year, and between 15 million and 20 million people globally.
The fuzziness around the numbers is part of the problem. It’s difficult to get a positive diagnosis of sepsis because it mimics so many other conditions. Its major symptoms–fever, chills, rapid breathing and heart rate, rash, confusion, and disorientation–are also symptoms of other diseases and conditions. Once sepsis is diagnosed, intravenous antibiotics must be administered immediately to save the patient. These drugs are not cheap, and the cost to fight sepsis in the U.S. is estimated to be between $30 billion and $50 billion.
That a little-known disease state could be wreaking such havoc on the lives and budgets of millions of people was intriguing to a small group of technologists in Southern California. Steve Nathan and Christopher Rosin were working in other areas of healthcare analytics when they learned about sepsis.
“We were partnering with a company that provided automatic collection of the data that streams off the bedside monitors–the heart rate, the respiratory rate, etc. and we started to get interested in doing something predictive and interesting with that data,” Nathan says. “We stepped back and realized that, for a complex disease state like sepsis, there’s a lot of subjectivity and ambiguity for a clinician in analyzing those clinical variables.”
The problem is, you can’t just look at the vital signs coming off those bedside machines. “You’ve got to look at all of the available data because you might find a signal in an unexpected place or a combination of places,” Nathan tells Datanami.
Amara Health Analytics was created with the initial goal of developing a system that could help clinicians identify sepsis in a patient earlier and more accurately than the traditional methods hospitals are using. The cloud-based system they built, called Clinical Vigilance for Sepsis, has been installed at four hospitals, and is on its way to being a big data success story in healthcare.
How It Works
Amara’s Clinical Vigilance system is composed of online and offline components. Much of the hardcore data science work that Rosin and the Amara team put into the system takes place in a proprietary big data repository filled with millions of historical patient records. Here, Rosin and his team trained machine learning algorithms to find the various signals that correlate with sepsis.
“We’ve built a predictive model that looks at all the available data and looks for the earliest strong indication that we can find for a patient who will eventually be diagnosed with sepsis,” Rosin tells Datanami. “We’ve used machine learning techniques to look for the early signal that a patient’s headed to treatment and diagnosis for sepsis. We’ve really been able to find it. It’s a hybrid of established guidelines from the Surviving Sepsis Campaign, and the predictive model.”
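As a rough illustration of what a "hybrid" of guideline criteria and a predictive model could look like, here is a minimal Python sketch. It is not Amara's implementation. The thresholds are the widely published SIRS-style criteria, while the `model_score` input, the 0.8 cutoff, and the rule that both conditions must hold are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    temp_c: float        # body temperature, degrees Celsius
    heart_rate: float    # beats per minute
    resp_rate: float     # breaths per minute
    wbc_k_per_ul: float  # white blood cell count, thousands per microliter


def sirs_count(v: Vitals) -> int:
    """Count how many of four SIRS-style guideline criteria are met."""
    return sum([
        v.temp_c > 38.0 or v.temp_c < 36.0,
        v.heart_rate > 90,
        v.resp_rate > 20,
        v.wbc_k_per_ul > 12.0 or v.wbc_k_per_ul < 4.0,
    ])


def should_alert(v: Vitals, model_score: float, score_threshold: float = 0.8) -> bool:
    """Hypothetical hybrid rule: guideline criteria AND a high model score.

    model_score stands in for the output of a trained predictive model
    (a probability between 0 and 1); the threshold is an arbitrary example.
    """
    return sirs_count(v) >= 2 and model_score >= score_threshold


patient = Vitals(temp_c=38.6, heart_rate=112, resp_rate=24, wbc_k_per_ul=13.5)
print(sirs_count(patient), should_alert(patient, model_score=0.91))
```

Requiring both signals is only one possible design; a real system could just as plausibly feed the guideline flags into the model as additional inputs.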
Whereas the Surviving Sepsis Campaign trains doctors and nurses to look at about 20 different data variables, Amara’s Clinical Vigilance system casts a much wider net and looks at more than 100 different variables, including facts derived from raw data. Inputs into the system include real-time telemetry data from bedside machines; structured data, such as medical codes and other numeric values; and unstructured data, such as doctors’ notes, operative reports, and discharge summaries. These are the sources of data that get collected by Amara’s system at runtime.
The Clinical Vigilance runtime is a pure Java-based application. Data for each patient is continuously collected and stored in memory. Doctors’ notes and other written data sources are run through natural language processing (NLP) routines to extract meaning. These NLP routines are critical to the system because some of the symptoms of sepsis, such as altered mental states, are subjective determinations that can only be deduced by understanding doctors’ notes.
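To give a sense of what pulling a subjective finding out of free text involves, the following toy Python sketch flags mentions of altered mental status in a note. It is a deliberately naive keyword matcher with a crude negation check, not the NLP that Amara uses; the phrase list, the negation words, and the function name are all hypothetical.

```python
import re

# Hypothetical phrase list; a real clinical vocabulary would be far larger.
ALTERED_MENTAL_STATUS = re.compile(
    r"\b(confused|confusion|disoriented|disorientation|lethargic|"
    r"altered mental status)\b",
    re.IGNORECASE,
)

# Crude negation check: a negating word within 30 characters before the match.
NEGATION = re.compile(r"\b(no|not|denies|without)\b[^.]{0,30}$", re.IGNORECASE)


def mentions_altered_mental_status(note: str) -> bool:
    """Return True if any sentence contains a non-negated symptom phrase."""
    for sentence in re.split(r"[.\n]+", note):
        match = ALTERED_MENTAL_STATUS.search(sentence)
        if match is None:
            continue
        # Only the text before the first matched phrase is checked for negation.
        if NEGATION.search(sentence[:match.start()]):
            continue
        return True
    return False


print(mentions_altered_mental_status("Pt appears confused and lethargic this morning."))
print(mentions_altered_mental_status("Patient denies confusion. Alert and oriented x3."))
```

Even this small example shows why notes are harder to use than bedside telemetry: the same word can indicate a symptom or its absence depending on context.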
When Clinical Vigilance detects that a patient is headed toward sepsis, the system sends a text alert to the doctor or nurse. That is the only clinical interface to the system. The company uses the DataStax Enterprise NoSQL database to store all the clinical data for the purpose of running queries and generating reports for hospitals. The company was leaning toward a NoSQL system because of the need to have flexible schemas, and finally selected DataStax’s Cassandra distribution primarily because of its strong Lucene-based search capabilities.
Saving Lives and Money
Amara’s capability to find signals across potentially millions of data points for a single patient is the source of its big data-driven sepsis breakthrough.
“It does not come down to one signal. It’s multiple factors and it tends to be different from one patient to another,” Rosin says. “Sepsis is a disease state that researchers have been broadly looking for one signal and diagnostic companies have been looking for the one biomarker, and no one’s really found it. And we haven’t either. There are different criteria that apply to different patients. It ends up being pretty complex, which is okay for us, because we have millions of patient records available to us for data mining. That lets us derive a model that has significant complexity, while still allowing us to verify it.”
Hospitals that adopt Clinical Vigilance are able to detect sepsis earlier, which translates into quicker administration of antibiotics and a shorter stay in the hospital. For a 300-bed hospital, the average direct savings, as measured by length of stay, is about $2 million.
Amara’s system also has a low rate of false positives. “Many companies have attempted to do a sepsis alerting system, but almost all of them have suffered from high false positives,” Nathan says. “What we do both with natural language processing and machine learning algorithms is look at the entire context of the patient before the alert is sent out. Our specificity–the accuracy of the system across all adult patients–is about 99 percent, much higher than competitors.”
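For readers who want the arithmetic behind that figure: in standard statistical usage, specificity is the share of patients without the condition whom a test correctly leaves unflagged, which is a narrower measure than overall accuracy. The short Python sketch below shows how it relates to false alerts. The patient counts and the 90 percent comparison value are invented for illustration and do not come from Amara or any hospital.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-septic patients that the system correctly does not flag."""
    return true_negatives / (true_negatives + false_positives)


def expected_false_alerts(non_septic_patients: int, spec: float) -> int:
    """Number of non-septic patients who would still trigger an alert."""
    return round(non_septic_patients * (1 - spec))


# Hypothetical cohort: 10,000 patients who never develop sepsis.
print(specificity(true_negatives=9_900, false_positives=100))
print(expected_false_alerts(10_000, 0.99))
print(expected_false_alerts(10_000, 0.90))
```

Under these assumed numbers, moving from 90 to 99 percent specificity cuts false alerts from about 1,000 to about 100, which is the difference between an alert clinicians learn to ignore and one they act on.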
Amara is targeting sepsis today, but sees the potential to broaden its reach to provide early detection of other diseases in the future. In addition to specific diseases, such as acute kidney injury and cardiac conditions, Amara is considering a big data approach to identifying general patient deterioration with any underlying root cause.
The potential for hospitals to make use of big data is just now being realized. “The data that hospitals are now collecting on a massive scale, this EHR [electronic health record] data–it’s a goldmine for applications like this,” Nathan says. “We’ve hit the tipping point now for medical records. It doesn’t mean that all hospitals are fully digitized yet. But it’s really underway.”