June 16, 2020

NIH Launches Massive Initiative for COVID-19 Patient Data Analytics

Oliver Peckham


As the COVID-19 pandemic begins to rear its head again in the wake of reopenings in the U.S., accelerating the research pipeline for treatments and vaccines is more important than ever. Now, the National Institutes of Health (NIH) is launching a massive COVID-19 patient data store and analytics platform to help researchers understand the disease and develop effective treatments.

The centralized data store is part of a new effort called the National COVID Cohort Collaborative (N3C), which is operated under the NIH’s National Center for Advancing Translational Sciences (NCATS). The data store – called the NCATS N3C Data Enclave – will “systematically collect clinical, laboratory, and diagnostic data from healthcare provider organizations nationwide.” So far, this includes 35 collaborating sites across the country.

The analytics component of the enclave will then convert all of the data into a standard format, allowing it to be shared with researchers and healthcare providers around the world, whether or not they contribute data themselves. To comply with health data privacy laws like HIPAA, the data will retain only two identifying elements: zip code and dates of treatment. The platform is built to support machine learning and other robust analytical approaches.

“NCATS initially supported the development of this innovative collaborative technology platform to speed the process of understanding the course of diseases, and identifying interventions to effectively treat them,” said Christopher P. Austin, director of NCATS. “This platform was deployed to stand up this important COVID-19 effort in a matter of weeks, and we anticipate that it will serve as the foundation for addressing future public health emergencies.”

A snippet of the new analytics tool. Image courtesy of N3C.

A demonstration of an early version of the analytics tools shows adjustable dashboards with patient statistics, as well as a map that shows the spread of COVID-19 infections and treatments over a geographic space and over time. NIH says that the enclave and analytics tool will help researchers answer questions like “Who might need to be on a ventilator because of lung failure?” and “Are there different patient responses to coronavirus infection that require distinct therapies?” The data collection and analysis will also continue over a five-year period, allowing researchers to explore the long-term implications of COVID-19.

“The exciting transformation this platform represents is in providing an environment where data and the power of the analytics can be used by researchers and clinicians to quickly examine and answer new COVID-19 hypotheses,” said Warren A. Kibbe, chief of translational biomedical informatics in Duke University’s Department of Biostatistics and Bioinformatics.

N3C is also making use of the NIH’s Clinical and Translational Science Awards (CTSA) program and the Center for Data to Health (CD2H), both of which are also funded by NCATS.

“By leveraging our collective data resources, unparalleled analytics expertise, and medical insights from expert clinicians, we can catalyze discoveries that address this pandemic that none of us could enable alone,” said Melissa Haendel, director of CD2H at the Oregon Health & Science University School of Medicine and co-lead of N3C.

