People to Watch 2022
Associate Director, Data Science
Director of the Office of Data Science Strategy (ODSS),
National Institutes of Health (NIH)
Susan Gregurick, congratulations on being named a Datanami Person to Watch in 2022!
Thank you, I am thrilled to be named a 2022 Datanami Person to Watch. Datanami is an essential forum for tracking emerging trends and solutions in big data. The need to keep up to date on trends in data science in academia and industry is even more crucial during this pandemic. Tremendous amounts of data are necessary to properly address the challenges we face, and they must be shared, evaluated, and utilized in fair, equitable, and innovative ways.
How important is data science to the future of science and health in this country?
Data science isn’t just important to the future of science and health. It IS the future. Antibiotics, the X-ray machine, and the sequencing of the human genome changed medicine and science, and the evolution of data science will be just as impactful. Data science generally, and AI/ML in particular, have the potential to revolutionize healthcare from the academic researcher all the way to our primary care physicians. And if properly implemented, we have the opportunity to reduce or eliminate many of the diseases and health disparities that we face today.
What lessons has the NIH learned from the COVID-19 pandemic regarding the importance of data access and data sharing in the biomedical field?
The COVID-19 pandemic has shone a spotlight on numerous data access and sharing challenges, including finding patient data across data platforms, harmonizing clinical data from various contributors, streamlining access, and linking data from multiple sources for a greater understanding of SARS-CoV-2. Through early efforts, primarily working with the NCATS National COVID Cohort Collaborative (N3C), we have seen great opportunity and capabilities to harmonize clinical data at unprecedented scale. But this takes a large community that agrees on standards and processes of harmonization.
We also explored the challenges and opportunities of records linkage. We heard from the research community that if properly deployed, record linkages can create richer data about the human experiences of individuals, families, places, and events, including those often underserved. However, participants should consent and be notified when records are linked, and the impact of invalid or incorrect linkages should be examined, as improperly linked data may obscure the ability to identify disparities. A summer 2021 workshop highlighted challenges in linking data, and is helping NIH develop policies and strategies to provide access to and linkages of data to address COVID-19.
We are working tirelessly to address these data barriers, and I anticipate the work we are doing because of COVID will have long-lasting and positive impacts on research well into the future. Increasing access to and usability of data across disciplines cultivates better science, medicines, and therapies.
What big obstacles remain before we can fully realize the benefits of data in the healthcare arena?
There are several large obstacles to realizing the full potential of existing and future biomedical data. One of the biggest challenges is the lack of uniformity and access to research data. This is particularly true of controlled-access data from human participants in clinical and observational studies supported by NIH. Many of these data are drawn from electronic health care systems, clinical notes, medical images, and surveys, in addition to other data such as genomics data. Developing capabilities to make these controlled-access data more findable and accessible across the NIH enterprise remains a challenge. This impedes scientific advancement because researchers cannot easily find all relevant data to create cohorts for new studies, for example on ‘long COVID’. We are working to address these and other challenges by improving our ability to streamline access to data.
Analyzing these challenges, and working to address them for COVID specifically, NIH recently launched the RECOVER program. RECOVER is a research initiative to understand, prevent, and treat PASC, including Long COVID. PASC stands for post-acute sequelae of SARS-CoV-2 and is a term scientists are using to study the potential consequences of a SARS-CoV-2 infection. Through the RECOVER initiative we are working across the NIH to collect, store, and make accessible integrated data from clinical, observational, digital health, and imaging studies in a coherent and facile manner.
Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I have a lot of fun in and outside of my professional life. For example, I love gardening and each year I add a new rose bush to my expanding collection. At one time I brewed beer, but I haven’t had much time to do so these days. A surprising story about me that I don’t think my colleagues know is that when I was in high school, I was the homecoming queen for my small town of Davison, Michigan. I really enjoyed representing my town and meeting so many of our citizens. During my tenure I helped raise funds for the retirement home where my grandmother spent her last days. This activity was personally meaningful for me and I have fond memories of those early and formative high school years.