Datanami
August 7, 2019

Stanford Researcher Develops Data Standards for Brain Imaging and Applies Computational Methods to Work

August 7, 2019 — In recent years, efforts to understand the workings of the mind have taken on newfound urgency. Not only are psychological and neurological disorders — from Alzheimer’s disease and strokes to autism and anxiety — becoming more widespread, but new tools and methods have emerged that allow scientists to explore the structure of, and activity within, the brain with greater granularity.

[Figure: Survey ontology. 66 survey dependent variables were projected onto 12 factors discovered using exploratory factor analysis, represented by the heatmap. Rows are factors and columns are separate dependent variables, ordered based on the dendrogram above. Image courtesy of TACC.]

The White House launched the BRAIN Initiative on April 2, 2013, with the goal of supporting the development and application of innovative technologies that can create a dynamic understanding of brain function. The initiative has supported more than $1 billion in research and has led to new insights, new drugs, and new technologies to help individuals with brain disorders.

But this wealth of research comes with challenges, according to Russell Poldrack, a psychology professor with a computing bent at Stanford University. Psychology and neuroscience struggle to build on the knowledge of their disparate researchers.

“Science is meant to be cumulative, but both methodological and conceptual problems have impeded cumulative progress in psychological science,” Poldrack and collaborators from Stanford, Dartmouth College and Arizona State University wrote in a Nature Communications paper out in May 2019.

Data Archivist 

Part of the problem is practical. With hundreds of research groups undertaking original research, a central repository is needed to host and share data, compare and combine studies, and encourage data reuse. To address this curatorial challenge, in 2010 Poldrack launched a platform called OpenFMRI for sharing fMRI studies.

[Figure: Prediction of target outcomes using survey factor scores, estimated from 2500 shuffles of the target outcome. Ontological fingerprints displayed as polar plots indicate the standardized beta value for each significant survey factor. The ontological fingerprints for the two best-predicted outcomes are reproduced at the top.]

“I’d thought for a long time that data sharing was important for a number of reasons,” explained Poldrack, “for transparency and reproducibility and also to help us aggregate across lots of small studies to improve our power to answer questions.”

OpenFMRI grew to nearly a hundred datasets, and in 2016 was subsumed into OpenNeuro, a more general platform for hosting brain imaging studies. That platform today has more than 220 datasets, including studies such as “The Stockholm Sleepy Brain Study” and “Neural Processing of Emotional Musical and Nonmusical Stimuli in Depression,” which have been downloaded hundreds of times.

Brain imaging datasets are relatively large and require a large repository to house them. When he was developing OpenFMRI, Poldrack turned to the Texas Advanced Computing Center (TACC) at The University of Texas at Austin to host and serve up the data.

A grant from the Arnold Foundation allowed him to host OpenNeuro on Amazon Web Services for a few years, but recently Poldrack turned again to TACC and to other systems that are part of the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE) to serve as the cyberinfrastructure for the database.

Part of the project’s success is due to the development of a common standard, the Brain Imaging Data Structure (BIDS), which allows researchers to compare and combine studies in an apples-to-apples way. Introduced by Poldrack and others in 2016, it earned near-immediate acceptance and has grown into the lingua franca for neuroimaging data.
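To give a flavor of what the standard prescribes: BIDS encodes metadata directly in file names as underscore-separated key-value “entities” (for example, `sub-01_task-rest_bold.nii.gz` marks subject 01, the “rest” task, and a functional bold scan). A minimal sketch of reading such a name — the `parse_bids_entities` helper below is illustrative only, not part of the official BIDS tooling:

```python
def parse_bids_entities(filename):
    """Split a BIDS-style filename into its key-value entities and suffix.

    BIDS names look like: sub-01_ses-02_task-rest_run-1_bold.nii.gz
    """
    stem = filename.split(".")[0]  # drop the extension(s)
    entities = {}
    suffix = None
    for part in stem.split("_"):
        if "-" in part:
            key, _, value = part.partition("-")
            entities[key] = value
        else:
            suffix = part  # the trailing suffix, e.g. "bold"
    return entities, suffix

entities, suffix = parse_bids_entities("sub-01_task-rest_run-1_bold.nii.gz")
# entities → {'sub': '01', 'task': 'rest', 'run': '1'}, suffix → 'bold'
```

Because every lab names files the same way, tools can locate any subject’s scans without per-study configuration — which is what makes the apples-to-apples comparisons possible.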

As part of the standard creation, Poldrack and his collaborators built a web-based validator to make it easy to determine whether one’s data meets the standard.

“Researchers convert their data into BIDS format, upload their data and it gets validated on upload,” Poldrack said. “Once it passes the validator and gets uploaded, with a click of a button it can be shared.”
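The official validator enforces many rules; the toy `looks_like_bids` function below sketches only the kind of structural checks involved, assuming just two of the simplest requirements — a `dataset_description.json` at the dataset root and at least one `sub-*` subject directory:

```python
import json
from pathlib import Path

def looks_like_bids(dataset_dir):
    """Toy structural check inspired by the BIDS validator (not exhaustive)."""
    root = Path(dataset_dir)
    errors = []

    # A dataset_description.json with required fields must exist at the root.
    desc_file = root / "dataset_description.json"
    if not desc_file.is_file():
        errors.append("missing dataset_description.json")
    else:
        desc = json.loads(desc_file.read_text())
        for field in ("Name", "BIDSVersion"):
            if field not in desc:
                errors.append(f"dataset_description.json missing '{field}'")

    # At least one subject directory (sub-<label>) must be present.
    if not any(p.is_dir() and p.name.startswith("sub-") for p in root.iterdir()):
        errors.append("no sub-* subject directories found")

    return errors  # an empty list means the checks passed
```

In the real workflow, this style of validation runs in the browser at upload time, so only standard-conformant datasets ever land in the repository.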

Data sharing alone is not the end goal of these efforts. Ultimately, Poldrack would like to develop pipelines for computation that can rapidly analyze brain imaging datasets in a variety of ways. He is working with the CBrain project, based at McGill University in Montreal, Canada, to create containerized workflows that researchers can use to perform these analyses without requiring a lot of advanced computing expertise, and independent of what system they are using.

He is also working with another project called BrainLife.io based at Indiana University, which uses XSEDE resources, including those at TACC, to process data, including data from OpenNeuro.

Many of the datasets from OpenNeuro are now available on BrainLife, and a button on those datasets links directly to the relevant page on BrainLife, where the data can be processed and analyzed using a variety of scientist-developed apps.

“In addition to sharing data, one of the things that having this common data standard affords us is the ability to automatically analyze data and do the kind of pre-processing and quality control that we often do on imaging data,” he explained. “You just point the container at the data set, and it just runs it.”
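“Pointing the container at the dataset” is possible because BIDS Apps share a common command-line convention: the container receives the dataset directory, an output directory, and an analysis level as positional arguments. A sketch of assembling such an invocation — the image name `bids/example` and the paths are placeholders, not tools named in the article:

```python
def bids_app_command(bids_dir, output_dir, level="participant",
                     image="bids/example"):
    """Assemble a docker command following the BIDS Apps convention:
    <app> <bids_dir> <output_dir> <analysis_level>."""
    return [
        "docker", "run", "--rm",
        "-v", f"{bids_dir}:/data:ro",   # mount the BIDS dataset read-only
        "-v", f"{output_dir}:/out",     # mount a writable output directory
        image,
        "/data", "/out", level,         # standard BIDS App positional args
    ]

cmd = bids_app_command("/datasets/ds000001", "/results/ds000001")
# passing cmd to subprocess.run() would launch the containerized analysis
```

Because the interface is uniform, the same one-line launch works for any BIDS App on any BIDS-valid dataset — no per-study scripting required.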

To read the full story: https://www.tacc.utexas.edu/-/raising-the-standard-for-psychology-research


Source: Aaron Dubrow, Texas Advanced Computing Center
