November 19, 2018

Machine Learning’s Big Role in Population-Level Genetic Study


A large-scale genetic project is currently underway in Nevada that’s using advanced analytics and machine learning to identify connections between people’s genes and their health. It’s the first project to study genetic data at a population level, and it could be a model for a national program.

The Healthy Nevada project began in September 2016, when all residents of northern Nevada were invited to take a genetic test, at no charge to them. Much to the surprise of the project leaders, 10,000 Nevadans signed up in just 48 hours. The second phase of the project began earlier this year, and so far a total of 35,000 genetic tests have been conducted.

People who choose to participate are assured that the privacy of their genetic data is maintained at all times. They get the results of their DNA test and can choose whether or not to share them.

The group is currently conducting clinical-grade DNA tests that are more detailed than the consumer-grade tests from outfits like 23andMe. Those outfits use single nucleotide polymorphism (SNP) tests, which measure the degree of separation across the population, whereas the exome-level tests that Healthy Nevada sponsors include detailed information on the subject’s predispositions to certain genetic diseases or conditions.

Project participants can discuss the results of these exome-level DNA tests in a confidential manner with their physician, if they choose. But the project has a second goal, which is to assess the genetic disposition of the community as a whole.

To that end, the project’s backers – including Renown Health, the Desert Research Institute, and Helix – have created a data warehouse to analyze a host of different data types, including genetic data from the DNA tests, healthcare data pulled from Renown’s electronic medical records system, and data on environmental and social factors impacting the participants.

By combining these data sets and analyzing them with tools from SAS, the Healthy Nevada project hopes to identify widespread patterns of disease that would otherwise be difficult – if not impossible – to detect at the level of an individual patient, according to Dr. Tony Slonim, the CEO of Renown Health, which is the main healthcare provider for a 100-square-mile swath of rural landscape stretching from Sacramento to Salt Lake City.

“Through machine learning and data science, we’re looking to really start to understand patterns of health and illness in our community,” Dr. Slonim tells Datanami. “There are patterns of diagnostic conditions that are occurring at higher rate and frequencies than we might like, and we’re trying to figure out why.”

Genetic Data Trends

Scientists doing population-level analyses typically need to collect data from 20% of a given population to get iron-clad results, according to Dr. Slonim. Against that standard, Healthy Nevada would need about 90,000 DNA tests for a population of 450,000, so it’s a little over one-third of the way there.

Dr. Slonim expects that threshold to be passed in 2019. But already in these early days, a few surprising genetic markers associated with diseases have popped up in larger-than-expected numbers, including familial hypercholesterolemia, BRCA1, and Lynch Syndrome.

“We hope by January we’ll have the top 10 genetic conditions scored and algorithms drawn for them so people who have those genetic risk factors can get the kind of screening care that they need,” he says.

Data also suggests residents of northern Nevada die due to heart disease, cancer and chronic lower respiratory disease at a rate that’s 33% higher than the national average. The Healthy Nevada project will investigate the data to see if there are any correlations that could start to explain why.

“That’s exactly why I think that predictive analytics is so important, because it can find patterns of condition that we might not construe ourselves and then we can interrogate them appropriately,” Dr. Slonim says.

Healthy Data Science

Some aspects of health don’t need data science. The correlation between obesity and diabetes, for example, is clear and well accepted. However, there are many correlations hidden in the data that have yet to be found.

“We know about the relationships between weight and diabetes. We don’t know the relationship between weight, blue eyes, and cancer,” Dr. Slonim says. “I made that up but that’s an example of six degrees of separation. The brain can only go so far in identifying a pattern. …That’s the value of having machine learning … for tens of thousands or hundreds of thousands of participants. You find and identify relationships that you don’t know about.”

Healthy Nevada is using machine learning to investigate any anomalous correlations around heart disease, cancer and chronic lower respiratory disease. Early results suggest it may have something to do with the weather. “We have uncovered certain weather patterns that make a difference with cardiac conditions and the amount of times that people get admitted or go see their doctor,” Dr. Slonim says.

Dr. Slonim foresees a time when the weather forecast could impact the advice that doctors give to their patients. For example, if an inversion layer is due to arrive in a matter of days, doctors could send an email to patients directing them to increase their medication to avoid re-admission to the hospital. “That’s the kind of power this has,” he says.

Healthy Nevada is on track to have 50,000 participants by the end of the year, and expects to reach half a million people by 2020. That in itself is reason to celebrate, says Dr. Slonim.

“We had a bunch of naysayers who said we’ll never collect genetic data on 50,000 people. ‘There’s too many privacy issues, there’s too many consent problems,’” he says. “The hell with it! Go do it and stop whining! That’s why they call this leadership. If you do it, you actually might find things that makes a difference in people’s lives.”

The plan currently calls for the program to be expanded to three to five other healthcare systems next year, and potentially more in 2020. If things go well, the program could eventually go national in the years to come.

“Right now it’s the Healthy Nevada project,” Dr. Slonim says. “But hopefully it can be Healthy America or Healthy USA as we continue to evolve.”
