For Data Lovers, COVID-19 Is the Best, Worst of Times
COVID-19 has generated its share of data. Never before have we been witness to a global pandemic with such an arsenal of data collection and analytics tools at our disposal. But depending on where you sit, the explosion of coronavirus data and accompanying analytics has either risen to the occasion or failed to live up to expectations.
Count Joe DosSantos among those on the positive side of COVID-19’s data ledger. “As a data guy, I can’t tell you how awesome it is to hear ‘bend the curve’ and ‘R0’ in conventional conversation,” the Qlik chief data officer says. “It’s awesome. It tickles my fancy.”
When he’s not busy crafting a data-driven response to Qlik’s work-from-home mandate or overseeing other data-related tasks from his Massachusetts home, DosSantos can be seen perusing Worldometer and other sources of data about the global pandemic. As a data expert, he prefers to browse COVID-19-related data and form his own conclusions rather than accepting the interpretations of others.
“I don’t get my information from MSNBC, CNN, Fox News, or even ABC and CBS. It’s not interesting to me,” he tells Datanami. “What’s interesting is the numbers and what they tell, so when somebody is telling a story about that, I can tell whether or not it holds water. That is Nirvana.”
Qlik has been a big backer of data literacy, and the current COVID-19 pandemic, with its innumerable dashboards and daily dissertations on statistics during the evening news, has been a case study in why it matters. “That’s what we call data literacy – the ability to argue, to be able to understand and argue with data, not just as a business but as a person,” DosSantos says.
As the pandemic creeps into the summer, there are no signs that it will cease generating scads of data, and the data hawks among us will eagerly scoop it up. COVID-19 dashboards will continue to proliferate, providing data lovers with an array of views into all aspects of the pandemic and the associated economic lockdown.
“This isn’t going to turn people into data scientists,” DosSantos says. “But if it makes people 5% more attuned to data, that’s awesome.”
But not everybody is thrilled with the state of data or the analytics surrounding coronavirus.
As we’ve covered in this newsletter several times, COVID-19 has laid bare systemic data management issues that have historically plagued groups of people trying to analyze large amounts of data. For these folks, the problems that have surfaced around data accuracy, data governance, privacy, and modeling are reminders that good analytics requires a tremendous amount of work and dedication, and that failure is an option.
Karen Way is the managing director of health plan data and intelligence with NTT Data Services, a Plano, Texas-based subsidiary of the Japanese system integrator NTT Data. During the COVID-19 pandemic, Way has been helping health insurance companies like Harvard Pilgrim, Anthem, and UnitedHealthcare understand how to adapt their businesses to COVID-19, using a data-based approach.
“It’s not going to go away any time soon, at least not until we have a vaccine,” Way tells Datanami. “So we’re working with health plans to determine what kind of benefit configuration are they going to need for their populations.”
Way says the lack of widespread COVID-19 testing has hamstrung the ability of experts to take a data-centered approach to evaluating the situation. “The one thing that I’ve seen in looking at the data sources [is] it’s very limited,” she says. “There were so many people who weren’t being tested that there was an automatic bias in the data.”
Only people in hospitals were being tested for coronavirus early on in the pandemic, and that skewed the data. “So any predictive model that you built was based only on that population that was not randomly selected,” she says.
As time rolled on, other people came to conclusions about the type of people that coronavirus impacts the most. COVID-19 doesn’t appear to impact everybody equally, and seems to save its worst for people who are members of certain genders, races, ages, geographic locations, and socio-economic statuses. But that generalization, too, needs qualification.
“Even now, you hear people say ‘This is a disease of the elderly,’ where in fact there are younger people who are infected, they are being hospitalized and ending up in the ICU,” Way says. “Not the same degree or percentages, if you look at age ranges. But if you build predictive models without taking those kinds of things into consideration, whatever you predict is going to be skewed.”
The morbidity models for COVID-19 have been in constant flux as old assumptions are cast away and new data is brought in. Even three months in, we’re seeing big assumptions nixed by public health authorities, such as when the Centers for Disease Control and Prevention concluded last week that coronavirus is not transmitted by surface contamination.
“I’m still not convinced that the models that are out there are truly indicative of what’s going to occur,” Way says. “I’m not an infectious disease specialist. I’m not even a statistician. I’m just a well informed and talented data geek that works in healthcare that sees these things happening and makes me question what is truly reality.”
In lieu of good data, we must delegate responsibility to authorities, such as Anthony Fauci and Deborah Birx, both medical doctors who are members of the White House’s coronavirus taskforce. It’s doubtful they have better data, but they do have relevant experience and knowledge of how past pandemics have played out. Until we get better data, we’re stuck trusting the gut instinct of experts, she says.
“I’m one who likes to look at the data and say, ‘Here’s the story the data is telling me,’” Way says. “I can’t see the story in the data yet because there’s just not enough data that’s reliable for me to feel confident in things that I’m looking at to say, yes here’s the pattern I see.”