June 10, 2020

SAS Provides Big Data Solutions for… Bees?

In recent years, it has become increasingly clear that bees are facing an existential threat, with colonies collapsing across the globe – in the U.S., up to 40% in 2018-2019 alone. Bees, of course, are critical to our food chain: they pollinate around three quarters of fruits, nuts, and vegetables in the United States (according to the U.S. Department of Agriculture, around one in four bites of food can be attributed to bee pollination). Monitoring and maintaining bee colonies can be tricky, though, and is typically left to expert beekeepers performing manual checks. This time-consuming, potentially disruptive process is infrequent and inefficient – but big data, courtesy of researchers from SAS, is here to help.

Researchers from the SAS Institute worked to develop a bioacoustic monitoring system: that is, a system that could automatically home in on the connections between the sounds of a beehive and various forms of colony health or distress. This work is an evolution of a number of prior automated hive monitoring approaches, including attempts to monitor beehive health through measuring weight, temperature, humidity, and gas content. But for real results, these researchers argue, you have to look to video and audio. “These systems collect much more data more frequently and require high performance computing for processing,” they wrote. “However, they have the potential to tell us more important information about what is occurring in the hive.”

The signs are subtle – but they aren’t new. As early as the 1950s, researchers were noticing that the failure of a colony’s queen could be acoustically detected. Other attempts have been made over the years, such as the “Apivox Smart Monitor” app, released in 2013, that could be used to check in on hive health without opening the hive. Primarily, the tools listened for “piping”: distinctive noises made by queens and workers that can indicate when the colony is queenless, when a new queen has emerged, and more. 

SAS entered the arena with an odd built-in advantage: four beehives on its main campus in Cary, North Carolina. Inspired by the urgency of the issue, the SAS Internet of Things Division fitted the beehives with sensors, aiming to develop a bioacoustic monitoring system of their own.

They began by dropping a microphone into a colony that was purposely left queenless, quickly finding that bees crawling on the microphone and outside noises made the audio near-useless without processing. They filtered the audio to remove frequencies they weren’t interested in, but found that the issues persisted. To resolve these issues, they turned to robust principal component analysis (RPCA) in SAS Viya, which is frequently used to remove noise in data. The researchers found that it was “quite successful” in separating the background and foreground sound when conducting low-rank matrix estimates of the frequencies over a ten-minute period. Building on this work, they then refined their detection of other noises from the beehive.
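To make the idea of a low-rank estimate concrete, here is a minimal sketch of robust PCA via principal component pursuit, written in plain NumPy. It is not the SAS Viya procedure the researchers used. The function names, the default weight `lam`, and the step size `mu` are illustrative assumptions taken from the standard textbook formulation. The input `M` is assumed to be a spectrogram-like matrix (frequency bins by time frames) that splits into a low-rank part `L` (slowly varying structure) and a sparse part `S` (brief, localized events).

```python
import numpy as np


def shrink(X, tau):
    """Soft-threshold every entry of X toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Soft-threshold the singular values of X by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca(M, max_iter=1000, tol=1e-7):
    """Split M into low-rank L plus sparse S (principal component pursuit).

    Hypothetical helper for illustration only: lam and mu follow the
    commonly cited defaults, not any setting reported by the researchers.
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))        # weight on the sparse term
    mu = m * n / (4.0 * np.abs(M).sum())  # augmented Lagrangian step size
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                  # dual variable
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        # Alternate between updating the low-rank and the sparse part.
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

In a hive setting, `M` might hold spectral magnitudes computed over a ten-minute window, with `L` and `S` inspected separately. Which of the two corresponds to the sounds of interest depends on the recording and is not something this sketch decides.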

An example of the effects of the filtering techniques. The original data is pictured on the left, while different RPCA approaches are pictured on the right. Image courtesy of the authors.

“We plan to track this ten-minute spectral estimate and various features of it in SAS Event Stream Processing,” the research team wrote, outlining the pipeline they had developed using the SAS tools.

“In our experiment, where we made a colony queenless,” the researchers summarized, “we were able to detect worker bees piping at the same frequency range at which a virgin queen pipes after a swarm. We speculated that the workers were calling out to assess whether a queen was present, just as the virgin queen does. Given the importance of the queen in the hive, it is critical for the beekeeper to know as early as possible whether there has been a queen event.” 

Although the researchers acknowledge that “hive monitoring is still in its infancy,” they celebrated the progress they had made. “We now have a system design and plan to begin implementing the system very soon,” they concluded.

The research discussed in this article was conducted by Yuwei Liao, Anya McGuirk, Byron Biggs, Arin Chaudhuri, Allen Langlois, and Vince Deters. The report can be accessed here.
