July 30, 2021

Berkeley Lab Makes Strides in Autonomous Discovery to Tackle the Data Deluge

Oliver Peckham

Data production is outpacing the human capacity to process it. Whether the source is a giant radio telescope, a new particle accelerator or the lidar sensors on autonomous cars, the sheer scale of what is generated increasingly leaves massive stores of data untapped as researchers scramble to acquire computational resources and develop algorithms to exploit these troves of information. Now, researchers at Lawrence Berkeley National Laboratory have made strides in a field called “autonomous discovery,” which uses algorithms to decide what to investigate in a dataset with little human involvement.

Autonomous discovery has grown more prevalent over the past few years. One of the more prominent approaches relies on Gaussian process regression, a Bayesian method well suited to small datasets. It enables autonomous discovery by examining a small portion of the data and building a probabilistic model that predicts the rest along with an estimate of its own uncertainty. “In contrast to deep learning, stochastic processes can be used to make decisions based on relatively small datasets, and they provide uncertainty estimates which can optimize the learning process,” said Marcus Noack, a research scientist at CAMERA and lead author of the new paper, in an interview with Berkeley Lab’s Kathy Kincade.

Berkeley Lab researchers in the Center for Advanced Mathematics for Energy Research Applications (CAMERA) applied Gaussian process regression to develop a tool called gpCAM. At CAMERA, researchers have been using gpCAM for synchrotron beamline experiments, but lately its use has been expanding into other areas. “More and more experimental fields are taking advantage of this new optimal and autonomous data acquisition because, when it comes down to it, it’s always about approximating some function, given noisy data,” Noack said.
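For readers who want to see the idea in miniature, the sketch below shows how Gaussian process regression can steer data acquisition: fit the model to the few measurements taken so far, then measure next wherever the model is least certain. This is not gpCAM's API, which the article does not describe. The names `rbf`, `gp_posterior`, `autonomous_run` and `measure`, along with the kernel length scale and noise level, are assumptions made for illustration, and the code uses only the Python standard library so it runs without extra packages.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two scalar inputs (assumed length scale)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, query, noise=1e-4):
    """Posterior (mean, variance) at each query point, assuming a zero prior mean."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))  # alpha = K^{-1} y
    out = []
    for q in query:
        k = [rbf(q, x) for x in xs]
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve_lower(L, k)
        var = max(rbf(q, q) - sum(vi * vi for vi in v), 0.0)
        out.append((mean, var))
    return out

def autonomous_run(measure, candidates, start, n_steps):
    """Repeatedly measure the candidate where the model is most uncertain."""
    xs = list(start)
    ys = [measure(x) for x in xs]
    for _ in range(n_steps):
        post = gp_posterior(xs, ys, candidates)
        best = max(range(len(candidates)), key=lambda i: post[i][1])
        xs.append(candidates[best])
        ys.append(measure(candidates[best]))
    return xs, ys

def measure(x):
    """Hypothetical stand-in for an instrument reading at position x."""
    return math.sin(3 * x) + 0.5 * x

if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]  # candidate measurement positions
    xs, ys = autonomous_run(measure, grid, start=[0.0, 1.0], n_steps=5)
    print("measured positions, in order:", xs)
```

Picking the point of maximum posterior variance is only the simplest possible acquisition rule, chosen here for brevity. It illustrates Noack's point that the uncertainty estimates, and not just the predictions, are what guide the next measurement.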

One of those new areas is materials science; gpCAM is being used by researchers in Berkeley Lab’s Molecular Foundry to help understand the properties of thin-film semiconductors. “Nanoscale applications that make use of artificial intelligence and machine learning algorithms, specifically for scanning probe systems, have been an interest … for some time,” said John Thomas, a postdoctoral research fellow at the Foundry. “We became interested in using Gaussian processes toward autonomous discovery in the summer of 2020.”

Elsewhere, researchers are using gpCAM to investigate DNA self-assembly. “DNA nanotechnology in the pursuit of self-assembling functional material often suffers from a limited ability to sample the large parameter space for synthesis,” explained Aaron Michelson, a graduate researcher at Columbia University. “Either this requires a large volume of data to be collected or a more efficient solution to experimentation. Autonomous discovery can be directly incorporated in both mining large datasets and guiding new experiments. This allows the researcher to steer away from mindlessly making more samples and puts us in the driver’s seat to make decisions.”

The researchers say this is just the beginning, with gpCAM applications ranging from environmental studies to drug discovery.

“Noack’s work and leadership have brought together a broad, interdisciplinary co-design community,” said James Sethian, director of CAMERA and a co-author on the paper. “This sort of scientific community building is at the heart of what CAMERA tries to do.”

To learn more about this research, see the research paper and Kathy Kincade’s reporting for Berkeley Lab.

