January 25, 2021

Using Genetic Grammar, MIT NLP Model Examines How Viruses Escape

Oliver Peckham

As the COVID-19 pandemic crosses the one-year mark, mutations of the virus are increasingly dominating headlines. Part of this is due to some variants’ increased infectiousness, but much of it also stems from an underlying worry: what if a mutation lets the virus resist the vaccine? This phenomenon, called viral escape, is a serious problem, particularly for viruses that are prone to mutating exactly where vaccines and antibodies target them. Now, MIT researchers are using natural language processing (NLP) models to understand how viral escape occurs.

“Viral escape is a big problem,” said Bonnie Berger, a professor of mathematics, the head of the computation and biology group in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and one of the senior authors of the paper, in an interview with MIT’s Anne Trafton. “Viral escape of the surface protein of influenza and the envelope surface protein of HIV are both highly responsible for the fact that we don’t have a universal flu vaccine, nor do we have a vaccine for HIV, both of which cause hundreds of thousands of deaths a year.”

To understand the rates at which these different viruses mutate while remaining functional, Berger and her colleagues applied NLP models, which, as the name suggests, were designed to analyze and predict linguistic patterns. In essence, the researchers treated the genetic sequences as sentences and the constraints of viral functioning as a form of genetic grammar that must be obeyed.

“If a virus wants to escape the human immune system, it doesn’t want to mutate itself so that it dies or can’t replicate,” explained lead author and MIT graduate student Brian Hie. “It wants to preserve fitness but disguise itself enough so that it’s undetectable by the human immune system.”
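To make the "sequences as sentences" idea concrete, here is a minimal Python sketch. It is not the researchers' model: the article does not describe their architecture, so this substitutes a simple bigram model, and the function names and toy sequences are hypothetical. It treats each amino acid letter as a word, learns which residue tends to follow which, and then scores every single-residue mutant by how "grammatical" it looks relative to the training set.

```python
import math
from collections import Counter, defaultdict

# The 20 standard amino acids, written in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_bigram(sequences, alpha=1.0):
    """Count residue-to-residue transitions and return a smoothed log-probability function.

    Assumption: a bigram model stands in for the neural language model
    used in the study, purely for illustration.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        padded = "^" + seq + "$"  # start and end markers, like sentence boundaries
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    vocab_size = len(AMINO_ACIDS) + 1  # residues plus the end marker

    def log_prob(prev, cur):
        total = sum(counts[prev].values())
        # Additive smoothing so unseen transitions get a small, nonzero probability.
        return math.log((counts[prev][cur] + alpha) / (total + alpha * vocab_size))

    return log_prob


def sequence_score(seq, log_prob):
    """Sum of transition log-probabilities: higher means more 'grammatical'."""
    padded = "^" + seq + "$"
    return sum(log_prob(p, c) for p, c in zip(padded, padded[1:]))


def single_mutants(seq):
    """Yield (position, new_residue, mutated_sequence) for every single substitution."""
    for i, wild_type in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                yield i, aa, seq[:i] + aa + seq[i + 1:]


if __name__ == "__main__":
    # Hypothetical toy "corpus"; real training sets hold thousands of viral sequences.
    corpus = ["MKTAY", "MKTAY", "MKSAY"]
    log_prob = train_bigram(corpus)
    ranked = sorted(
        single_mutants("MKTAY"),
        key=lambda m: sequence_score(m[2], log_prob),
        reverse=True,
    )
    for pos, aa, mutant in ranked[:3]:
        print(pos, aa, mutant, round(sequence_score(mutant, log_prob), 3))
```

A score like this only captures the "preserve fitness" half of the picture Hie describes. Judging whether a mutant is also disguised from the immune system would need a second signal that this sketch does not attempt.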

The researchers trained the model on 60,000 HIV sequences, 45,000 flu sequences and 4,000 coronavirus sequences. Then, they used the model to predict which parts of key viral proteins were more or less likely to “escape.” This information is critically useful, as it suggests which parts of those proteins – for instance, the S2 subunit of SARS-CoV-2’s spike protein – might be among the most future-proof as drug targets. Moving ahead, the researchers are looking at whether they could identify targets for cancer vaccines.

“There are so many opportunities, and the beautiful thing is all we need is sequence data, which is easy to produce,” said Bryan Bryson, an assistant professor of biological engineering at MIT and another of the paper’s senior authors.

Applications: Artificial Intelligence

Technologies: Middleware

Sectors: Academia, Biosciences

Vendors: MIT

Tags: coronavirus, COVID-19, MIT, natural language processing, NLP, viral escape, virology
