February 12, 2021

MIT’s New ‘SpAtten’ Tool is Paying Attention to Your Sentences

Oliver Peckham

(Wright Studio/Shutterstock)

In an episode of The Office, the character Kevin Malone famously opined: “Why waste time say lot word when few word do trick?” Indeed, language can be inefficient, leading to bloated and less-accurate natural language processing (NLP) models. This has given rise to attention mechanisms, which help NLP models identify key words and which feature in popular models like OpenAI’s GPT-3. An attention mechanism is now also at the heart of MIT’s new “SpAtten,” a combined hardware-software system for streamlining NLP.

Attention mechanisms help make the most powerful NLP models robust, but they come at extraordinary computational expense. “This part [the attention mechanism] is actually the bottleneck for NLP models,” said Hanrui Wang, a PhD candidate at MIT and lead author of the paper presenting SpAtten, in an interview with MIT’s Daniel Ackerman. “We need algorithmic optimizations and dedicated hardware to process the ever-increasing computational demand.”

Enter SpAtten, which delivers both as a single, integrated platform. For instance, SpAtten uses a technique called “cascade pruning” to jettison vestigial words – and all computation associated with those words – once key words are identified. SpAtten is also able to use lower-precision analysis for simpler sentences, only breaking out the big guns when faced with a complex sentence. And, on the hardware front, SpAtten is highly parallelized, allowing it to simultaneously assess every word in a given sentence. “Our system is similar to how the human brain processes language,” Wang said. “We read very fast and just focus on key words. That’s the idea with SpAtten.”
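To make the pruning idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not SpAtten's actual implementation: the function names, the toy two-dimensional word vectors, and the `keep_per_layer` schedule are all hypothetical, and the same vectors are reused at every layer for simplicity. The sketch scores each word by the attention it receives, accumulates that score from layer to layer, and keeps only the top-scoring words; once a word is dropped, no later layer does any work on it.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_probs(vectors):
    """Row i holds how strongly token i attends to each token (scaled dot product)."""
    scale = math.sqrt(len(vectors[0]))
    return [
        softmax([sum(q * k for q, k in zip(qv, kv)) / scale for kv in vectors])
        for qv in vectors
    ]


def cascade_prune(tokens, vectors, keep_per_layer):
    """Drop low-importance tokens layer by layer; a dropped token never returns.

    Assumption: importance = attention a token receives, summed over layers.
    """
    alive = list(range(len(tokens)))
    importance = [0.0] * len(tokens)
    for keep in keep_per_layer:
        # Attention is computed only over the tokens that survived so far.
        probs = attention_probs([vectors[i] for i in alive])
        for col, tok in enumerate(alive):
            importance[tok] += sum(row[col] for row in probs)
        # Keep the `keep` most important survivors, restoring sentence order.
        ranked = sorted(alive, key=lambda i: importance[i], reverse=True)
        alive = sorted(ranked[:keep])
    return [tokens[i] for i in alive]


# Toy input: the hypothetical vectors give "movie" and "great" the largest values.
tokens = ["the", "movie", "was", "truly", "great"]
vectors = [[0.1, 0.0], [2.0, 0.0], [0.2, 0.0], [0.3, 0.0], [2.5, 0.0]]
print(cascade_prune(tokens, vectors, keep_per_layer=[4, 2]))  # ['movie', 'great']
```

Because the list of surviving words shrinks at each layer, the attention table that has to be computed shrinks with it, which is where the savings in this sketch come from.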

For now, SpAtten’s hardware only exists in simulations, but in those simulations, it performed over a hundred times faster than an Nvidia Titan Xp GPU (the next fastest hardware tested) and, according to MIT, was a thousand times more energy-efficient. Combined, the speed and efficiency advantages have serious implications for reducing the energy demands of advanced NLP models in the future – assuming SpAtten’s hardware performs similarly in real life.

“Our vision for the future is that new algorithms and hardware that remove the redundancy in languages will reduce cost and save on the power budget for data center NLP workloads,” said Wang, who went on to imagine the kinds of impacts that SpAtten-like technology could have on other major AI- and NLP-driven sectors. “We can improve the battery life for mobile phone or IoT devices. That’s especially important because in the future, numerous IoT devices will interact with humans by voice and natural language, so NLP will be the first application we want to employ.”

Applications: Artificial Intelligence, Research Analytics

Technologies: Middleware

Sectors: Academia

Vendors: MIT

Tags: attention mechanisms, MIT, NLP, SpAtten
