August 14, 2020

MIT Is Developing a Tool for Machine Learning-Powered Data Retrieval

Oliver Peckham

With the global deluge of data, the opportunities are endless, but so are the challenges. Within five years, the world’s data is estimated to reach 175 zettabytes: enough to fill roughly 22 one-terabyte hard drives for every single person alive. In such a data-driven world, managing and sorting through that data gets harder by the day, with database and query managers struggling to keep up. Now, researchers from MIT are developing a tool to intelligently assist users of large databases.

MIT Professor Tim Kraska and his colleagues from the institute’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are debuting a design for what they call “instance-optimized systems”: database systems that are able to optimize and reorganize themselves in response to the data types and workloads at hand. “It’s like building a database system for every application from scratch, which is not economically feasible with traditional system designs,” Kraska explained in an interview with MIT’s Adam Conner-Simons.

MIT’s instance-optimized system will be the child of two parents: the “Tsunami” and “Bao” tools. Using machine learning, Tsunami (a successor to “Flood”) interprets user queries to reorganize the layouts of databases. Bao, meanwhile, uses machine learning to intelligently pick the appropriate plan for completing a given query. On their own, Tsunami improved query speed up to tenfold, while Bao-created query plans ran up to 50% faster. Combined, the two tools will form the instance-optimized system.
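To make the idea of a workload-driven layout concrete, here is a minimal sketch. It is not Tsunami's algorithm, which the article does not describe; it is a toy stand-in that picks which single column to sort a table on so that a log of past range filters scans the fewest rows. The function names, the cost model (a filter on any other column costs a full scan), and the sample table and workload are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def rows_scanned(sorted_values, lo, hi):
    """Rows touched by the filter lo <= v <= hi on a column the table is sorted on."""
    return bisect_right(sorted_values, hi) - bisect_left(sorted_values, lo)


def choose_sort_column(table, workload):
    """Return (best_column, costs) for a toy workload-aware layout choice.

    table:    dict mapping column name -> list of values (equal lengths)
    workload: list of (column, lo, hi) range filters from past queries
    Assumed cost model: a filter on the sort column scans only matching
    rows; a filter on any other column scans the whole table.
    """
    n_rows = len(next(iter(table.values())))
    costs = {}
    for candidate in table:
        sorted_values = sorted(table[candidate])
        total = 0
        for column, lo, hi in workload:
            if column == candidate:
                total += rows_scanned(sorted_values, lo, hi)
            else:
                total += n_rows
        costs[candidate] = total
    best = min(costs, key=costs.get)
    return best, costs


# Hypothetical 100-row table and a workload that mostly filters on price.
table = {"age": list(range(100)), "price": [i * 10 for i in range(100)]}
workload = [("price", 0, 90), ("price", 100, 190), ("age", 0, 49)]
best, costs = choose_sort_column(table, workload)
print(best, costs)
```

Under this toy cost model the workload's own filters decide the layout: if the query mix shifts toward a different column, rerunning the function on the new log would pick a different sort order.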

“Query optimizers have been around for years, but they often make mistakes, and usually they don’t learn from them. That’s where we feel that our system can make key breakthroughs, as it can quickly learn for the given data and workload what query plans to use and which ones to avoid,” Kraska said. “Our hope is that a system like this will enable much faster query times, and that people will be able to answer questions they hadn’t been able to answer before.”
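The feedback loop Kraska describes, learning which plans to use and which to avoid, can be sketched with a simple epsilon-greedy chooser. This is an illustrative substitute, not Bao's model, which the article does not detail; the class name `PlanSelector`, the plan names, and the latencies are hypothetical.

```python
import random
from collections import defaultdict


class PlanSelector:
    """Toy plan chooser that learns from the latencies it observes."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon          # probability of exploring a random plan
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.total = defaultdict(float)  # summed latency per plan
        self.count = defaultdict(int)    # number of runs per plan

    def choose(self, candidate_plans):
        # Try every plan at least once before trusting the averages.
        untried = [p for p in candidate_plans if self.count[p] == 0]
        if untried:
            return untried[0]
        # Occasionally explore, otherwise pick the lowest mean latency so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidate_plans)
        return min(candidate_plans, key=lambda p: self.total[p] / self.count[p])

    def record(self, plan, latency_seconds):
        # Feed the observed runtime back so slow plans are avoided next time.
        self.total[plan] += latency_seconds
        self.count[plan] += 1


# Hypothetical plans with made-up, fixed latencies (seconds).
true_latency = {"hash_join": 1.0, "nested_loop": 5.0, "merge_join": 2.0}
plans = list(true_latency)
selector = PlanSelector(epsilon=0.0)  # no random exploration in this demo
for _ in range(10):
    plan = selector.choose(plans)
    selector.record(plan, true_latency[plan])
print(selector.choose(plans), dict(selector.count))
```

With exploration switched off, the selector runs each plan once, then keeps returning the fastest one it has seen; a nonzero `epsilon` would let it revisit other plans in case the data or workload changes.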

The team is still working to integrate the two tools, but is already having luck training Bao, which has outperformed commercial tools with as little as one hour of training. The researchers hope to bring this success, and others, to resource-limited systems like cloud environments, where query optimization could have a particularly large impact.

“I think this line of work is a paradigm shift that’s going to impact system design long-term,” said Harvard computer scientist Stratos Idreos. “I expect approaches based on models will be one of the core components at the heart of a new wave of adaptive systems.”

To read more, see the researchers’ paper or the MIT news story.

