March 13, 2023

MIT Researchers Use Machine Learning to Speed Up Data Retrieval Hashing

Jaime Hampton

A multi-institutional team of researchers led by MIT has found a new way to speed up data retrieval in large databases using machine learning.

The researchers used machine learning to build better hash functions. Hashing is a core operation in online databases: a hash function generates codes that identify where data is stored, which accelerates retrieval.

A problem with traditional hash functions is that they generate codes at random, so two pieces of data are sometimes hashed to the same value, causing what is called a collision. When several pieces of data share one hash value, a search has to check each of them, which makes retrieval less efficient. While there are specific kinds of hash functions designed to lessen collisions, they are laborious to construct and take more time to compute.
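To make the collision problem concrete, here is a minimal Python sketch that hashes a set of keys into buckets and counts how many collide. The key range, the bucket count, and the choice of SHA-256 as a stand-in for a general-purpose hash function are all assumptions made for illustration; none of them comes from the research.

```python
import hashlib
from collections import Counter


def bucket_of(key: int, num_buckets: int) -> int:
    """Map a key to a bucket with a general-purpose hash (SHA-256, an assumption)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def count_collisions(buckets) -> int:
    """Count keys that land in a bucket some other key already occupies."""
    return sum(n - 1 for n in Counter(buckets).values())


# Hypothetical data: 1,000 record IDs stored in a table with 1,000 buckets.
keys = list(range(1000, 2000))
num_buckets = len(keys)

collisions = count_collisions(bucket_of(k, num_buckets) for k in keys)
print(f"{collisions} of {len(keys)} keys collided")
```

With as many buckets as keys, a hash that scatters keys at random is expected to leave a little over a third of the buckets empty (about 1/e of them), so roughly the same share of keys ends up colliding.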

To reduce collisions in certain cases, the research team trained machine learning models to serve as hash functions, according to an article from MIT News. Such models are created by running an algorithm on a dataset to capture its specific characteristics. The team found that these models were more computationally efficient than other types of hash functions.
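The idea of a model that captures a dataset's characteristics and then acts as a hash function can be sketched as follows. This is not the team's method: it is a simplified stand-in that approximates each key's rank by interpolating between a few stored keys ("knots") and uses that estimate as the bucket. The function names, the knot spacing, and the synthetic keys with nearly regular gaps are all assumptions, and the data is deliberately chosen to suit the model.

```python
import bisect
import hashlib
import random
from collections import Counter


def count_collisions(buckets):
    """Count keys that land in a bucket some other key already occupies."""
    return sum(n - 1 for n in Counter(buckets).values())


def random_bucket(key, num_buckets):
    """Baseline: a general-purpose hash (SHA-256) reduced to a bucket index."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def fit_knots(keys, step):
    """'Train' the model: keep every step-th sorted key together with its rank."""
    ordered = sorted(keys)
    ranks = list(range(0, len(ordered), step))
    if ranks[-1] != len(ordered) - 1:
        ranks.append(len(ordered) - 1)  # always keep the largest key
    return [ordered[r] for r in ranks], ranks


def learned_bucket(key, knot_keys, knot_ranks, num_keys, num_buckets):
    """Estimate the key's rank by linear interpolation and scale it to a bucket."""
    i = bisect.bisect_right(knot_keys, key) - 1
    i = max(0, min(i, len(knot_keys) - 2))
    x0, x1 = knot_keys[i], knot_keys[i + 1]
    r0, r1 = knot_ranks[i], knot_ranks[i + 1]
    est_rank = r0 + (key - x0) * (r1 - r0) / (x1 - x0)
    bucket = int(est_rank * num_buckets / num_keys)
    return max(0, min(bucket, num_buckets - 1))


# Hypothetical data: 1,000 increasing keys whose gaps are nearly regular.
rng = random.Random(0)
keys, current = [], 1000
for _ in range(1000):
    keys.append(current)
    current += rng.choice([9, 10, 11])

num_buckets = len(keys)
knot_keys, knot_ranks = fit_knots(keys, step=50)

random_collisions = count_collisions(random_bucket(k, num_buckets) for k in keys)
learned_collisions = count_collisions(
    learned_bucket(k, knot_keys, knot_ranks, len(keys), num_buckets) for k in keys
)
print(f"general-purpose hash: {random_collisions} collisions")
print(f"interpolating model:  {learned_collisions} collisions")
```

On keys this regular, the interpolating model should collide far less often than the general-purpose hash, at the cost of storing the knots and doing a lookup per key. On keys with erratic gaps the advantage can shrink or disappear, which fits the article's point that the approach helps in certain cases.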

“What we found in this work is that in some situations we can come up with a better tradeoff between the computation of the hash function and the collisions we will face. In these situations, the computation time for the hash function can be increased a bit, but at the same time its collisions can be reduced very significantly,” said Ibrahim Sabek, a postdoc in the MIT Data Systems Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL), in an MIT News article.

The research team wants to use machine learning models to design hash functions for other types of data and plans to explore learned hashing for databases in which data can be inserted or deleted, according to MIT News.

“We want to encourage the community to use machine learning inside more fundamental data structures and algorithms. Any kind of core data structure presents us with an opportunity to use machine learning to capture data properties and get better performance. There is still a lot we can explore,” Sabek said.

To read more about the technical specifics of this new hashing method, see Adam Zewe’s coverage for MIT News and the team’s scientific paper.


