May 17, 2019

Microsoft Applies Deep Learning to Vector Search

George Leopold

Since acquiring GitHub last June, Microsoft has sought to make good on its pledge to retain the project collaboration platform’s “developer-first ethos.” This week it turned over to GitHub an AI search tool as an open-source project.

The vector search approach, encapsulated in an algorithm called Space Partition Tree and Graph, attempts to address the reality that growing data volumes have made keyword search “brittle.” The algorithm takes advantage of deep learning models to search collections of information known as “vectors” in milliseconds.

“As deep learning became more prevalent, we applied it to some of these problems keyword search wasn’t working for,” said Rangan Majumder, Microsoft’s group program manager for Bing search and AI.

The goal is to deliver relevant results faster using vectors, or numerical representations of a data point, word or image pixel. Hence, Majumder said, his team built its vector search platform to execute search queries more efficiently.
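As a rough illustration of what a numerical representation buys a search engine, the sketch below compares a few vectors by cosine similarity, one common way to measure how close two vectors are. The three-dimensional values are invented for the example and do not come from any Microsoft model; real deep learning embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-picked so that related words point the same way.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}

sim_close = cosine_similarity(EMBEDDINGS["car"], EMBEDDINGS["automobile"])
sim_far = cosine_similarity(EMBEDDINGS["car"], EMBEDDINGS["banana"])

print(f"car vs automobile: {sim_close:.3f}")
print(f"car vs banana:     {sim_far:.3f}")
```

In this toy setup "car" and "automobile" share no characters a keyword matcher could use, yet their vectors score as near neighbors, while "banana" scores far away.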

Deep learning models were applied to those vectors to better understand and represent the intent of a search, addressing ambiguities such as words with different meanings. Researchers also discovered, after analyzing search logs, that queries were getting progressively longer, indicating frustration with conventional keyword searches and what Majumder suspects were users “trying to act like computers.”

That, he added, largely defeated the purpose of a search engine: a relevant answer delivered quickly.

Along with the application of deep learning models, the vector search initiative involved more than 150 billion pieces of data indexed by the search engine to improve the traditional matching of keywords. To harden the otherwise brittle approach, the indexed data included web page content, full queries and other media types along with characters and single words. The search engine then scanned the indexed vectors to come up with a relevant match.

While the “vectorizing” of search data and other media is not new, Microsoft (NASDAQ: MSFT) argues that it can only be scaled using massive search engines like Bing and Google (NASDAQ: GOOGL). Those search engines process billions of documents daily, and “the idea now is that we can represent these entries as vectors and search through this giant index of 100 billion-plus vectors to find the most related results in 5 milliseconds,” added Jeffrey Zhu, program manager on Microsoft’s Bing team.
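To make "find the most related results" concrete, here is a minimal sketch of the simplest form of vector search: score every indexed vector against the query and keep the closest few. This exhaustive scan is a baseline for illustration only, not the Space Partition Tree and Graph algorithm itself; comparing a query against every entry would be far too slow for an index of 100 billion-plus vectors, which is the problem an index structure is meant to solve. The document IDs, vectors, and function names are all hypothetical.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_nearest(query, index, k=2):
    """Return the k (doc_id, distance) pairs closest to the query, nearest first."""
    scored = ((doc_id, l2_distance(query, vec)) for doc_id, vec in index.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical two-dimensional index; a production index holds billions of entries.
INDEX = {
    "doc_a": [0.0, 0.0],
    "doc_b": [1.0, 1.0],
    "doc_c": [5.0, 5.0],
    "doc_d": [0.9, 1.2],
}

results = top_k_nearest([1.0, 1.0], INDEX, k=2)
for doc_id, distance in results:
    print(f"{doc_id}: {distance:.3f}")
```

The cost of this loop grows linearly with the size of the index, so a millisecond-scale response over billions of vectors depends on examining only a small fraction of them per query.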

Along with faster document searches, Microsoft said its GitHub code contribution could be used to scan audio snippets to identify and translate spoken language or to provide new apps that could help label images.

While consumer applications based on vector searches are likely to emerge first, the contribution of the Space Partition Tree and Graph algorithm to GitHub also is intended to expand the framework to broader, enterprise applications, the company said this week in a blog post.

Applications: Artificial Intelligence, Research Analytics

Technologies: Frameworks

Sectors: Other, Retail

Vendors: GitHub, Google, Microsoft

Tags: AI, deep learning, graph algorithms, keyword search, Rangan Majumder, vector search
