March 3, 2021

Microsoft Debuts AI-Based Semantic Search on Azure

Microsoft has been steadily upgrading its enterprise search capabilities, recently targeting previously “unsearchable” unstructured data in the form of PDFs, Word documents, text files and JPEGs. The result was Azure Cognitive Search, a cloud-based service with built-in AI capabilities.

The company (NASDAQ: MSFT) has since gone a step further, integrating into its query infrastructure new semantic search capabilities developed by its Bing search team. The enhanced search feature would allow app developers, for example, to apply semantic search tools to in-house or managed content.

Microsoft touts the cloud-based service as combining search relevance with improved development tools, including APIs and tools for scanning content in web, mobile and enterprise applications.

The new semantic search framework builds on Microsoft’s AI at Scale effort that addresses machine learning models and the infrastructure required to develop new AI applications. Semantic search is among them.

The cognitive search engine is based on the BM25 algorithm (as in “best match”), an industry standard for information retrieval via full-text, keyword-based searches.
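For readers unfamiliar with BM25, the scoring idea can be sketched in a few lines of Python. This is a minimal illustration of the textbook Okapi BM25 formula, not Azure's implementation: the whitespace tokenizer, the function name `bm25_scores` and the parameter values `k1=1.2` and `b=0.75` (commonly cited defaults) are assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook Okapi BM25.

    Assumptions for illustration: lowercase whitespace tokenization and
    commonly cited default values for k1 and b.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates (k1) and is normalized by length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["azure cognitive search", "bing search team", "transformer language models"]
print(bm25_scores("cognitive search", docs))
```

A document that matches both query terms scores highest, one that matches only the common term "search" scores lower, and a document with no matching terms scores zero. That last case is the limitation semantic ranking is meant to address.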

This week, Microsoft released semantic search features in public preview, including semantic ranking. The approach augments traditional keyword-based retrieval and ranking with a ranking algorithm using deep neural networks. The algorithm prioritizes search results according to how “meaningful” they are relative to the query.

Semantics-based ranking “is applied on top of the results returned by the BM25-based ranker,” Luis Cabrera-Cordon, group program manager for Azure Cognitive Search, explained in a blog post.
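The two-stage pattern Cabrera-Cordon describes, a keyword ranker followed by a second ranker applied to its results, can be sketched roughly as follows. Everything here is hypothetical: `keyword_overlap` stands in for BM25, `toy_semantic_score` and its hand-written synonym table stand in for a neural ranking model, and the `top_k` cutoff is an arbitrary value chosen for the example.

```python
def keyword_overlap(query, doc):
    """Stand-in for a first-stage keyword ranker: count shared terms."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


# Hypothetical synonym table; a real semantic ranker would use a learned model.
TOY_SYNONYMS = {"fix": {"repair"}}


def toy_semantic_score(query, doc):
    """Stand-in for a second-stage ranker: also credit listed synonyms."""
    doc_terms = set(doc.lower().split())
    score = 0.0
    for term in query.lower().split():
        variants = {term} | TOY_SYNONYMS.get(term, set())
        if variants & doc_terms:
            score += 1.0
    return score


def retrieve_then_rerank(query, docs, first_stage, second_stage, top_k=2):
    # Stage 1: cheap scoring over the whole collection, keep the top_k.
    candidates = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)[:top_k]
    # Stage 2: costlier scoring, applied only to the surviving candidates.
    return sorted(candidates, key=lambda d: second_stage(query, d), reverse=True)


docs = ["laptop bag sale", "how to repair a notebook or laptop", "garden tools"]
print(retrieve_then_rerank("fix laptop", docs, keyword_overlap, toy_semantic_score))
```

In this toy run the first stage cannot tell the two laptop documents apart, since each shares only one query term, while the second stage promotes the repair guide. Running the expensive scorer over a short candidate list rather than the full index is the usual reason for layering rankers this way.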

The resulting “semantic answers” are generated using an AI model that extracts key passages from the most relevant documents, then ranks them as the sought-after answer to a query. A passage deemed by the model to be the most likely to answer a question is promoted as a semantic answer, according to Cabrera-Cordon.
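A rough sketch of the passage-extraction idea: split the top documents into passages, score each passage against the query, and promote the best one. The article does not describe how Microsoft's model segments or scores passages, so the two-sentence passages and the word-overlap `overlap_score` below are purely illustrative assumptions, as are all function names.

```python
import re


def split_passages(text, sentences_per_passage=2):
    """Split text into passages of a few sentences each (assumed granularity)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_passage])
        for i in range(0, len(sentences), sentences_per_passage)
    ]


def overlap_score(query, passage):
    """Stand-in for an answer-scoring model: fraction of query words present."""
    q = set(re.findall(r"\w+", query.lower()))
    p = set(re.findall(r"\w+", passage.lower()))
    return len(q & p) / len(q) if q else 0.0


def best_answer(query, documents, score_passage, sentences_per_passage=2):
    """Return the highest-scoring passage across the given documents."""
    candidates = [
        passage
        for doc in documents
        for passage in split_passages(doc, sentences_per_passage)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: score_passage(query, p))


doc = (
    "Azure is a cloud platform. It hosts many services. "
    "BM25 is a ranking function. It is used for keyword search."
)
print(best_answer("what is bm25", [doc], overlap_score))
```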

Meanwhile, a “semantic caption” function extracts the most relevant section in a document. The passage can then be skimmed, making it easier “to triage the results briefly and go deeper,” he added.

The Azure semantic search feature integrates what Microsoft estimates are hundreds of development years and millions of dollars in compute time amassed by the company’s Bing search team. The researchers noted they have relied on recent developments in transformer-based language models to boost the quality of Bing search results.

As reported last month, market tracker Forrester identified transformer models as among its top five technologies for advancing AI technology. Microsoft’s AI framework is among the first to apply the approach to semantic search.

“These improvements allow a search engine to go beyond keyword matching to searching using the semantic meaning behind words and content,” Microsoft researchers added in a separate blog post explaining the science behind semantic search.

The AI-based approach involves pre-training large transformer deep learning models, fine-tuning them across a range of tasks, then distilling the models to a manageable size without sacrificing quality.
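The article does not say how Microsoft distills its models. One widely used formulation of knowledge distillation trains a small "student" model to match the temperature-softened output distribution of a large "teacher." The sketch below computes that loss with the standard library only; the function names and the temperature value are assumptions for illustration, not details from Microsoft.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from teacher to student on softened distributions.

    The temperature**2 factor is the conventional rescaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [3.5, 1.2, 0.4]))  # student close to teacher
print(distillation_loss(teacher, [0.5, 1.0, 4.0]))  # student far from teacher
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which gives the smaller model a training signal richer than the correct label alone.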

The resulting AI at Scale platform is promoted as a package of models, software and hardware available on Microsoft Azure.

So far, the semantic search engine only supports U.S. English. Microsoft said it expects to add other unspecified languages soon.

Register here for a public preview of the semantic search engine.
