April 16, 2019

Microsoft Expands Hadoop on Azure

Staff report

Microsoft has upgraded its open source analytics services running on Azure with a new version of Hadoop that incorporates enhancements to Apache Hive and other open source analytics frameworks.

The software giant (NASDAQ: MSFT), which completed its blockbuster acquisition of GitHub last October, continued its push into open source with this week’s release of Hadoop 3.0 on its Azure HDInsight analytics service. The release incorporates upgrades to Hive, including its data warehouse “connector” for Apache Spark, as well as new versions of HBase and Phoenix.

Microsoft also said Monday (April 15) that its cloud-based Hadoop service integrates Spark IO cache, HDInsight’s data caching service designed to accelerate workloads running on Apache Spark clusters.

The Hadoop upgrade is part of Microsoft’s ongoing effort to boost support for big data analytics applications on its Azure cloud. Microsoft is positioning its Hadoop 3.0 distribution as an “enterprise-ready service for open source analytics” that can run Spark, Kafka and other open source apps. Those tools can be used for data ingestion, preparation and management along with analytics, business intelligence and data visualization.

Microsoft promotes the Hadoop and other open source upgrades as a means of boosting the performance and availability of analytics applications running on its cloud. For instance, the addition of the latest version of Hive data warehouse software to its HDInsight service targets developers seeking to build “traditional database” applications on data lakes. The company touts that capability as helping to build big data applications that comply with data privacy rules.

Meanwhile, the Hive warehouse connector for Spark underscores how the analytics tools are merging. The link is intended to advance that integration to the query engine level, the company said.

The upgraded HBase non-relational distributed database is designed to reorganize data in a memstore data-write buffer, thereby boosting performance by reducing reads of data stored remotely in the cloud. The accompanying Apache Phoenix relational database engine, which supports transaction processing on Hadoop, brings “more visibility into queries,” Microsoft noted in a blog post. The upgrade also provides details about queries being run against a cluster.

The data caching service, now also available on Azure HDInsight, seeks to boost the performance of Spark, Hive and Apache Tez workloads. All can be run on Spark clusters.

HDInsight also supports a growing list of big data applications that includes Kyligence, the analytic processing engine based on Apache Kylin, and the WANDisco data-migration tool used with cloud-based Hadoop and Spark deployments.


Applications: Enterprise Analytics, Security, Visualization

Technologies: Cloud, Frameworks

Sectors: Financial Services, Other

Vendors: Microsoft

Tags: analytic cloud, azure, Azure HDInsights, Hadoop, HBase, Hive, Spark
