

Hortonworks today announced the acquisition of Sequence IQ, a Hungarian developer of cloud deployment automation tools for Hadoop. Hortonworks, which is hosting its Hadoop Summit Europe in Brussels this week, also shipped a maintenance release of its Hadoop distribution that includes Apache Ambari 2.0 and officially adds Apache Spark to the mix.
One of the big challenges customers face when implementing a modern Hadoop cluster is just getting it up and running. It’s not such an issue in small clusters that have fewer than 10 nodes. But as the cluster extends beyond 100 or 1,000 nodes, it quickly becomes too expensive and tedious to do the work manually.
A number of companies and open source projects are chasing this problem, including Sequence IQ. From its headquarters in Budapest, the company has developed a pair of products that fit into this space.
The first is Cloudbreak. This product makes it much easier for customers to provision and deploy Hadoop clusters, whether they run in the cloud (Amazon AWS, Microsoft Azure, and Google Cloud are supported), in Docker containers, or on bare metal. The software uses the “blueprints” functionality in Ambari, the Hadoop operations and management console, to enable users to easily duplicate customers’ Hadoop setups. Support for OpenStack is on the horizon.
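To make the blueprint idea concrete, here is a minimal Python sketch of what a blueprint-style cluster definition might look like. It assumes the general shape of an Ambari blueprint document (a `Blueprints` stack section plus a list of `host_groups`). The helper `make_blueprint`, the group names, and the component choices are illustrative assumptions, not Cloudbreak's own code.

```python
import json


def make_blueprint(stack_name, stack_version, host_groups):
    """Build a dict shaped like an Ambari blueprint document.

    host_groups is a list of (name, cardinality, components) tuples.
    This helper is hypothetical and exists only for illustration.
    """
    return {
        "Blueprints": {
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": name,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for name, cardinality, components in host_groups
        ],
    }


# Illustrative layout: one master group and one worker group.
blueprint = make_blueprint(
    "HDP",
    "2.2",
    [
        ("master", 1, ["NAMENODE", "RESOURCEMANAGER"]),
        ("worker", 3, ["DATANODE", "NODEMANAGER"]),
    ],
)

# The same JSON document could be reused to recreate an identical layout.
blueprint_json = json.dumps(blueprint, indent=2)
```

Because the layout is captured as data rather than as manual steps, the same definition can in principle be replayed against different environments, which is the duplication benefit described above.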
The second Sequence IQ product that caught Hortonworks’ eye is Periscope, which provides auto-scaling capabilities for Hadoop clusters. The software, which is also integrated with Ambari, analyzes various performance metrics for the cluster and automatically adds nodes as needed, based on policies set by the user.
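Policy-driven scaling of this kind can be sketched in a few lines of Python. This is only an illustration of the general idea, not Periscope's actual API: the `ScalingPolicy` class, its fields, the `nodes_to_add` function, and the metric name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """A user-defined rule (hypothetical): scale up when a metric exceeds a threshold."""
    metric: str        # name of the cluster metric to watch
    threshold: float   # scale up when the metric rises above this value
    step: int          # how many nodes to add per scaling action
    max_nodes: int     # upper bound on cluster size


def nodes_to_add(policy, metrics, current_nodes):
    """Return how many nodes the policy asks for, given current metrics."""
    value = metrics.get(policy.metric)
    if value is None or value <= policy.threshold:
        return 0  # metric missing or within limits: no scaling
    headroom = max(policy.max_nodes - current_nodes, 0)
    return min(policy.step, headroom)  # never exceed the size cap


# Illustrative policy: add 5 nodes when pending containers exceed 100.
policy = ScalingPolicy(metric="pending_containers", threshold=100, step=5, max_nodes=50)
```

With this policy, a 48-node cluster reporting 250 pending containers would be asked to add only 2 nodes, because the policy caps the cluster at 50.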
Cloudbreak and Periscope fit very nicely together and complement the power of Ambari to streamline the management of Hadoop clusters, says Hortonworks vice president of product management Tim Hall. “You can imagine folks spinning up a 1,000 node cluster. That’s a lot of work to do. Just doing one step on every machine is too many,” he says. “So whatever we can do to help streamline automation, that’s been the focus around Ambari.”
Ambari has matured over the past year and gained more powerful capabilities, including Blueprints extensibility mechanisms and the new Alerts framework in Ambari 2.0, which Periscope can use to trigger the addition of Hadoop nodes.
“We’re seeing evidence that the community is understanding how to take advantage of the Blueprint extensibility mechanisms and really leveraging it to the maximum capability, which is awesome,” Hall says. “The Sequence IQ team has been wonderful to collaborate and work with and we’re excited to bring them into the family.”
Hortonworks plans to contribute the Sequence IQ products back to the Hadoop community, either by donating the intellectual property (IP) to an existing Apache Software Foundation project–potentially Apache Ambari itself–or by incubating a new one, Hall says. In any event, the Sequence IQ capabilities will be added to a future release of Hortonworks’ Hadoop distribution, but only for customers who have purchased Enterprise Plus support subscriptions for HDP, he says.
Hortonworks is happy to add the Sequence IQ team to its staff, and is eager to use Sequence IQ’s existing business to help it create a “beachhead” in Europe, Hall says. Hortonworks will also look at ways to leverage some of the work Sequence IQ is doing to integrate Hadoop with OpenStack.
“Originally the integration between OpenStack and Hadoop was through the Sahara plugin. That worked great when Hadoop was MapReduce and HDFS as one project,” Hall says. “What we’ve seen now over time is, as there’s more and more componentry in the Hadoop ecosystem that’s being deployed as a platform, it was causing a ripple effect in terms of what needed to be exposed and managed through that Sahara plugin.”
Instead of creating dozens of individual integration points between OpenStack and Hadoop for all the various Hadoop processing engines a customer might use–Hive, HBase, Cassandra, Spark, MapReduce, Tez, etc.–the Sequence IQ team is taking a different approach, using Ambari’s blueprint API as the starting point for defining how Hadoop will deploy within OpenStack.
“From our perspective, the approach that the Sequence IQ team is taking makes more sense, given the way the Hadoop ecosystem is headed,” Hall says. “Part of it has to do with what is the right binding point into the OpenStack infrastructure for deployment of Hadoop…If it was just one Hadoop project being bound to Sahara and integrating into OpenStack, that makes some sense. But when you’re talking about 20 other components now that makes up the platform, exposing the details of all 20 of those things into the Sahara plugin didn’t make sense architecturally. The churn that comes along with what’s happening through inclusion or removal of various points, just meant that there was going to be a lot of investment in the Sahara plugin to keep up.”
Hortonworks also announced the first maintenance release for HDP 2.2, which shipped last fall. The new release includes Ambari 2.0, which Hortonworks unveiled last week, in addition to various other enhancements, including support for Apache Spark version 1.2.1. It’s the first time Hortonworks has officially supported the popular in-memory computing framework in its Hadoop distribution.
The company also proposed a new Apache project for the Data Governance Initiative that it started earlier this year. Apache Atlas, as the DGI project would be known, aims to help rein in some of the data chaos that occurs on Hadoop. Specifically, Atlas will provide data classification, centralized auditing, search and lineage capabilities for Hadoop, as well as security and policy engines.
The new HDP release also includes Apache Ranger, a security management tool for Hadoop that came out of Hortonworks’ previous acquisition of XA Secure. Hortonworks has also streamlined the deployment of the Kerberos authentication subsystem in HDP; it can now be up and running in just a few clicks, the company says.