March 14, 2017

MapR Extends Its Platform to the Edge

MapR Technologies today unveiled MapR Edge, an extension of its converged data platform that lets customers install MapR nodes practically anywhere they want.

The new offering runs on small portable PCs like the Intel NUC and delivers the full breadth of MapR’s capabilities, including Hadoop, NoSQL, and data streaming functionality, anywhere customers want, from autonomous cars driving rural highways to wellheads in the oil field.

“Things are getting more distributed, not less distributed,” says Jack Norris, MapR’s senior vice president of data and applications. “The benefits of having processing closer and closer to the data and being able to act faster where the action is happening, is a big driver.”

MapR Edge pushes data collection and processing capabilities further away from the big centralized clusters that have so far largely defined big data platforms like Hadoop, NoSQL databases, and streaming data platforms like Kafka. But instead of creating a separate system that must be configured and managed on its own, MapR decided to make it all part of the family.

“This is not a separate standalone product that just has data collection,” Norris tells Datanami. “It’s actually a full extension of the cluster, so [it’s providing] centralized management, centralized security. [It has] the ability to replicate, the ability to mirror, the ability to handle occasional connected devices with streams. It’s all built into the MapR Edge.”

The new offering fits into MapR’s strategy to help customers build Internet of Things (IoT) applications. To that end, it serves several functions.

First, it serves as the first waypoint for data right after it’s generated. As raw data flows off wellheads or MRI machines, MapR Edge collects it and performs the first round of processing. The customer then has the choice to upload only the aggregated data to the core MapR cluster for further analysis or archiving. This can help ease bandwidth constraints as well as data privacy and security concerns.
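The idea of aggregating at the edge and uploading only summaries can be illustrated with a short Python sketch. It does not use any MapR API; the `Summary` record, the `summarize` function, and the 60-second window are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass
class Summary:
    """Aggregate for one time window (hypothetical record layout)."""
    window_start: int
    count: int
    minimum: float
    maximum: float
    average: float


def summarize(readings: Iterable[Tuple[int, float]],
              window_seconds: int = 60) -> List[Summary]:
    """Group (timestamp, value) readings into fixed windows and summarize each."""
    buckets: Dict[int, List[float]] = {}
    for timestamp, value in readings:
        # Align each reading to the start of its window.
        start = timestamp - (timestamp % window_seconds)
        buckets.setdefault(start, []).append(value)
    return [
        Summary(start, len(vals), min(vals), max(vals), mean(vals))
        for start, vals in sorted(buckets.items())
    ]


# Five raw sensor readings collapse into two summaries for upload.
raw = [(0, 10.0), (30, 14.0), (59, 12.0), (60, 20.0), (90, 22.0)]
summaries = summarize(raw)
```

In this sketch only `summaries` would leave the edge device, while the raw readings stay local, which is where the bandwidth and privacy savings would come from.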

But MapR Edge goes beyond that and pushes machine intelligence out into the field. For example, an oil exploration company with thousands of wellheads may have used machine learning algorithms to predict when equipment is about to fail. That signature of equipment failure can be pushed out to the MapR Edge to score streams of live data in real time.
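As a rough illustration of scoring live data against a centrally trained model, the Python sketch below loads a small set of model parameters and flags readings that deviate strongly from them. Everything here is an assumption made for the example: the JSON payload, the simple z-score rule standing in for a real failure signature, and the names `load_model`, `score`, and `alerts`. None of it reflects MapR’s actual interfaces.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical model payload, as it might be pushed from the core cluster.
MODEL_JSON = '{"mean": 50.0, "std": 5.0, "threshold": 3.0}'


def load_model(payload: str) -> Dict[str, float]:
    """Parse model parameters received from the central cluster."""
    return json.loads(payload)


def score(model: Dict[str, float], value: float) -> float:
    """Return the absolute z-score of a reading under the model."""
    return abs(value - model["mean"]) / model["std"]


def alerts(model: Dict[str, float],
           stream: Iterable[Tuple[int, float]]) -> Iterator[Tuple[int, float, float]]:
    """Yield (timestamp, value, score) for readings above the threshold."""
    for timestamp, value in stream:
        s = score(model, value)
        if s > model["threshold"]:
            yield timestamp, value, s


model = load_model(MODEL_JSON)
live = [(1, 51.0), (2, 49.5), (3, 70.0), (4, 52.0)]
flagged = list(alerts(model, live))
```

The split mirrors the division of labor described above: the parameters are learned from data pooled across many sites, while the scoring loop needs only the local stream.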

“This whole concept of act locally, learn globally is really what’s driving some of the closed loop processes,” Norris says. “Each individual unit is only seeing the data from that particular wellhead. But when you’ve got thousands of those throughout the world and you have data that’s been collected over a period of time, the ability to detect infrequently occurring events, the ability to detect anomalies, is much better understood on a global basis.”

As a full-fledged member of a MapR cluster, MapR Edge can run any big data processing engine supported by the Hadoop distributor, including Spark, Drill, Hive, MapReduce, and others. The software can also function as a node of MapR’s NoSQL database, called MapR-DB, and as a node in MapR’s Kafka-compatible stream processing system, called MapR Streams.

MapR Edge can run on the Intel NUC, a miniature PC that measures only 4.5 inches by 4.5 inches. The minimum configuration calls for a cluster of three Intel NUCs, each configured with 16 GB of RAM and 64 GB of solid-state storage. The maximum configuration is a cluster of five MapR Edges and a total of 50 TB of storage.
