Five Ways Big Data Analytics is Transforming Organizations
Organizations are no longer content to wait until tomorrow to learn what happened two minutes ago, nor can they afford to. Many operational and mission-critical decisions depend on how quickly and accurately they can analyze the growing volume of data streaming into the organization. As the architectural landscape becomes more complex, big data analytics professionals must find a way to wrest insights from incompatible systems, despite inconsistent data. Adding more capacity, hardware and personnel is a poor workaround: it adds considerable cost and doesn’t necessarily translate into better, faster insights.
As organizations struggle to keep pace with rapidly evolving technology, five trends are emerging within big data analytics that empower them to handle the volume, velocity, and location aspects of data so they can use it strategically.
1. GPUs to Handle Big Data Volume and Velocity
Big data is pouring off real-time sources such as sensors and devices (cell phones, telematics data from cars, social media streams, server logs, and clickstreams). Much of this data requires immediate analysis to yield valuable insights while the information is still relevant. Utility companies, for example, are gathering real-time insights from smart meters to continuously balance the grid, prevent service interruptions, and reduce emissions.
Legacy systems built on CPU-only architectures with low parallelism struggle to keep pace with the volume and velocity of data. As a result, they force IT to keep adding hardware and hiring data engineers to pre-aggregate or index the data, just to fit it into mainstream tools. This slows down the whole data pipeline and limits the type and number of insights available from the data.
Graphics processing units (GPUs), in combination with traditional CPU architectures, are now accelerating a new breed of high-performance database engines and visual analytics systems. These GPU-based solutions enable massively parallel processing, and can complete in milliseconds a query that would take hours on a legacy platform.
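To make the cost of pre-aggregation concrete, here is a minimal Python sketch. The smart-meter readings and the helper names (pre_aggregate_hourly, peak_from_raw, peak_from_hourly) are hypothetical and chosen only for illustration. It shows how an hourly rollup, built so that the data fits a slower tool, can hide a spike that is still visible in the raw readings.

```python
from collections import defaultdict

# Hypothetical smart-meter readings: (meter_id, minute_of_day, kilowatts)
readings = [
    ("m1", 0, 1.0),
    ("m1", 30, 5.0),
    ("m1", 60, 2.0),
    ("m2", 10, 3.0),
    ("m2", 70, 4.0),
]


def pre_aggregate_hourly(rows):
    """Roll raw readings up to one average per (meter, hour)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for meter, minute, kw in rows:
        key = (meter, minute // 60)
        sums[key][0] += kw
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def peak_from_raw(rows, meter):
    """Highest single reading for a meter, computed on the raw data."""
    return max(kw for m, _, kw in rows if m == meter)


def peak_from_hourly(table, meter):
    """Highest hourly average for a meter; all the rollup can offer."""
    return max(avg for (m, _), avg in table.items() if m == meter)


hourly = pre_aggregate_hourly(readings)
print(peak_from_raw(readings, "m1"))    # 5.0: the spike is visible
print(peak_from_hourly(hourly, "m1"))   # 3.0: the spike is averaged away
```

Once only the rollup is kept, questions the data engineers did not anticipate, such as the true peak load, can no longer be answered without going back to the raw data.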
2. Operational Agility
In today’s high-speed world, most analysts do not have the familiar luxury of getting up for a cup of coffee while they wait for their query to finish or their dashboard to refresh. That query-then-wait experience is not only frustrating, it carries a real business cost too: when the analytics experience is too slow, users explore less and find fewer insights.
The trend is toward providing users with an extreme analytics experience, one with zero discernible latency when interacting with the data. Users start to love doing analytics again, and the organization benefits by moving toward a continuous analysis mode, versus one with daily, weekly or monthly analysis cycles.
3. Shifting Roles for Unified Understanding
BI initiatives will expand as more people within the organization discover the importance of the data and find strategic and competitive value in its application. As technology improves, the value of the data is extending beyond the IT department, as more teams within the organization seek a unified understanding of the most important operational challenges.
With better access to intuitive self-service platforms, job roles are shifting too (see point 5). The analyst role, for example, which has focused on the data warehouse and BI, is blurring with the data scientist role, which has traditionally focused on statistics and machine learning.
As roles and use cases evolve and expand, end-to-end platforms improve the efficiency of marshalling and moving data between different siloed solutions. An integrated platform provides higher performance and is often easier to use, as it typically works out of the box, with integration already baked into the product.
4. The Increasing Importance of Location Insights
Sensors and devices in mobile objects such as phones, cars, trucks, and RFID-tagged shipments, along with social media and satellites, capture and transmit data on many important processes, and most of that data today is enriched with location and time attributes. Being able to visualize data within the context of time and location, and then display it on interactive maps, makes it easier to recognize patterns, understand complex historical relationships, and anticipate future events. Unlike traditional Geographic Information Systems (GIS), most location-based geoanalytics are lightweight, involving simple geospatial filtering and joins, and visualization with cross-filtering between traditional BI visual elements like bar, line and pie charts.
Geospatial analytics is difficult at scale because the computations occur in two or three dimensions and are compute-intensive. Visualizing geospatial data with granular detail can overwhelm both analytics servers and the network connections between server and client. Traditional GIS platforms can’t handle the mainstream analytics components, and traditional BI platforms can’t handle the lightweight geospatial analytics needed; neither is fast enough to be interactive at scale.
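As a rough illustration of what "simple geospatial filtering and joins" can look like, the Python sketch below tags hypothetical vehicle pings with the rectangular region that contains them, then counts pings per region for a chosen time window, the kind of result a cross-filtered bar chart might display. The data, the region boxes, and the function names are all assumptions made for this example; real systems typically use polygon boundaries and spatial indexes or GPU acceleration in place of this linear scan.

```python
# Hypothetical vehicle pings: (vehicle_id, longitude, latitude, hour_of_day)
pings = [
    ("truck-1", -71.06, 42.36, 9),
    ("truck-2", -71.10, 42.37, 9),
    ("truck-3", -73.99, 40.73, 10),
    ("truck-1", -71.05, 42.35, 17),
]

# Hypothetical rectangular regions: name -> (min_lon, min_lat, max_lon, max_lat)
regions = {
    "boston": (-71.20, 42.30, -70.95, 42.42),
    "new_york": (-74.05, 40.65, -73.85, 40.85),
}


def in_box(lon, lat, box):
    """Geospatial filter: is the point inside the bounding box?"""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def spatial_join(rows, boxes):
    """Tag each ping with the first region that contains it, or None."""
    joined = []
    for vehicle, lon, lat, hour in rows:
        region = next(
            (name for name, box in boxes.items() if in_box(lon, lat, box)),
            None,
        )
        joined.append((vehicle, region, hour))
    return joined


def count_by_region(joined, hour_range=None):
    """Count pings per region, optionally cross-filtered by hour of day."""
    counts = {}
    for _, region, hour in joined:
        if hour_range and not (hour_range[0] <= hour <= hour_range[1]):
            continue
        counts[region] = counts.get(region, 0) + 1
    return counts


joined = spatial_join(pings, regions)
all_day = count_by_region(joined)
morning = count_by_region(joined, hour_range=(8, 12))
print(all_day)   # {'boston': 3, 'new_york': 1}
print(morning)   # {'boston': 2, 'new_york': 1}
```

Each step here is cheap for four rows, but the join compares every point with every region, which is why the same workload becomes compute-intensive at billions of rows.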
Back to point No. 1: GPUs change all that. They provide the computational horsepower required to do complex geospatial analytics queries, alongside traditional analytic SQL. At the same time, GPUs offer the rendering capabilities needed to visualize granular and large-scale geospatial data and to make the experience interactive.
5. Demand for Self-Service Analytics Platforms
As data becomes more available and analytic literacy more pervasive, organizations are focusing more on self-service analytics platforms that enable users to access the data independently. New ways for business users to ask and answer their own questions with data, and then share their insights through visualizations, are changing the way that teams work together. The second wave of BI vendors claimed that they had ushered in a new era of “self-service analytics,” in which end users could finally be free of relying on IT to prepare and analyze their data. They offered users the ability to build their own dashboards, and in certain limited cases even prepare their own small datasets. However, very large datasets stored in the data warehouse still required a ticket to the IT team, which could take days or weeks to fulfill.
New platforms, particularly those that bridge the previously separate domains of data warehousing, GIS and BI, allow users to more freely manage, query and visualize the data. High-performance GPU-based systems eliminate the need for indexing, pre-aggregation, and downsampling of the data, making data loading and preparation a far simpler and faster process.
Data alone will not help us to make smarter decisions. These five areas of transformation, and the new applications that they give rise to, can help organizations achieve their aspirations to drive decisions based on big data analytics. Those that make the shift will be tomorrow’s leaders of industry and government.
About the author: Todd Mostak is the CEO and co-founder of MapD Technologies. Todd originally conceived the idea for MapD while doing graduate research at Harvard. He later joined MIT’s CSAIL as a research fellow, under the supervision of Sam Madden and Turing Award winner Michael Stonebraker, before founding MapD.