How Big Data Tech Is Bridging the Analytical-Transactional Divide
Time was, production IT systems lived in two main camps: analytical and transactional. Transactional systems (OLTP) ran in real time on relational databases, and analytical (OLAP) systems ran in batch mode on more exotic gear. But now, the rise of big data and NoSQL technologies is eliminating the gap and blurring the lines between what is analytical and what is transactional.
The OLTP-OLAP boundary had a good reason for existing in the past. If you were a traditional bricks-and-mortar type of company (such as a retail chain, a trucking outfit, or a manufacturer), you wouldn’t want to slow down your production ERP system by running a big reporting job at the same time. Instead, the massive SQL queries were run during nights and weekends, or shunted over to a dedicated data warehouse where they could run 24/7.
But times are changing. The advent of more powerful hardware and cheaper storage on the one hand, and more sophisticated applications and new business models on the other, are leading some to question whether the old dividing line still plays a useful role.
It does not, according to Nikita Ivanov, founder & CTO of GridGain Systems, developer of an open source in-memory data grid, who says the political rivalry that separated analytics and transactional systems is doing more harm than good.
“Companies are starting to realize that there are no more tangible reasons to separate the two paradigms so starkly,” Ivanov says. “We have a number of proof accounts going and projects in the pipeline where we’re dealing with customers that are trying to combine transactional and analytical process in one system.”
GridGain’s in-memory data grid software, which was released to the open source community last fall as Apache Ignite, can be used to accelerate just about any kind of application, including transactional systems built on standard SQL databases or newer NoSQL databases, and analytics apps designed for Hadoop. As long as there’s enough RAM available to store all the data and the app is not CPU-bound, GridGain’s software can boost performance significantly, sometimes by several orders of magnitude.
The issue is that today’s newer apps, especially consumer-facing Web and mobile apps, often do not fit neatly into either category and combine elements of both disciplines. The best example may be an ecommerce application that obviously has a transactional aspect to it, but can also benefit from analytical elements, such as a recommendation system. Companies that have figured out how to serve personalized recommendations in real time have a much better chance of boosting their revenue.
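To make the ecommerce example concrete, here is a minimal sketch of one in-memory store handling both the transactional write (placing an order) and the analytical read (a co-purchase recommendation). It is an illustration only: Python's built-in sqlite3 module stands in for an in-memory data platform, this is not GridGain's or Apache Ignite's API, and the table, function names, and sample products are all hypothetical.

```python
import sqlite3


def open_store():
    """Create a hypothetical in-memory store (SQLite stands in for a data grid)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_items (order_id INTEGER, customer TEXT, product TEXT)"
    )
    return conn


def place_order(conn, order_id, customer, products):
    """Transactional side: every item in the order commits, or none does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?)",
            [(order_id, customer, product) for product in products],
        )


def recommend(conn, product, limit=3):
    """Analytical side: products most often bought in the same order."""
    rows = conn.execute(
        """
        SELECT other.product, COUNT(*) AS times
        FROM order_items AS seed
        JOIN order_items AS other
          ON other.order_id = seed.order_id
         AND other.product <> seed.product
        WHERE seed.product = ?
        GROUP BY other.product
        ORDER BY times DESC, other.product ASC
        LIMIT ?
        """,
        (product, limit),
    ).fetchall()
    return [name for name, _ in rows]


store = open_store()
place_order(store, 1, "ann", ["tent", "stove"])
place_order(store, 2, "bob", ["tent", "stove", "lantern"])
place_order(store, 3, "cy", ["tent", "stove"])
print(recommend(store, "tent"))  # ['stove', 'lantern']
```

The recommendation query reads the rows the order transactions just wrote, with no ETL step or separate warehouse in between. A production system would still have to handle concurrency, scale, and data that does not fit in RAM, none of which this sketch attempts.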
In practically every industry, there are concrete examples of how analytic elements, such as personalized recommendations or pattern detection, can boost transactions. But alas, that old political line still pushes companies to keep the two separate, says Ivanov.
“That’s where there’s an absolutely systemic inefficiency in a majority of businesses, for no reasons other than historical and political reasons,” he tells Datanami. “Midsize and especially large companies have these massive walls between the transactional side of the business that operates the business, and analytical and business intelligence. There’s absolutely no reason why it should be this way.”
The same old wall is evident to Emil Eifrem, the CEO and co-founder of graph database maker Neo Technology. Graph databases are rapidly gaining steam for a number of new workloads that sit at the traditional junction of transactional versus analytic computing.
“My theory is that analytics was never defined actually, and the way we thought, as an industry, about analytics was basically all the things I can’t do in real time on a relational database,” Eifrem says. “So there’s a bunch of queries that I can’t run and special reports that I can’t do on my OLTP database that’s serving my application and directly serving my customers in real time, so I’m going to do some ETL over into this other data warehouse thing where I’m going to do some cubes, some star schema, some data marts, whatever you want to call them, and I’m going to crunch the data in some other ways and I’m going to get reports. And the operations that I was not able to do in real time, is over time what we learned to think of as analytic operations.”
According to Eifrem, new architectures like in-memory graph databases and new data types such as JSON are converging so that, for the first time ever, the shape of the data matches the shape of the database. That’s enabling organizations to do things that previously were not possible.
“All of a sudden, you have a bunch of new operations that you didn’t used to be able to do in real time that you can now do in real time,” Eifrem tells Datanami. “You typically think of those operations as analytics, whereas now all of a sudden you do them in real time. And that is sort of the root cause for this confusion” over the divide between transactional and analytics.
To be sure, there are limits to what graph databases can do. They’re not magical silver bullets, as Eifrem points out, and if you’re trying to analyze every node in a graph that has billions of data points, you’d best expect that to run for a while.
But for many types of analytic workloads that were previously run in batch, the advent of new data architectures, including SQL-on-Hadoop, NoSQL, and NewSQL databases, may well lead us to bid farewell to the old line keeping transactions and analytics separated, and say hello to the nirvana of big data: real-time analytics.