Probing Data’s Middle Ground
In consolidating data, the rise of data lakes has exposed a gap between where data is stored (databases, files and other repositories) and the ability to view, analyze and leverage that data.
Targeting that middle ground between where data resides and how it is viewed has emerged as a growth market for data analytics vendors. Among them is Cambridge Semantics, the company behind the Anzo data lake. In a recent white paper, it promotes its “dark” data discovery and integration platform, which uses a semantic approach to blend enterprise data from multiple sources to develop machine learning and business intelligence tools.
“The data lake movement has made clear that the consolidation of data, in and of itself, will not solve critical business needs,” the white paper notes. “The lacking component has been the appropriate array of lenses through which to view and better understand that data.”
Among Anzo’s selling points is its ability to ingest multiple data types stored in a variety of batch and streaming platforms. Those range from databases and Hadoop data lakes to in-house and cloud repositories.
Once organized, metadata and other data sources are ingested and mapped to a master graph.
The result is a “data fabric,” defined as a complete set of enterprise data regardless of where it is stored or queried. According to Cambridge Semantics, the fabric is intended to allow users to add layers to handle data preparation and semantic model adjustment.
The resulting “AnzoGraph” allows querying of in-memory knowledge graphs. The engine runs concurrent interactive or batch queries, whether ad hoc or online analytical processing (OLAP), across data resources, providing what the company claims is in-memory performance at scale.
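To make the idea of an OLAP-style query over an in-memory graph concrete, here is a minimal Python sketch. It is not AnzoGraph's API or query language; the triples, the predicate names and the helper functions are all hypothetical, and each predicate is assumed to hold a single value per subject. It simply shows a graph traversal (order to customer to region) combined with an aggregation.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; illustrative only,
# not data or identifiers from the Anzo platform.
TRIPLES = [
    ("order:1", "placedBy", "customer:a"),
    ("order:1", "amount", 120.0),
    ("order:2", "placedBy", "customer:a"),
    ("order:2", "amount", 80.0),
    ("order:3", "placedBy", "customer:b"),
    ("order:3", "amount", 50.0),
    ("customer:a", "region", "EMEA"),
    ("customer:b", "region", "APAC"),
]


def index_by_predicate(triples):
    """Build predicate -> {subject: object}, assuming one value per subject."""
    index = defaultdict(dict)
    for subject, predicate, obj in triples:
        index[predicate][subject] = obj
    return index


def total_amount_by_region(triples):
    """Follow order -> customer -> region edges and sum order amounts."""
    index = index_by_predicate(triples)
    totals = defaultdict(float)
    for order, customer in index["placedBy"].items():
        region = index["region"][customer]
        totals[region] += index["amount"][order]
    return dict(totals)


if __name__ == "__main__":
    print(total_amount_by_region(TRIPLES))
```

A production graph engine would distribute both the traversal and the aggregation across many nodes; the sketch only illustrates the shape of such a query.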
Indeed, Cambridge Semantics has been a leading proponent of the graph OLAP, or GOLAP, data warehouse, so called because it provides a parallel graph analytics capability designed to accelerate the extraction of insights.
“The goal of the Anzo platform is not to boil the data ocean and accurately chart its every trough and trench,” the white paper explains. Instead, “it is to provide a mapping capability for the corporate data resource that can create rich data ‘products’ that can reap dividends, particularly to analytics users.”
Anzo is also designed to provide an “overlay” of data resources that allows users to describe structured and unstructured data that then becomes part of a knowledge graph. The mappings “can lift the data and describe it as a graph,” said Sean Martin, founder and CTO of Boston-based Cambridge Semantics.
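As a rough illustration of what “lifting” tabular data into a graph can look like, the following Python sketch turns rows into subject-predicate-object triples using a column-to-predicate mapping. It is a sketch under stated assumptions, not Cambridge Semantics' mapping format: the function `lift_rows`, the sample rows and the predicate names are all hypothetical.

```python
def lift_rows(rows, key_column, prefix, column_to_predicate):
    """Describe each row as triples: the key column becomes the subject,
    and each mapped column becomes a (subject, predicate, value) edge."""
    triples = []
    for row in rows:
        subject = f"{prefix}:{row[key_column]}"
        for column, predicate in column_to_predicate.items():
            value = row.get(column)
            if value is not None:  # skip missing cells instead of emitting empty edges
                triples.append((subject, predicate, value))
    return triples


# Hypothetical source rows and mapping, for illustration only.
ROWS = [
    {"id": 1, "name": "Acme", "city": "Boston"},
    {"id": 2, "name": "Globex", "city": None},
]
MAPPING = {"name": "hasName", "city": "locatedIn"}

if __name__ == "__main__":
    for triple in lift_rows(ROWS, "id", "company", MAPPING):
        print(triple)
```

The source rows are left untouched; the mapping is what describes them as a graph, which is the sense of an “overlay” here.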