Can We Say Goodbye to Data Silos?
The biggest headache in big data, arguably, is the proliferation of data silos and the need to integrate them. We spend billions of dollars and millions of person-hours stitching together disparate data sets to accomplish business goals, yet it never seems to be enough. But according to Cinchy CEO and co-founder Dan DeMers, we can avoid all the muss and fuss and achieve higher data productivity by adopting an architecture that’s radical in its simplicity.
DeMers makes a compelling case that data integration is a thorn in the side of enterprises, one that threatens to get bigger and more painful with time. With no end in sight for the ongoing data explosion, the default idea that each application should own its own data represents a long-term threat to productivity, he says.
“When you’re building an application, you generally stand up an app-specific data store,” DeMers says. “You build an app-specific data model, and you implement a whole bunch of code to do basic data management, data persistence and transactions, and you have to worry about these things. Not only that, you have to figure out how you integrate data from other applications. How do I share data from my application to other applications that need it?
“This is a huge burden for every application that I ever need to create or change,” he continues. “Can’t I simply rely on another layer of abstraction? That’s the idea of Dataware.”
Dataware is the name of Cinchy’s platform, which has been adopted by more than 100 enterprises. It’s an abstraction layer that basically relieves other software from having to worry about how and where it’s storing data, according to DeMers.
“It’s essentially putting the apps in their place such that they no longer own data, they no longer trap data and silo data,” he says.
The Dataware concept originated in the 1980s, when computer scientists first started thinking about decoupling physical data storage from applications, DeMers says.
In 2011, Derek McAuley, Richard Mortier, and James Goulding published “The Dataware Manifesto,” which described a “logical federation of data sources” to deliver a consumer-centric view of data in a service-oriented architecture (SOA) environment.
However, the Dataware concept languished, as the technology wasn’t quite ready and data volumes were not as great. But conditions have changed, and the idea is gaining traction once again, DeMers says.
“It’s only now becoming both possible as well as increasingly mandatory because of the growth in the number of applications,” he says.
Fundamental Data Rethink
The Dataware platform requires a fundamental rethink of the relationship between data and applications.
As mentioned earlier, applications no longer “own” their own data in the Dataware version of the world. Developers no longer build the application atop a dedicated database, distributed file system, or object storage system. Instead, the applications are designed to access a single common data store defined by Cinchy’s Dataware platform.
“Think of this as a network where the data is connected, just like how the Internet is a set of services that are all linked and connected,” DeMers explains. “It’s the same idea. So it’s the idea that connecting and linking shouldn’t require you to create a copy. If I create a web page and I have a hyperlink to one of your web pages, I don’t have to store a copy of your web page. I can point to it. That’s the idea. When my data references your data, it should be a pointer. It should not be a copy.”
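The pointer-versus-copy distinction DeMers describes can be illustrated with a minimal Python sketch. It is not Cinchy’s implementation; the `Record` and `Ref` classes, the `demo` function, and the `customer:1` key are all hypothetical names invented for illustration. The sketch assumes a single shared store (here just a dictionary) and shows that a copy goes stale when the original changes, while a reference does not.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A single record held in one place (hypothetical example type)."""
    value: str


@dataclass
class Ref:
    """A pointer to a record in a shared store, identified by key."""
    store: dict
    key: str

    def resolve(self) -> Record:
        # Look the record up at read time; nothing is duplicated.
        return self.store[self.key]


def demo() -> tuple:
    # Assumption: a plain dict stands in for the shared data store.
    store = {"customer:1": Record("old address")}

    copied = Record(store["customer:1"].value)  # silo-style copy
    linked = Ref(store, "customer:1")           # pointer-style link

    # The owning application updates the record in place.
    store["customer:1"].value = "new address"

    # The copy still holds the old value; the link sees the new one.
    return copied.value, linked.resolve().value


print(demo())
```

In this toy version, the copied record would need its own synchronization job to catch up, which is the integration work the pointer approach is meant to avoid.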
Another way to think about the Dataware strategy is to think about how devices use the Wi-Fi network in your home. “You don’t connect the devices to each other,” DeMers says. “You connect them each to one thing, which is your network. And that network is what facilitates the ability for each of those applications to interact, both view and change, data from the other applications, where the only constraint is the access. But now the access can be universally controlled.”
There are still different physical stores of data in the Dataware world, because the laws of physics still apply. Data does need to be moved to specific geographies to ensure fast, low-latency experiences for users. But those data stores would not be data silos, because they would be connected and managed by the Cinchy Dataware platform, DeMers says.
“If you’re building applications, you’re interacting with the Dataware layer for persistence, for transactions, for basically accessing data, storing data, changing data, and you’re doing that in lieu of standing up an app-specific data store,” DeMers says.
“When they’re building applications, they’re using Dataware via whichever protocol and format that they prefer,” he continues. “If it’s a database, they can do it as a REST endpoint, and they can basically access the information and perform transactions and alter the information. They can do basic CRUD operations. They can do complex transactions.”
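To make the idea concrete, here is a minimal Python sketch of applications performing CRUD operations against one shared layer, with access controlled in a single place. It is an in-memory toy under stated assumptions, not Cinchy’s API: the `DataLayer` class, its method names, the `read` and `write` permissions, and the `billing` and `reporting` application names are all hypothetical.

```python
class DataLayer:
    """Hypothetical in-memory stand-in for a shared data layer."""

    def __init__(self) -> None:
        self._rows = {}       # key -> row (a dict of fields)
        self._grants = set()  # (app, permission) pairs

    def grant(self, app: str, permission: str) -> None:
        # Access is defined once, centrally, rather than per application.
        self._grants.add((app, permission))

    def _check(self, app: str, permission: str) -> None:
        if (app, permission) not in self._grants:
            raise PermissionError(f"{app} lacks {permission} access")

    def create(self, app: str, key: str, row: dict) -> None:
        self._check(app, "write")
        if key in self._rows:
            raise KeyError(f"{key} already exists")
        self._rows[key] = dict(row)

    def read(self, app: str, key: str) -> dict:
        self._check(app, "read")
        return dict(self._rows[key])  # return a snapshot of the row

    def update(self, app: str, key: str, changes: dict) -> None:
        self._check(app, "write")
        self._rows[key].update(changes)

    def delete(self, app: str, key: str) -> None:
        self._check(app, "write")
        del self._rows[key]


# Two applications share one layer instead of each owning a data store.
layer = DataLayer()
layer.grant("billing", "read")
layer.grant("billing", "write")
layer.grant("reporting", "read")

layer.create("billing", "invoice:1", {"total": 100})
layer.update("billing", "invoice:1", {"total": 120})
print(layer.read("reporting", "invoice:1"))
```

Here the reporting application reads the invoice that billing wrote without any copy or pipeline between them, and an attempt by reporting to write would be rejected by the same central check.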
Dataware automatically handles the sharding of data to ensure it’s safe and the geolocation of data to ensure it’s accessible. It also maintains availability, redundancy, backup, and versioning.
“It is that universal data layer,” DeMers says. “Think of it as really doing what Google Drive does for files, but doing it for data on behalf of the applications.”
Beginning of the End of Data Integration
By DeMers’ own admission, the Dataware concept is a long play. It won’t radically transform enterprise data architectures overnight. Organizations that have already spent millions to build data lakes and data warehouses, and to keep them brimming with the latest data via data pipelines or ETL tools, are not going to rip it all out and replace it with a new architecture.
DeMers understands this. “It’s not the instant elimination of all integrations,” he says. “But it’s the beginning of the end of integration.”
Cinchy has more than 100 paying customers, including some sizable firms like PwC, TD Bank, and the YMCA. Companies that make the decision to adopt the Dataware platform do so with the understanding that it will take years to fully realize the benefits. It takes a philosophical commitment, he says, and a willingness to invest now with the promise of a bigger payoff down the road.
“When organizations decide to go forward with a Dataware approach, they’re not doing so for an individual business outcome or to build an individual business capability,” DeMers says. “It is part of their organizational strategy, where they’re essentially changing how they think about the role of data and moving data to be truly in the center, where they’re philosophically aligned with the idea of treating data like money.”
Just as people expect to be able to take their money out of one bank and deposit it in another, the same should apply to data, DeMers says. When the data is easily accessed by any application, the details about how it’s stored and managed are no longer material to decisions about what to do with the data.
Cinchy’s goal is to help organizations transform themselves into data-centric organizations. DeMers makes a distinction between being data-centric and being data-driven, which borrows heavily from Dave McComb’s work in the area.
“I see Dataware as very much a unified approach to fixing the root issue that is the very reason why we have things like data warehouse and data marts and data meshes and data fabrics and master data management and all these different technologies that each address a symptom, whereas by thinking differently about how you design applications in the future, you won’t need those workarounds,” he says.
Younger, smaller companies that haven’t yet built out elaborate data integration structures have the most to gain from adopting a Dataware architecture, DeMers says.
“Some of our customers that already have existing warehouses and existing lakes and lake houses etc. We would not really suggest that they look to replace that because it’s already in place. It’s already hydrated,” he says. “If you were a smaller organization or an organization that was looking to rethink your analytics strategy, there’s your opportunity to leapfrog and essentially intercept the effort to create a central consolidated representation of data, but have it be not limited to solving for analytical and reporting use cases, to have it be such that it can even power operational use cases.”