Data Warehouse Modernization and the Journey to the Cloud
To say that organizations today are facing a complex data landscape is really an understatement. Data exists in on-premises systems and in the cloud; data is used across applications and accessed across departments.
Information is being exchanged in ever-growing volumes with customers and business partners. Websites and social media platforms are constantly adding data to the mix. And now there’s even more data coming from new sources such as the Internet of Things (IoT) via sensors and smart, connected devices.
This proliferation of data sources is leading to a chaotic, “accidental architecture”, where organizations can’t get the right data to the right people at the right time. That means users such as business analysts and data scientists can’t adequately analyze relevant data and get the most value out of it to enhance the business.
While they’re dealing with growing sources and volumes of data that are increasingly difficult to manage, enterprises are also grappling with emerging business demands:
- Increasing expectations, both internally and externally: They’re expected to be more agile, with faster time to market for products and services and more rapid response times for customer inquiries. Any delays can mean the loss of business to competitors that are more agile.
- Constantly changing compliance requirements: They need to comply with a growing number of government and industry regulations related to data security, privacy, and management, which is part of the broader issue of data governance. Recent examples include the European Union's General Data Protection Regulation (GDPR), which governs data privacy for people in the EU, and the California Consumer Privacy Act.
- Growing demand for accessible data: They must meet growing demands for self-service, as more and more business users demand immediate access to data and to the tools to analyze the data. A new generation of workers expects to have continuous access to the resources they need.
Data Warehouses Moving to the Cloud
Not too long ago, many were saying these challenges could be addressed by data warehouses. But traditional data warehouses present a number of problems. For one thing, they lead to data silos, with companies using a data mart for one project, a data warehouse for another project, and other warehouses for still other projects.
This, in turn, increases complexity and makes data management more difficult. The challenge becomes even greater as the volume and sources of data increase. There are multiple systems to integrate and manage, which requires specialized skills and tools.
And in order to meet performance and capacity demands, organizations need to make investments in all kinds of proprietary hardware and then maintain that legacy hardware over time. Companies are forced to do a lot of capacity planning and try to control costs while dealing with the rapid increase in data.
Perhaps because of these shortcomings, many organizations are looking to change their data warehouse strategy. Research conducted in 2017 by The Data Warehousing Institute (TDWI) shows that nearly half of the organizations surveyed (48%) are planning a replacement project for their data warehouse platform by 2019.
A lot of these organizations are moving to cloud-based data warehousing, which gives them virtually unlimited capacity and scalability and, in many cases, lower costs.
A move to cloud data warehousing has its own set of challenges, however. When companies make this shift, they’re not just moving their databases to the cloud, but analytics and visualization as well. They’re transitioning to business intelligence as a service. So, one of the major issues that arise is data integration.
However, organizations that are moving their data warehousing initiatives to the cloud and using integration tools are seeing benefits. The two key use cases are lift and shift, where a company takes an existing legacy data warehouse and moves it into a new, cloud-based data warehouse, and entirely new projects, where the legacy data warehouse is not a factor.
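To make the lift-and-shift pattern concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's method: two in-memory SQLite databases stand in for the legacy warehouse and the cloud warehouse, and the `lift_and_shift` helper and the `sales` table are hypothetical names invented for this example. A real migration would use the target platform's bulk-loading tools and far more thorough validation.

```python
import sqlite3


def lift_and_shift(source, target, tables):
    """Copy each table's schema and rows from source to target.

    Returns a dict of table name -> number of rows found in the target.
    """
    copied = {}
    for table in tables:
        # Reuse the CREATE TABLE statement recorded by the source database.
        ddl = source.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()[0]
        target.execute(ddl)

        # Move the rows across unchanged (the "lift" and the "shift").
        rows = source.execute(f'SELECT * FROM "{table}"').fetchall()
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            target.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', rows
            )
        target.commit()

        # Basic validation: row counts must match on both sides.
        src_count = source.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        dst_count = target.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        if src_count != dst_count:
            raise RuntimeError(f"Row count mismatch for table {table}")
        copied[table] = dst_count
    return copied


# Hypothetical legacy warehouse with a single small table.
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
legacy.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "east", 120.0), (2, "west", 75.5), (3, "east", 42.0)],
)
legacy.commit()

# Stand-in for the new cloud warehouse.
cloud = sqlite3.connect(":memory:")
result = lift_and_shift(legacy, cloud, ["sales"])
print(result)
```

The row-count check is the simplest possible reconciliation step; it shows why data integration, rather than raw copying, tends to be the hard part of such a move.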
One example of a successful lift and shift strategy comes from a large financial and data services firm. The company struggled with poor performance and with meeting load times on its legacy data warehouse. In addition, its reports were taking too long to return results to users.
Because it draws on a large variety of data sources, the firm moved off its legacy platform and modernized its data warehouse, making sure it would have access to all the connectors it needs now and in the future. The results include nine times faster load performance and eight times faster query performance. These gains improved end-user efficiency and enabled the IT organization to meet its load window SLA.
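The link between load performance and a load window SLA is simple arithmetic, sketched below in Python. Only the ninefold speedup comes from the example above; the 18-hour legacy load time and the 6-hour window are hypothetical figures chosen purely for illustration, and `meets_load_window` is a made-up helper name.

```python
def meets_load_window(legacy_load_hours, speedup, window_hours):
    """Return the new load duration and whether it fits inside the window."""
    new_load_hours = legacy_load_hours / speedup
    return new_load_hours, new_load_hours <= window_hours


LEGACY_LOAD_HOURS = 18.0  # hypothetical: nightly load time on the legacy platform
WINDOW_HOURS = 6.0        # hypothetical: load window agreed in the SLA
LOAD_SPEEDUP = 9.0        # from the article: nine times faster load performance

# Before the move, the load overruns the window.
before_ok = LEGACY_LOAD_HOURS <= WINDOW_HOURS

# After the move, the same load is divided by the speedup factor.
new_hours, after_ok = meets_load_window(LEGACY_LOAD_HOURS, LOAD_SPEEDUP, WINDOW_HOURS)

print(f"Legacy load fits window: {before_ok}")
print(f"New load takes {new_hours:.1f} hours, fits window: {after_ok}")
```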
An excellent example of a new project use case comes from a large provider of healthcare analytics that relies on timely and accurate data to provide insights and ultimately to improve efficiency in the healthcare chain. The company has to integrate and manage a great number of diverse healthcare data sources, which is labor-intensive and time-consuming. It needed to build a big data warehouse for faster analytics.
The company takes a lot of healthcare data, bundles it up, enriches the information, and provides it back to pharmaceutical, biotech, and medical technology companies. Today, the firm has to provide insights that are much more metrics-driven, which means managing data at scale. What used to be gigabytes of its own data has become petabytes of external, real-world data.
Data management was getting too complex, and it was difficult to integrate all the various data sources. The company decided to move to a single repository where it could process data from many suppliers as well as its own. Given the sheer amount of data at play, it needed mature data management and data quality capabilities without a “data tax” (a costly and unpredictable pricing model based on the number and type of connectors or source and target systems being connected). After moving to a cloud data warehouse, the company gained robust connectivity and easy access to its data.
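A small Python sketch shows why connector-based pricing becomes hard to predict as sources multiply. Every number here is hypothetical: the fee schedule, the connector categories, and the year-one and year-two connector counts are invented for illustration and do not reflect any real vendor's price list.

```python
# Hypothetical annual fee per connector, by connector type.
CONNECTOR_FEES = {"database": 5000, "saas": 8000, "file": 1000}


def connector_based_cost(connectors):
    """Annual cost when pricing depends on the number and type of connectors."""
    return sum(CONNECTOR_FEES[kind] * count for kind, count in connectors.items())


# Hypothetical connector counts as the number of data suppliers grows.
year_one = {"database": 2, "saas": 1, "file": 3}
year_two = {"database": 4, "saas": 5, "file": 10}

cost_year_one = connector_based_cost(year_one)
cost_year_two = connector_based_cost(year_two)

print(f"Year one: {cost_year_one}")
print(f"Year two: {cost_year_two}")
print(f"Increase: {cost_year_two / cost_year_one:.1f}x")
```

Under these assumed figures the bill more than triples even though nothing about the warehouse itself changed, only the number of systems connected to it.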
For many companies, it’s not a question of if they’ll be moving their data warehouse function to the cloud, but when. It’s important for organizations to find new ways to work with data more efficiently. Without this, they can’t compete in today’s business environment.
About the author: Vincent Lam is head of cloud product marketing at Talend, where he focuses on growing the Talend cloud. Prior to that, Vincent was global head of corporate marketing for Protegrity and a marketing director at Information Builders. Vincent has a bachelor’s degree in computer and electrical engineering from Cornell University.