Cloud Data Warehouse Migration a Group Effort for UK’s Co-op
It’s not unheard of for customers to spend several years and tens of millions of dollars trying to migrate from one data warehouse to another. It’s also not uncommon for these exercises to fail. But for the Co-op Group in the UK, the migration from a Teradata warehouse to Azure SQL Data Warehouse not only came in on time and under budget, but also amid the worst pandemic in a century.
The Co-Op Group is a cooperatively run organization in the United Kingdom that was originally founded in 1844. Today, the Co-op (as it’s known in the UK) spans more than 7,000 locations across the country, providing an array of grocery, retail, insurance, legal, and funeral services. In 2020, it generated revenues of £11.5 billion (or $15.4 billion at current exchange rates) and had 4.3 million active members.
Like any modern organization, the Co-op relies on a mix of IT to automate its business processes. For years, it has used a Teradata data warehouse to inform decisions in its retail operations. The system held 25 years’ worth of data detailing the who, what, how, and when of Co-op’s customer activities.
“It was pretty much the engine that ran our retail operation,” says Co-op’s Director of Data, Digital and Loyalty, Charlotte Lock.
However, the on-prem Teradata system was approaching end-of-life for Co-op, and so the organizations data team started exploring new options in the 2018-2019 timeframe. The cloud loomed large for Co-op, as it has for many companies that are looking for more flexibility and the power to scale up their data warehousing operations in a short amount of time.
“The decision to migrate was to take advantage of the cloud,” Lock tells Datanami. “So really, processing speed, cost, and storing the data that’s been maintaining on prem systems, but also to be able to crunch millions of rows of data–billions of rows of data sometimes–in a fraction of the time that we would have needed to do that in our on-prem solutions.”
Then COVID-19 hit in early 2020, and that ramped up the time horizon considerably, Lock says. If the Manchester-based organization was considering a leisurely five-to-six year migration before COVID-19, that certainly was no longer part of the plan afterward.
“We could see even before COVID–and of course COVID accelerated it–that the requirement for even more data to be able to perform real-time action as we tapped into, for example, e-commerce,” Lock says. “We didn’t really have a significantly e-commerce operation before COVID. But actually the demand meant that we had to set something up really swiftly.”
Part of the concern with the old system was that the Co-op lacked comprehensive views of its customers, which meant that customer activity in retail couldn’t easily be matched with customer activity in legal or insurance. That limited the cross-selling and up-selling opportunities that are so important for ecommerce operations, like the one that the Co-op was seeking to build in a short amount of time. It also limited the Co-op’s ability to operate in an omni-channel fashion, which is where it was headed.
“Where organization that had been predominantly face to face…more and more, we’re seeing our customers wanting to transact through omni-channel,” Lock says. “It’s the coincidence of moving to the cloud with our business ambition of becoming more digital and more personal.”
The combination of these factors–the greater processing flexibility of the cloud and COVID-‘s omni-channel demands–drove Co-op to the decision to retire its Teradata system and replace it with Microsoft Azure Synapse Analytics in early 2020, Lock says.
However, considering how critical the Teradata system was to the Co-op’s retail operations, it couldn’t just pull the plug. It also couldn’t languish with a forever migration program that may or may not ever fully pan out. That’s when the company considered bringing in another player to help with the migration: Datometry.
Datometry is a San Francisco company that develops a database virtualization product that can make one database look like another. Its emulator, called Hyper-Q, was developed to help companies migrate off two of the most popular analytics databases on the market: Teradata and Oracle.
Datometry CEO Mike Waas described how Hyper-Q works in a July interview with Datanami:
“We intercept the communicating from the application, unpack the requests, take out the SQL, and then do what effectively is almost like what the upper half of any database does, meaning build an entire algebraic model for the incoming request, optimize that, and then synthetize what the optimized SQL means for that destination,” Waas said.
This was just what the doctor ordered for the Co-op. Since it didn’t have to rewrite its Teradata code, the organization could essentially just take all 10,000 tables and 800 to 900 automated processes that it has developed for Teradata, and move them directly into Azure Synapse, and let Hyper-Q handle any minor incompatibilities or incongruencies between the two systems.
“It does virtualize it to a pretty successful degree, which means that we were able to maintain our Teradata system,” Lock says. The Datometry software allowed Co-op to “build this emulation layer in parallel, and then be able to close down Teradata and effectively have the same code running.”
The actual migration from Teradata to Azure Synapse was done in batches, starting with the biggest, most complex batch runs. After thorough user acceptance testing (UAT) to make sure the new system is giving the same answer as the old system, and is meeting service level agreements (SLAs), it would go on to another batch, and so forth, until the whole thing was moved.
“We invested quite a lot in quality assurance, because what you don’t want to do is kind of switch off the backup plan until you’re really, really confident,” she says. “So we went over and over and over.”
With assistance from the Microsoft Azure team and Datometry, the Co-op was able to complete the migration to Azure Synapse Analytics in just a matter of months. It is now running its data warehouse fully in the Azure cloud. The Teradata system has yet to be decommissioned, but it’s no longer in the loop, Lock says.
While it didn’t need to, the group took the opportunity to repoint about 15% of the Teradata code to run more efficiently in the Azure Synapse environment, Lock says. “Not because it wasn’t working in the engineman necessarily,” she says. “But because the teams are spending an awful lot of time in that data, and actually when those processes were written in the first instance, it was a different environment.”
The return on investment on the migration has been good, according to Lock, who says the group has saved millions of pounds by running in Azure versus running with Teradata on prem. The ability to ramp up data crunching operations to ensure that reports (designed in PowerBI or MicroStrategy) are ready to go and on analysts desks at 7 o’clock Monday morning is another factor altogether.
“In retail, you need to have data at your fingertips. It’s a very real and responsive environment,” Lock says. “And it really meant that that lack of service interruption [and] the ability to speed things up is a huge advantage for the efficiency of the data team. It didn’t necessarily mean that our stakeholders saw a huge advantage. It meant that our team could run more efficiently.”
The fact that Co-op was able to complete the migration on time and budget during the COVID-19 lockdown is something that Lock says she is very proud of. When the project was completed, some of the team actually got together, which made it extra special.
“There haven’t been any big dramas. I haven’t had to ask for any more budget. I haven’t had to extend the team in order to deliver the transformation,” she says. “So it’s been a very kind of smooth process, which I think is partly testament to the partnership with Microsoft and Datometry. But also I think also you know being realistic about the amount of time the UAT needed to take, not trying to transition and migrate too soon. I think that’s the critical thing.”
Related Items:
Is Now the Time for Database Virtualization?
Microsoft Debuts AI-Based Semantic Search on Azure