Data Warehousing with a Modern Twist
Bill Inmon is generally credited with inventing the phrase “data warehouse” in the early 1990s to describe the stockpiling of data using relational databases. It may be an older term, but the activity itself remains quite relevant today, especially considering the huge amounts of data we generate every day.
However, some of the elements of data warehousing implementations have changed considerably. For starters, the advent of cloud-based data warehouse is upending the traditional market for analytical databases, just as a new generation of front-end BI tools streamlines the delivery of information.
One digital-native company that’s adopted modern data warehousing tools is Adore Me. The New York City firm was founded in 2011 when a student in Harvard’s MBA program couldn’t find quality lingerie at an affordable price. So he hatched a plan to provide women with customized recommendations of lingerie and other intimate apparel items, with regular deliveries from Adore Me’s private collection as part of a subscription service.
The idea took off, and Adore Me has grown quickly to the point where it has plans to open 300 retail outlets. While machine learning algorithms do much of the work of coming up with recommendations, the company employs human analysts and engineers to ensure the data feeding those algorithms is fresh and pertinent.
Much of the data engineering work fell to Diana Streche, the company’s lead BI developer. When she arrived at Adore Me four years ago, the company was analyzing data in its production MySQL database, and there was barely any automation to speak of.
“I was the SQL engine, like the SQL monkey doing that every single day,” Streche tells Datanami. “When that didn’t hold anymore we decided, ok let’s build something specifically for reporting.”
Streche tasked her BI team lead with finding a new reporting tool that Adore Me could build upon. “One day he came to me and said ‘We got this one. We got Looker, and you’re going to be talking to our CRM manager in one hour [to teach them] how it works,'” she says. “I said, I don’t even know how it works! I was terrified with that meeting, walking in into it and not knowing absolutely anything.”
Despite the short-notice, things turned out well. “That first meeting is what actually made me love it and want to keep it because everything was so intuitive,” Streche says.
Eventually, Adore Me realized that it needed to upgrade its back-end, too. While its MySQL database could handle the data volumes that the company was throwing at it, it needed more processing power than a single server could deliver.
“We only had one server that was used for multiple applications and it wasn’t holding up with the load anymore,” she says, citing some SQL queries that took one to two hours to run. “You need to do a lot of things on an architectural level [to cluster it] that we didn’t have the resources to do.”
Because clustering a MySQL database isn’t a simple matter, the company looked for a cloud offering. About a year ago, it migrated its analytical data warehouse to Google Cloud’s BigQuery offering, and it hasn’t looked back.
Streche says Looker handled the shift from MySQL to BigQuery gracefully. “I was really happy that we could seamlessly migrate bewteen the two with just a shift in connection parameters and table names, which was a huge help considering the timing,” she says. “Not having to deal with all of that and just being able to use an out of the box solution to migrate everything” was a big help.
Today, Looker is used by about 70 Adore Me users, mostly in marketing. Streche and a pair of other BI developers are in charge of setting the data definitions using LookML and defining the applications and APIs with Blocks. Once the environment is set up, users don’t require much training at all to be productive.
“You have a lot of freedom in how you design it, but you have to design it well,” Streche says. “If you design it well then it’s seamless and easy to sue and users can just hop in there and do their thing and no training required.”
Adore Me’s data scientists also tap into pipelines created in Looker. “Looker helps with data ingestion for our data science model,” Streche says. “When you’re connecting to a database, you need to have a lot of details about what tables do you use, how you extract the data, do you need to add any filters? When pulling it through Looker, it’s already curated and ready to go. You just need to select what you need.”
The data warehouse running on BigQuery is mostly composed of internal company data. But the company also brings in website data from Google Analytics, which Google makes extremely easy. “Google takes care of it actually ending up in BigQuery,” Streche says. “We don’t make it part of our ETL system because it’s already taken care of.”
Adore Me’s data warehouse isn’t as big as some. But by using modern tools and technology, the company is able to meet its reporting requirements and deliver an intuitive analytics experience without burdening administrators with too much complexity.