May 31, 2022

How Metrics Standardization Enables Self-Service Analytics

Luke Han


Many enterprises are on a journey to enable self-service analytics. They want to deliver self-service solutions that empower all of their users to use data and analytics for the organization's benefit. An increasing number of enterprises use technologies such as cloud data lakes and cloud data warehouses to accelerate their digital transformation. However, technical professionals struggle to consolidate business definitions in one place and provide a single source of truth that is trustworthy, understandable, discoverable, and cost-effective.

On the other side, business professionals struggle to get trusted data expressed in the business metrics they are familiar with. In addition, the business relies heavily on IT for analytics-driven decision-making.

Take the example of a technology company that, after fast business growth and two years of data platform construction, has reached a massive data scale. The company’s challenges include:

  • Cannot effectively support company-level strategy.
  • No unified business semantics.
  • Low data platform adoption, as the business has no trust in the data.
  • Massive data scale: 5.7K Operational Data Store (ODS) tables have grown into 1 million data warehouse (DW) tables.
  • Out-of-control lineage: one core table, TX_ORDERS, has 10K direct descendants.
  • Several duplicated ETL jobs and wasted computation.

These issues are so detrimental that the company has fallen into a data swamp: there is no trust in the data, let alone business self-service analysis.

The company expects to replace its existing self-service ETL approach with governed metrics. Once metrics are governed, this will help the company save millions from its IT budget each year.

What Is a Metrics Store?

A great solution to these problems is the metrics store. A metrics store is a middle layer between upstream data warehouses/data sources and downstream business applications.

A metrics store decouples metrics definitions from BI reporting and data warehouses. The teams that own the metrics can define them once in the metrics store, forming that single source of truth, and consistently reuse them across BI, automation tools, business workflows, or even advanced analytics.
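The define-once, reuse-everywhere idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular metrics store product works: the `Metric` and `MetricsStore` classes, the `to_sql` method, and the column names `amount`, `region`, and `order_date` are all hypothetical. Only the table name TX_ORDERS comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One governed metric definition (all field names are illustrative)."""
    name: str
    expression: str  # aggregation over the source table, written as SQL
    source: str      # upstream table the metric is computed from
    owner: str       # team accountable for the definition


class MetricsStore:
    """Hypothetical in-memory registry; real products differ."""

    def __init__(self) -> None:
        self._metrics = {}

    def define(self, metric: Metric) -> None:
        # Reject a second definition so one name maps to exactly one meaning.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def to_sql(self, name: str, group_by: str) -> str:
        # Every consumer gets its query generated from the same definition.
        m = self._metrics[name]
        return (
            f"SELECT {group_by}, {m.expression} AS {m.name} "
            f"FROM {m.source} GROUP BY {group_by}"
        )


store = MetricsStore()
store.define(Metric("total_revenue", "SUM(amount)", "TX_ORDERS", "finance"))

# Two different consumers reuse the same governed definition.
dashboard_query = store.to_sql("total_revenue", group_by="region")
workflow_query = store.to_sql("total_revenue", group_by="order_date")
```

Because both queries are generated from one definition, a dashboard and a business workflow cannot drift apart on what "total revenue" means; a change to the definition reaches every consumer at once.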

Metrics Matter

“When it comes to managing business processes or any production process for that matter unless the performance is tracked continuously, how do you know if you are improving?” This famous quote, attributed to Peter Drucker, speaks to the idea that if you can’t measure it, you can’t improve it.

A metrics store is a management system first and a data system second. As with ERP, its core purpose is to improve how the business is managed; what big data technology adds is more accurate measurement and more efficient management. Big data technologies, data warehouses, data lakes, ETL/ELT, BI tools, and reports all exist to guide management decisions; the technology itself is not the goal. For enterprises that want to optimize their management system, metrics are the critical stepping stone.

Placing the metrics within the analytics and BI tools is a natural option. After all, it is intuitive to put metrics where they will be consumed. However, it introduces the problem of discrepancy: metrics definitions that reside in BI applications are siloed and hard to reuse across multiple applications. When you have numerous BI tools in your organization (a typical case, as each business unit has its own preferred BI tool), it is hard to standardize the metrics across BI platforms.

Another typical solution is to place metrics definitions and calculations in a data warehouse. However, this option also raises two issues:

  • As with BI tools, a broad set of analytic engines is used to support various use cases. For that reason, building a single unified metrics layer on top of all of them is unlikely to be possible.
  • Every data warehouse practitioner knows how hard it is for business users to understand the data in the warehouse. The learning curve for business users is steep when metrics are placed in the data warehouse.

How Does the Metrics Store Help?

So how does the metrics store help to resolve these issues? Let’s go back to the technology company example. Instead of creating an excessive number of aggregated tables in the data warehouse with no governance, the technology company can standardize its business requirements by placing a metrics store between the ODS tables and business applications. The IT team can manage metrics in one place and bring metrics standardization to all business teams. The business requirements can be standardized and reused in the metrics store as 2,000 base metrics. This standardization can save as much as 90% to 95% of the entire ETL effort.

Business users can then create their own metrics in a self-service manner on the consumption side, where the last mile of analytics is. This is governed self-service innovation, because business users generate derived metrics based on well-governed base metrics.
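To make the relationship between base and derived metrics concrete, here is a small Python sketch. It is an illustration only: the helper names `define_derived` and `evaluate`, the metric names, and the sample values are all assumptions invented for this example, not figures from the company described above.

```python
# Governed base metrics, owned by IT. The values are made-up sample numbers.
base_metrics = {
    "total_revenue": 120_000.0,
    "order_count": 400.0,
}

# name -> (formula, names of the base metrics the formula reads)
derived_definitions = {}


def define_derived(name, inputs, formula):
    """Register a derived metric that may only read governed base metrics."""
    unknown = [i for i in inputs if i not in base_metrics]
    if unknown:
        raise KeyError(f"not a governed base metric: {unknown}")
    derived_definitions[name] = (formula, tuple(inputs))


def evaluate(name):
    """Compute a derived metric from the current base metric values."""
    formula, inputs = derived_definitions[name]
    return formula(*(base_metrics[i] for i in inputs))


# A business user defines a new metric without asking IT for a new ETL job.
define_derived(
    "average_order_value",
    ["total_revenue", "order_count"],
    lambda revenue, orders: revenue / orders,
)

aov = evaluate("average_order_value")  # 120000 / 400 = 300.0
```

The check inside `define_derived` is what keeps the self-service governed: a business user is free to combine base metrics, but cannot build on a number that IT has not defined.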

Benefits for IT Governance and Business Innovation

A middle ground between placing the metrics in the data warehouse/data lake and placing them in the BI tier is to put them in a standalone metrics repository, the metrics store. The metrics store helps enterprises work around some of the silo and trust issues, providing the following benefits:

Self-service of business analytics: Business users can easily reuse existing metrics and create their own, with no IT involvement.

Data trust: The metrics store provides a single source of truth for the business. As the metrics are standardized and well-governed, the business team will regain its confidence in the data.

Governance of data: In the previous approach, everyone could create metrics in BI or DW, leading to chaos and poor data governance. With a single repository of reusable metrics, IT can easily track the lineage and usage of data.

Cost-effectiveness of data management: The metrics store helps reduce the chaos in the ETL process and governance, saving the enterprise considerable IT effort.
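As an illustration of the governance benefit, the sketch below shows how a single repository makes lineage and usage easy to inspect. Everything here is hypothetical: the `lineage` records, the `source_tables` and `record_query` helpers, and the consumer names are assumptions for the example, with TX_ORDERS borrowed from the earlier company scenario.

```python
from collections import Counter

# Hypothetical lineage records: each metric lists what it reads directly,
# either other metrics or source tables.
lineage = {
    "total_revenue": ["TX_ORDERS"],
    "order_count": ["TX_ORDERS"],
    "average_order_value": ["total_revenue", "order_count"],
}


def source_tables(metric: str) -> set:
    """Follow lineage from a metric down to the tables it depends on."""
    tables = set()
    for parent in lineage[metric]:
        if parent in lineage:  # parent is another metric, so recurse
            tables |= source_tables(parent)
        else:                  # parent is a source table
            tables.add(parent)
    return tables


usage = Counter()


def record_query(metric: str, consumer: str) -> None:
    """Count one query against a metric from a named consumer."""
    usage[(metric, consumer)] += 1


record_query("average_order_value", "sales_dashboard")
record_query("average_order_value", "sales_dashboard")
record_query("total_revenue", "finance_report")

aov_sources = source_tables("average_order_value")
# Metrics nobody has queried directly are candidates for review or retirement.
unused = [m for m in lineage if not any(name == m for name, _ in usage)]
```

With lineage and usage held in one place, questions such as "which tables feed this number?" and "is anyone still using this metric?" become simple lookups rather than an audit across BI tools and warehouse tables.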

Luke Han is a co-founder and CEO of Kyligence, as well as a co-founder and Project Management Committee member for Apache Kylin. He was the first-ever top-level project VP of the Apache Software Foundation in China. He is also a Microsoft Regional Director (RD) and Tencent Cloud Valuable Professional (TVP).
