February 20, 2020

Kyligence Grows OLAP Business in the Cloud


Companies that need to analyze large amounts of data have many options available to them these days. One option they may want to be aware of is Kyligence, which develops a distributed OLAP query engine that can run on prem and in the cloud.

Kyligence is the enterprise software firm behind Apache Kylin, the open source OLAP engine that was originally developed at eBay’s R&D headquarters in Shanghai, China. Kyligence was founded by the creators of Kylin, and most of the contributors to the open source project are now employees at Kyligence.

eBay created Kylin to accelerate queries on big data sets (measuring in the billions of rows). The software makes use of numerous components in the Hadoop ecosystem, including Hive, HBase, Spark, and Calcite, to essentially rearrange data into a multi-dimensional cube, which can then be queried via regular SQL clients, like Tableau, Qlik, MicroStrategy, or even Excel.

The OLAP cube approach is not new, and has been used in the business intelligence world for decades. Storing the data in a cube format essentially pre-aggregates it along commonly used dimensions (such as geography, customer, or time), which can significantly speed up the response time when a user submits a query.
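To make the pre-aggregation idea concrete, here is a minimal Python sketch. It is not how Kylin or Kyligence implements cubes; the sample rows, the `build_cube` and `query` helpers, and the three dimension names are all hypothetical and chosen only for illustration. The sketch sums a sales measure ahead of time for every combination of dimensions, so that a later query becomes a dictionary lookup rather than a scan of the raw rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, customer, year, sales).
ROWS = [
    ("APAC", "acme", 2019, 100),
    ("APAC", "acme", 2020, 150),
    ("APAC", "globex", 2020, 50),
    ("EMEA", "acme", 2020, 70),
]
DIMENSIONS = ("region", "customer", "year")


def build_cube(rows):
    """Pre-aggregate the sales measure for every subset of dimensions."""
    cube = {}
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(range(len(DIMENSIONS)), size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in dims)
                totals[key] += row[-1]  # last column is the measure
            names = tuple(DIMENSIONS[i] for i in dims)
            cube[names] = dict(totals)
    return cube


def query(cube, **filters):
    """Answer SUM(sales) with equality filters using one lookup."""
    names = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in names)
    return cube[names].get(key, 0)


CUBE = build_cube(ROWS)
print(query(CUBE, region="APAC", year=2020))
```

The trade-off this illustrates is the classic one for cubes: three dimensions already produce eight pre-computed groupings, so build time and storage grow quickly with the number of dimensions, in exchange for fast lookups at query time.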

A Kyligence data flowchart when running atop Azure (Image source: Kyligence)

Despite the advances in big data technology, the OLAP concept retains its core advantages in the modern data warehousing age. That means investments in OLAP continue to deliver benefits like faster data load times, faster response times, and the capability to serve data to a larger number of concurrent users than if the data were stored in a basic relational format.

As data sets continue to get bigger, customers keep running into technical challenges in querying them in a timely manner. Compared to older single-system OLAP engines, today’s distributed OLAP engines from Kyligence and others continue to push the analytics envelope and deliver value to the largest companies with the biggest SQL query challenges.

Kyligence is based in Shanghai, China, and counts some of China’s largest companies in the financial services, retail, and manufacturing industries as customers. The list of Kyligence customers includes firms like Baidu, Huawei, China Mobile, Lenovo, McDonalds, China Telecom, WeBank, and L’Oreal Paris. Most of the company’s 50 to 100 paying clients are based in China, but a few are based in Japan, South Korea, and the United States.

One client, the Shanghai-based bank UnionPay International, used Kyligence to modernize a Cognos-based business intelligence system. Before it adopted Kyligence, the company was using IBM Cognos software on the backend to maintain 1,200 separate OLAP cubes. Updating those cubes required the services of more than 1,000 ETL jobs, and the process took four days to complete.

Kyligence CEO and founder Luke Han meeting with Microsoft CEO Satya Nadella (Image courtesy Kyligence)

Following the implementation of Kyligence Enterprise, the bank kept the Cognos software on the front end, but on the backend it was able to reduce the number of cubes to two (or “supercubes”), both of which were kept updated with a single ETL job that completed in less than four hours, according to a case study.

Kyligence is hoping to use its momentum in China to jumpstart its business in the United States. In early 2019, the company established its US headquarters in San Jose, California, which enables it to provide local technical support for customers in North America. The company currently has more than 10 employees at the site, and is looking to hire more, according to Jennifer Li, Kyligence’s head of marketing and partnership.

“We believe at some point customers here need local support, which is why we are here,” Li tells Datanami. “We see our business growing in the US. Lots of customers are here, and we need local people to support them.”

The company’s North American strategy will feature a heavy dose of clouds. In September, Kyligence announced that its OLAP software is available on all three major public clouds. With Kyligence Cloud 3.0, customers can take advantage of native storage available on those cloud platforms, including AWS S3, Microsoft Azure Blob Store, and even Snowflake. What’s more, the OLAP cloud offering does not have any underlying dependencies on Hadoop.

“We’ve built Kyligence Cloud with the purpose of simplifying big data analytics in the cloud,” Kyligence Founder and CEO Luke Han stated in a press release at the time. “While this lowers the TCO, it also delivers unmatched performance and high concurrency OLAP for cloud data.”

The majority of Kyligence customers (about 70%) run on premises at this point. However, about 60% of new proofs of concept (POCs) are being conducted in the cloud, according to Li. Both sets of customers – cloud and on-prem – are expected to grow as Kyligence establishes itself as a player in the market for distributed OLAP engines.

With its elastic compute, the cloud is viewed by Kyligence as a key resource for enabling its customers’ success. The company has the most history running on Azure, owing to its close relationship with Microsoft. Kyligence claims that 80% of queries on Azure are delivered within one second, while the exceptions, which involve queries on high-cardinality data, are all delivered within three seconds.

Kyligence has attracted $48 million in funding since it was founded in 2016, with the most recent round, a Series C, bringing $25 million into the company’s coffers. The company is competing against the likes of AtScale, Kyvos Insights, Cloudera via its acquisition of Arcadia Data, and Dremio, which is moving up-platform from its Apache Arrow roots.

Kyligence also competes to a certain extent against Apache Kylin, the open source OLAP database at the heart of the Kyligence offering. However, there are a number of enterprise features and capabilities that are present in Kyligence Enterprise that don’t exist in Kylin. That makes the Apache Kylin community, which numbers north of 1,000, a reliable feedstock for future Kyligence customers.

Kyligence sits between multiple data sources and interfaces with BI clients via ODBC and MDX interfaces (image source: Kyligence)

Kyligence Enterprise extends and builds on Kylin in certain key areas. For starters, while Kylin offers a basic multi-dimensional OLAP engine (MOLAP), Kyligence Enterprise technically runs a hybrid OLAP (HOLAP) engine that combines relational and OLAP, according to the Kyligence website. Also, while Kylin offers pre-calculation in its MOLAP database, Kyligence Enterprise offers pre-calculation as well as an index and a cache.

Both Kyligence Enterprise and Kylin support smart pushdowns of queries. But Kyligence Enterprise offers an ODBC driver, whereas the driver must be supplied by the user with the open source software, according to Li. Kyligence Enterprise also supports read/write separation; auto modeling; refreshing of partitioned data; drilldown to raw data; and support for cube, row, column, and cell-level data access control. It also offers certification with BI vendors.

The company is currently working on its augmented AI strategy. This entails the use of machine learning to learn about the types of queries that users run. That information can then be used to optimize the delivery of similar queries going forward.

As Kyligence further establishes its presence in Silicon Valley and the United States, it will be a company to keep an eye on, particularly for those firms in need of distributed OLAP capabilities.
