June 30, 2016

ClearStory Patent Covers Data Harmonization Tool

George Leopold


A U.S. patent awarded this month to ClearStory Data, the big data preparation tool specialist, covers its automated data harmonization tool designed to work across disparate data sources and a variety of data types.

Such data prep tools are gaining favor as the volume and variety of unstructured data from sources like social media and sensors continue to skyrocket.

ClearStory, Menlo Park, Calif., said U.S. Patent 9,372,913, “Apparatus and method for harmonizing data along inferred hierarchical dimensions,” covers its in-memory, Spark-based data harmonization platform. The tool is designed to speed data prep and blending.

According to the patent award, ClearStory’s proprietary technique produces a “first inferred” data type that is used to augment “first received” data with values that aggregate initial data with a “first hierarchical dimension.” Those steps are repeated on a second data type to create a second hierarchical dimension.

The two hierarchies are then harmonized to a lowest common unit value, the patent abstract states, adding: “A first visualization of the first received data is provided based upon the lowest common unit value. A second visualization of the second received data is provided based upon the lowest common unit value.”
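The idea of harmonizing two hierarchies to a lowest common unit can be illustrated with a minimal sketch. It is not ClearStory's patented method; the time hierarchy, the function names and the sample figures below are all assumptions made for illustration. One data set is recorded by day and the other by month, so the finest unit both can be expressed in is the month.

```python
from collections import defaultdict
from datetime import date

# Hypothetical time hierarchy, ordered from coarse to fine.
LEVELS = ["year", "month", "day"]

def roll_up(d: date, level: str) -> str:
    """Express a date at the given level of the hierarchy."""
    if level == "year":
        return f"{d.year}"
    if level == "month":
        return f"{d.year}-{d.month:02d}"
    return d.isoformat()

def lowest_common_unit(level_a: str, level_b: str) -> str:
    """Return the finest level that both data sets can be expressed in."""
    return LEVELS[min(LEVELS.index(level_a), LEVELS.index(level_b))]

def harmonize(rows, level):
    """Sum values after rolling each date up to the given level."""
    totals = defaultdict(float)
    for d, value in rows:
        totals[roll_up(d, level)] += value
    return dict(totals)

# Made-up sample data: daily sales and monthly ad spend
# (monthly rows are keyed by the first day of the month).
daily_sales = [
    (date(2016, 6, 1), 100.0),
    (date(2016, 6, 2), 150.0),
    (date(2016, 7, 1), 80.0),
]
monthly_ad_spend = [(date(2016, 6, 1), 40.0), (date(2016, 7, 1), 25.0)]

unit = lowest_common_unit("day", "month")
sales_by_unit = harmonize(daily_sales, unit)
spend_by_unit = harmonize(monthly_ad_spend, unit)
print(unit, sales_by_unit, spend_by_unit)
```

Once both data sets are keyed by the same unit, each can be charted on a shared axis, which is roughly what the abstract's two visualizations describe.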

The company said the patent protection covers its deep data inference capability along with semantic data recognition that it claims can converge and harmonize multiple data sets “on-the-fly.” In addition, the Apache Spark in-memory framework is designed to eliminate the need for lengthy data “pre-modeling.” The goal is a faster path from data access to data preparation across sources that differ in terms of structure, size and “velocity,” the company said.

While Apache Spark is the native in-memory data processing engine addressed in the ClearStory patent, the company said its data harmonization platform is not limited to Spark in-memory processing. “The patent is associated with one of the core elements of ClearStory’s solution and interconnects data inference, data harmonization and the associated granular logical and physical metadata,” the company said in a statement. The patent “includes the architectural approach for how complex data is distributed in-memory, and data linkages across the data pipeline from inferring results to seeing harmonized results.”

ClearStory and other data prep specialists are seeking to overcome roadblocks faced by traditional business intelligence and data science approaches in accessing and combining a growing variety of evolving data sources. Data prep practitioners are steadily adopting automated data harmonization approaches to cut costly, time-consuming data wrangling and pre-modeling and to speed analysis of huge data sets.

ClearStory’s patented approach, based on deep data inference and hierarchical dimensions, automatically infers data types using machine-based pattern recognition. That capability is said to eliminate the need to pre-model data or specify definitions for various attributes.
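As a rough illustration of pattern-based type inference, the sketch below guesses a column's type by testing its values against a handful of regular expressions. The patterns, the type names and the 80 percent threshold are assumptions chosen for the example; the article does not describe how ClearStory's inference actually works.

```python
import re

# Hypothetical patterns, checked in order from most to least specific.
PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "us_zip": re.compile(r"\d{5}(-\d{4})?"),
    "currency": re.compile(r"\$\d+(\.\d{2})?"),
    "integer": re.compile(r"-?\d+"),
}

def infer_type(values, threshold=0.8):
    """Return the first type whose pattern fully matches enough of the values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "unknown"
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= threshold:
            return name
    return "text"

print(infer_type(["94025", "10001", "60614"]))
print(infer_type(["$12.50", "$3", "$7.25"]))
print(infer_type(["Menlo Park", "Calif."]))
```

No schema or attribute definitions are supplied up front in this sketch; the type label comes entirely from the values themselves, which is the property the article attributes to the approach.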

The data harmonization process then identifies and ranks data relationships based on inferred data types and semantics in the source data. The results are then blended, accounting for data sets that differ in terms of granularity and scale, the company said.
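One simple way to picture ranking relationships between data sets is to score every pair of columns that share an inferred type by how much their values overlap. The sketch below uses Jaccard similarity for that score; this is a stand-in technique, and the column names, type labels and sample values are invented for the example rather than taken from ClearStory.

```python
def jaccard(a, b):
    """Share of distinct values that two columns have in common."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_relationships(left, right):
    """Rank same-typed column pairs by value overlap, best first.

    left and right map a column name to (inferred_type, values).
    """
    candidates = []
    for l_name, (l_type, l_vals) in left.items():
        for r_name, (r_type, r_vals) in right.items():
            if l_type != r_type:
                continue  # only compare columns with matching inferred types
            candidates.append((jaccard(l_vals, r_vals), l_name, r_name))
    return sorted(candidates, reverse=True)

# Made-up columns from two hypothetical sources.
sales = {
    "store_zip": ("us_zip", ["94025", "10001", "60614"]),
    "sold_on": ("date", ["2016-06-01", "2016-06-02"]),
}
weather = {
    "zip": ("us_zip", ["94025", "10001", "73301"]),
    "observed": ("date", ["2016-06-02", "2016-06-03", "2016-06-04"]),
}

ranked = rank_relationships(sales, weather)
for score, l_name, r_name in ranked:
    print(f"{l_name} <-> {r_name}: {score:.2f}")
```

The top-ranked pair would be the natural candidate for blending the two sources, after any difference in granularity between them has been reconciled.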

The U.S. patent award for the inference-based data harmonization platform is among five patent applications filed by ClearStory covering data prep technologies based on scalable, machine-based approaches to self-service data analysis.
