Data observability has emerged as one of the hottest sectors in the big data market, thanks to its focus on fixing broken data pipelines. One of the hottest players in the field is Monte Carlo, which this week announced a Series D round of funding worth $135 million, at a $1.6 billion valuation.
As companies look to data for competitive advantages, they’re finding that the costs of data quality problems continue to grow too. Gartner estimated that the average customer loses nearly $13 million per year due to data downtime and data quality problems. This is the area that Monte Carlo is addressing with its data observability offering.
Barr Moses and Lior Gavish co-founded Monte Carlo three years ago with the goal of developing tools to help companies detect problems in their ETL data pipelines and even take steps to automatically fix some of them. While ETL and ELT are viewed by some as legacy approaches to moving data, they continue to be the workhorse mechanisms for moving large amounts of data from on-prem systems to the cloud, and everywhere in between.
The San Francisco company borrowed from concepts popular in the SRE and DevOps space to help address the problem of bad data flowing through data pipelines. By using connectors to take read-only copies of data directly from pipelines and machine learning techniques to spot anomalies in patterns, Monte Carlo is able to continuously monitor for common data problems, and send alerts to engineers when they are detected.
Monte Carlo looks for problems that can crop up across five main areas: the freshness of data; its volume or completeness; whether the distribution of values is changing at the field level; whether data tables or schemas are shifting; and changes to data lineage. These are the company’s five pillars of observability, which the company shared with Datanami in 2021.
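To make three of those pillars concrete, here is a minimal Python sketch of what freshness, volume, and schema checks can look like. It is an illustration only, not Monte Carlo’s implementation or API: the function names, the 24-hour lag, and the three-standard-deviation threshold are all assumptions, and a simple z-score stands in for the machine learning techniques the company describes.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_stale(last_loaded_at, now, max_lag):
    """Freshness: has the table gone longer than max_lag without an update?"""
    return now - last_loaded_at > max_lag


def volume_is_anomalous(history, today, threshold=3.0):
    """Volume: is today's row count more than `threshold` sample standard
    deviations away from the mean of recent daily row counts?"""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold


def schema_changes(old, new):
    """Schema: list columns that were added, removed, or changed type."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "retyped": sorted(c for c in shared if old[c] != new[c]),
    }


# Hypothetical metadata for one table (all values are made up).
now = datetime(2024, 1, 2, 9, 0)
last_load = datetime(2024, 1, 1, 2, 0)        # 31 hours before `now`
row_counts = [1000, 1020, 990, 1010, 980]     # recent daily row counts
old_schema = {"id": "int", "amount": "float"}
new_schema = {"id": "int", "amount": "str", "currency": "str"}

print(is_stale(last_load, now, timedelta(hours=24)))
print(volume_is_anomalous(row_counts, 400))
print(schema_changes(old_schema, new_schema))
```

Note that the checks read only metadata (load timestamps, row counts, column types) rather than the rows themselves, which is one reason monitoring of this kind can run continuously alongside a pipeline.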
Data issues have a huge number of root causes, and fixing those root causes isn’t Monte Carlo’s domain. (After all, if you have figured out a foolproof way to prevent humans from making data-entry mistakes, there are some folks on Sand Hill Road who would like a word.)
Instead, Monte Carlo mainly looks to flag bad data as quickly as possible before it streams into downstream systems, including data warehouses and AI training systems. However, there are a handful of issues that Monte Carlo is looking to take immediate action on. Last month, the company launched Circuit Breakers to enable the company’s software to immediately end the flow of data in a data pipeline when one of these high-cost data errors, such as faulty data in a financial transaction, is detected.
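The circuit-breaker idea can be sketched in a few lines of Python. This is a generic illustration of the pattern, assuming a pipeline step that validates a batch before loading it; it does not reflect how Monte Carlo’s Circuit Breakers feature is actually built, and every name in it (`CircuitOpen`, `run_with_circuit_breaker`, the checks themselves) is hypothetical.

```python
class CircuitOpen(Exception):
    """Raised to stop a pipeline run before bad data moves downstream."""


def run_with_circuit_breaker(batch, checks, load):
    """Run every check on the batch and load it only if all checks pass."""
    failures = [name for name, check in checks.items() if not check(batch)]
    if failures:
        raise CircuitOpen("halted pipeline, failed checks: " + ", ".join(failures))
    load(batch)


# Hypothetical checks for a batch of financial transactions.
checks = {
    "non_empty": lambda rows: len(rows) > 0,
    "amounts_present": lambda rows: all(r.get("amount") is not None for r in rows),
    "amounts_non_negative": lambda rows: all((r.get("amount") or 0) >= 0 for r in rows),
}

warehouse = []  # stands in for a downstream system such as a data warehouse
good = [{"id": 1, "amount": 25.0}, {"id": 2, "amount": 40.5}]
bad = [{"id": 3, "amount": None}, {"id": 4, "amount": -10.0}]

run_with_circuit_breaker(good, checks, warehouse.extend)  # loads two rows

try:
    run_with_circuit_breaker(bad, checks, warehouse.extend)
except CircuitOpen as err:
    print(err)  # the bad batch never reaches the warehouse
```

Raising an exception, rather than logging a warning and continuing, is the point of the pattern: the load step never runs, so the faulty batch cannot reach downstream systems while engineers investigate.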
The market need for data observability is growing quickly. For example, AutoTrader UK uses Monte Carlo to keep a watchful eye on the proliferation of data models in its data analytics estate. While the Looker analytics software has been beneficial in lowering the barrier to entry for data analytics at AutoTrader UK, it has also increased the possibility that data errors can sneak into production, hence the decision to bring Monte Carlo in to automatically monitor the situation.
Monte Carlo has grown quickly as the need for data observability has increased and as users have become aware that solutions exist. Monte Carlo, which claims to have hundreds of customers, has grown from 20 employees to 120 since late 2020, a period that coincides with several rounds of venture funding. In addition to AutoTrader UK, the company boasts customers like JetBlue, CNN, and SoFi.
Cack Wilhelm, a general partner at late-stage venture capital firm IVP, which led Monte Carlo’s Series D, said the need for high-quality data has never been greater.
“Monte Carlo is charting the path forward for the data observability category and setting a precedent for the future of the modern data stack,” Wilhelm said in a press release. “After talking to dozens of Monte Carlo’s customers, two things became crystal clear: they are building a truly incredible product with near-immediate time to value, and they have one of the best teams in data. I’m excited to partner with Barr, Lior, and the rest of Monte Carlo on their vision for data reliability.”
Accel, GGV Capital, Redpoint Ventures, ICONIQ Growth, Salesforce Ventures, and GIC Singapore also participated in Monte Carlo’s Series D. The company’s funding now totals $236 million over the past 20 months.
Related Items:
The Rise and Fall of Data Governance (Again)
Monte Carlo Hits the Circuit Breaker on Bad Data