August 5, 2022

Second Half 2022 Tech Predictions for Data and AI

Alex Woodie


As we emerge from halftime of the show that is the year 2022, it’s time to take stock of how far we’ve come this year in big data, advanced analytics, and AI, and assess where we’re likely to go next.

Based on where we’ve been so far in 2022, Datanami feels confident in making these five predictions for the remainder of the year.

Data Observability Continues to Run

The first half of the year was huge for data observability, which gives customers better visibility and metrics on what’s going on with data streams. As data becomes more important for decision-making, the health and usability of that data become more important too.

We saw a number of data observability startups raise hundreds of millions of dollars in venture funding, including Cribl (Series D worth $150 million), Monte Carlo (Series D worth $135 million), and Coralogix (Series D worth $142 million). Also making news were Bigeye, which rolled out metadata metrics; StreamSets, which was bought by Software AG for $580 million; and IBM, which bought observability startup Databand last month.

This momentum will continue in the second half of 2022, as more data observability startups come out of the woodwork and existing ones seek to solidify their place in this nascent market.

Is real-time data poised for a surge? (Blue Planet Studio/Shutterstock)

Real-Time Data Pops

Real-time data has been sitting on the back burner for years, serving some niche use cases but not seeing widespread use among regular businesses. But thanks to the COVID-19 pandemic and the associated shake-up in business plans over the past couple of years, the conditions are now ripe for real-time data to make the jump into mainstream tech circles.

“I think streaming is finally happening,” Databricks CEO Ali Ghodsi said at the recent Data + AI Summit, noting a 2.5X growth in streaming workloads on the company’s cloud-based data platform. “They’re having more and more AI use cases that just need to be real-time.”

In-memory databases and in-memory data grids are also poised to benefit from the real-time renaissance (if that’s what it is). RocksDB, a speedy embedded key-value store that has augmented event-based systems like Kafka, now has a drop-in replacement called Speedb. SingleStore, which combines OLTP and OLAP capabilities in a single relational framework, hit a $1.3 billion valuation in a funding round last month.

There’s also StarRocks, which recently got funded for a speedy new OLAP database based on Apache Doris; Imply, which cleared a $100 million Series D in May to continue its Apache Druid-based real-time analytics business; and DataStax, which added Apache Pulsar to its Apache Cassandra kit and raised $115 million to drive real-time application development. Datanami expects this focus on real-time data analysis to continue.

Regulatory Growth

It’s been four years since GDPR went into effect, putting cavalier big data users on notice and hastening the rise of data governance as a necessary ingredient in responsible data programs. In the US, the task of regulating data access has fallen to the states, and California is leading the way with CCPA, which mimics the GDPR in many ways. More states are likely to follow suit, complicating the data privacy equation for US companies.

But GDPR and CCPA are just the beginning of the regulations. We’re also in the midst of the death of the third-party cookie, which is making it harder for companies to track what users do online. Google’s decision to delay the end of third-party cookies on its platform until the second half of 2024 gave marketers some extra time to adapt, but the information from the cookies will be tough to replicate.

In addition to data regulations, we’re on the cusp of new regulations on the use of AI. The European Union introduced the AI Act in 2021, and experts predict it could become law by the end of 2022 or early 2023.

Battle of the Data Table Formats

A classic tech battle is shaping up over new data table formats that will determine how data is stored in big data systems, who can access it, and what users can do with it.

Apache Iceberg has gained steam in recent months as a potential new standard for data table formats. Cloud data warehouse giants Snowflake and AWS came out early this year in support of Iceberg, which provides transactions and other controls on data and emerged from work at Netflix and Apple. Cloudera, the former Hadoop distributor, also backed Iceberg in June.

But the folks at Databricks are offering an alternative in the Delta Lake table format, which offers capabilities similar to Iceberg’s. The Apache Spark backers originally developed the Delta Lake table format in a proprietary manner, which led to accusations that Databricks was setting customers up for lock-in. But at the Data + AI Summit in June, the company announced it was committing the entirety of the format to open source, thereby letting anyone use it.

Lost in the shuffle is Apache Hudi, which also provides consistency in data as it sits in big data repositories and is accessed by various compute engines. Onehouse, a venture backed by Apache Hudi’s creators, launched earlier this year with a Hudi-based lakehouse platform.

The big data ecosystem loves competition, so it will be interesting to watch these formats evolve and battle it out over the rest of 2022.

Language AI Continues to Wow

The cutting edge of AI is getting sharper by the month, and today, the tip of the AI spear is the large language models, which keep getting better. In fact, the large language models have gotten so good that a Google engineer in June claimed that the company’s LaMDA conversational system had become sentient.

The AI isn’t sentient yet, but that doesn’t mean large language models aren’t useful to the enterprise. We’re reminded that Salesforce has a large language model (LLM) project called CodeGen, which seeks to understand source code and even generate its own code in different programming languages.

Last month, Meta (the parent company of Facebook) unveiled a large language model that can translate among 200 languages. We’ve also seen efforts to democratize AI through projects like the BigScience Large Open-science Open-access Multilingual Language Model, or BLOOM.

What are your predictions for the rest of 2022? Contact us to let us know.
