Features

Apache Hudi Is Not What You Think It Is

Vinoth Chandar, the creator of Apache Hudi, never set out to develop a table format, let alone be thrust into a three-way war with Apache Iceberg and Delta Lake for table format supremacy. So when Databricks recently ple Read more…

Top Three Pitfalls to Avoid When Processing Data with LLMs

It’s a truism of data analytics: when it comes to data, more is generally better. But the explosion of AI-powered large language models (LLMs) like ChatGPT and Google Gemini (formerly Bard) challenges this conventional Read more…

When Should You Choose a Dedicated Vector Database?

If you’re using a large language model (LLM) to develop a generative AI application, chances are pretty good that a vector database is somewhere in the mix. When it comes time to choose a vector database, there are ple Read more…

From Monolith to Microservices: The Future of Apache Spark

The days of monolithic Apache Spark applications that are difficult to upgrade are numbered, as the popular data processing framework is undergoing an important architectural shift that will utilize microservices to deco Read more…

It’s 10 pm. Do You Know Where Your Company’s Data Is?

In an ever-expanding digital world, over 40% of companies still don’t have a grasp on their organization’s data footprint. Mastering data governance in today’s landscape is a pressing necessity; however, effective Read more…

News In Brief

New GenAI Models On Tap from Google, OpenAI

OpenAI and Google released major updates to their AI models this week, including OpenAI’s release of GPT-4o, which adds audio interactions to the popular LLM, and Google’s launch of Gemini 1.5 Flash and Project Astra Read more…

Forrester Slices and Dices the Vector Database Market

The large language model (LLM) revolution has transformed vector databases from obscure search tech into must-have products for AI success. But which vector database features should you look for, and which vendors are in Read more…

Anaconda Rejiggers Approach to Boost Growth Under New CEO

Anaconda created a name for itself in the data science community over the past decade by combining hundreds of the most popular Python-based statistical and machine learning packages, such as NumPy, Pandas, and SciPy, in Read more…

The Top Five Data Labeling Firms According to Everest Group

The process of annotating and labeling data is critical for supervised learning tasks, such as training a large language model (LLM) and other types of machine learning models. However, the need for human cognition and i Read more…

Altair Bolsters Analytics Offering with Cambridge Semantics Buy

Altair Engineering will soon fortify its end-to-end data science offering with the acquisition of Cambridge Semantics and its semantic graph database. Terms of the deal, which was announced yesterday, were not disclosed. Read more…

This Just In

Matillion Extends GenAI Features to Databricks Users with No-Code AI Pipeline Solutions

Jun 12, 2024

DENVER, CO, and MANCHESTER, England – June 12, 2024 – Leading data integration provider Matillion today announces the launch of Retrieval Augmented Generation (RAG) and pushdown AI components for Databricks, bringing AI and LLMs directly into data pipelines to transform and enrich any data type, structured or unstructured. Read more…

Splunk Introduces AI Enhancements for Observability, Security and IT Service Intelligence at .conf24

Jun 12, 2024

SAN FRANCISCO and LAS VEGAS – June 11, 2024 – Splunk, the cybersecurity and observability leader, introduced a collection of AI tools today across its product portfolio to enable organizations to speed up routine tasks and enhance their ability to get insights from data fast. Read more…

Oracle and NVIDIA to Deliver Sovereign AI Worldwide

Mar 19, 2024

AUSTIN, Texas and SAN JOSE, Calif., March 19, 2024 — Oracle and NVIDIA have announced an expanded collaboration to deliver sovereign AI solutions to customers around the world. Read more…

InfluxData Collaborating with AWS to Bring InfluxDB and Time Series Analytics to Developers Around the World

Mar 14, 2024

SAN FRANCISCO, Calif., March 14, 2024 – InfluxData, creator of the leading time series platform InfluxDB, today announced a collaboration with Amazon Web Services (AWS) to deliver Amazon Timestream for InfluxDB, a new managed offering for AWS customers to run InfluxDB open source natively within the AWS Management Console. Read more…

StreamNative Simplifies Data Streaming with New Apache Kafka Offering

Mar 14, 2024

SAN FRANCISCO, Calif., March 14, 2024 – StreamNative, the cloud-native messaging and event streaming company powered by Apache Pulsar, has announced the general availability of Kafka on StreamNative on the ONE StreamNative Platform. Read more…
