Tag: ETL

Inside Fortnite’s Massive Data Analytics Pipeline

With 125 million players around the world, Fortnite has set a new standard of success for massively multiplayer games. But pulling together all the servers, databases, and data pipelines to manage 92 million events per Read more…

The Seven Sins of Data Prep

Data preparation is often considered a necessary precursor to the “real” work found in visualizing or analyzing data, but this framing sells data prep short. The ways in which we cleanse and shape data for downstream Read more…

Six Reasons Why Enterprises Need a Modern Data Integration Architecture

Data is instigating change and giving rise to a new data-driven economy that is still in its infancy. Organizations across industries increasingly recognize that monetizing data is crucial to maintaining a competitive ed Read more…

Confluent Adds KSQL Support to Kafka Platform

The latest version of Confluent’s Kafka-based platform incorporates an open source streaming engine for Apache Kafka designed to allow developers using SQL to build real-time, streaming applications. Confluent, the Read more…

Why 2018 Will Be The Year Of The Data Engineer

The shortage of data scientists – those triple-threat types who possess advanced statistics, business, and coding skills – has been well-documented over the years. But increasingly, businesses are facing a shortage o Read more…

Data Mapping Approach Gains More Funding

The need for speed in aggregating data from different sources is attracting the attention of technology investors who are looking for new approaches to mapping structured and unstructured data in real time and at scale w Read more…

ETL Slowing Real-Time Analytics, Survey Finds

ETL, the extract, transform, and load tool used to move data between databases or to data warehouses, is struggling to keep pace with growing demand for real-time data analysis, resulting in operational inefficiencies and, Read more…

Dremio Emerges from Stealth with Multi-Threat Middleware

If your business analysts are struggling to connect, prepare, and query data from multiple sources in a timely and cost-effective manner, you might be interested in learning about Dremio, a new open source software compa Read more…

Embattled Redshift Gets Analytics Backing

Amazon Web Services' Redshift data warehouse service got some much-needed support this week with a partnership between a data management for analytics specialist and a tool developer aimed at helping Redshift users autom Read more…

Hortonworks Unveils New Offerings for AWS Marketplace

Hortonworks today took the wraps off new big data services that run on the Amazon Web Services (AWS) Marketplace. The Hadoop, Spark, and Hive services are pre-configured, and are designed to get users up and running quic Read more…

Über File System from Alluxio Gaining Enterprise Traction

It took several years, but now we're starting to see multi-hundred-node deployments of Alluxio, the distributed in-memory file system that was developed alongside Spark and Mesos at UC Berkeley's AMPLab. By greasing the Read more…

Data Engineers in Hot Demand

The big data community has been dealing with the data scientist shortage ever since big data became a thing. Now we're learning that there's possibly an even bigger shortage of another type of data professional: the data Read more…

The Last Hadoop Data Management Tool You’ll Ever Buy?

The rise of big data has shaken up the data warehousing market, and one of the established vendors still looking to regain its footing is Informatica, which last year was taken private in a $5.3-billion leveraged buyout Read more…

See EBCDIC Run on Hadoop and Spark

Only 20,000 or so of the big beasts still exist in the wild. They're IBM mainframes, and despite the scorn of a legacy label, they continue to run critical processes companies simply don't trust to commodity Intel boxes. Read more…

Taming Unstructured Data with Cognitive Computing

Contending with unstructured data is no longer a priority reserved for the most well-financed, IT-savvy organizations, like Google and Facebook. As the world’s data continues to increase at nearly exponential rates, th Read more…

Five Steps to Fix the Data Feedback Loop and Rescue Analysis from ‘Bad’ Data

Despite enterprises’ best intentions in enforcing top-down standardization of data sets, non-compliant data can easily seep in and, through aggregations, transformations, and standardizations, spread throughout the org Read more…

How Hadoop Solved BT’s Data Velocity Problem

Like most large corporations with millions of customers, BT (British Telecom) has an extensive collection of databases, and is constantly moving data in and out of them. But when data growth maxed out a critical ETL serv Read more…

What Informatica’s Buyout Means to Big Data Integration

Yesterday's news that Informatica has agreed to be bought out by private equity firms for $5.3 billion has stirred a frenzy of activity in the big data integration community. For those working at data integration startup Read more…

The Land of a Thousand Big Data Lakes

The prospect of storing and processing all of one's data in an enterprise data lake running on Hadoop is gaining momentum, particularly when it comes to today's massive unstructured data flows. However, given what we kno Read more…
