Tag: batch

Apache Beam Cuts Processing Time 94% for LinkedIn

Like many large companies, LinkedIn relied on the Lambda architecture to run separate batch and streaming workloads, with a form of reconciliation at the end. After implementing Apache Beam, it was able to combine batch a Read more…

Real-Time Data Streaming, Kafka and Analytics, Part 2: Going Beyond Pure Streaming

Data transaction streaming is managed through many platforms, with one of the most common being Apache Kafka. In our first article in this data streaming series, we delved into the definition of data transaction and stre Read more…

Real-Time Data Streaming, Kafka, and Analytics Part One: Data Streaming 101

The terms “real-time data” and “streaming data” are the latest catch phrases being bandied about by almost every data vendor and company. Everyone wants the world to know that they have access to and are using th Read more…

First Look at Scio, a Scala API for Apache Beam

Apache Beam has emerged as a powerful new framework for building and running batch and streaming applications in a unified manner. In its first iteration, it offered APIs for Java and Python. Thanks to the new Scio API f Read more…

Yahoo’s Massive Hadoop Scale on Display at Dataworks Summit

Yahoo put its massive Hadoop investment on display this week at Dataworks Summit, the semi-annual big data conference that it co-hosts with Hortonworks. While Hadoop is no longer the conference headliner that it once Read more…

Google/ASF Tackle Big Computing Trade-Offs with Apache Beam 2.0

Trade-offs are a part of life, in personal matters as well as in computers. You typically cannot have something built quickly, built inexpensively, and built well. Pick two, as your grandfather would tell you. But appare Read more…

Flink: Worth a Second Look

The big data ecosphere has evolved to the point where there are clear technology leaders. In the category of SQL engines that run on Hadoop, Hive and Spark are clearly the dominant products among open source developers. Read more…

Datanami