Tag: stream processing

Managing Streaming Flink Apps Is About To Get Easier

Sep 11, 2017

Apache Flink has emerged as a powerful platform for building real-time stream processing applications. However, not every organization has the resources to go all in on Flink the way Netflix, Uber, and Alibaba have. Read more…

A Peek Inside Kafka’s New ‘Exactly Once’ Feature

Jul 3, 2017

Here’s some great news for Apache Kafka users: The open source software will support exactly-once semantics for stream processing in the upcoming version 0.11 release, eliminating the need for application developers to code this important feature themselves. Read more…

Yahoo’s Massive Hadoop Scale on Display at Dataworks Summit

Jun 16, 2017

Yahoo put its massive Hadoop investment on display this week at Dataworks Summit, the semi-annual big data conference that it co-hosts with Hortonworks.

While Hadoop is no longer the conference headliner that it once was, the platform is still critical for the daily operations of Yahoo, which officially became part of Verizon Communications this week when the $4.5 billion acquisition finally closed. Read more…

Hortonworks Shifts Focus to Streaming Analytics

Jun 14, 2017

Hortonworks started life providing a Hadoop distribution that allowed customers to process big data at rest. But these days, the company has shifted much of its attention and resources to streaming analytics, or processing big data in motion. Read more…

Sparse Fourier Transform Gives Stream Processing a Lifeline from the Coming Data Deluge

Jun 13, 2017

When James Cooley and John Tukey introduced the fast Fourier transform in 1965, it revolutionized signal processing and set us on course for an array of technological breakthroughs. But today’s overwhelming data sets require a new approach. Read more…

How Pandora Uses Kafka

May 31, 2017

As a big Hadoop user, Pandora Media is no stranger to distributed processing technologies. But when the music streaming service decided to move its ad tracking system from batch-oriented to real-time processing, it brought in a new technology to serve as the foundation. Read more…

Google/ASF Tackle Big Computing Trade-Offs with Apache Beam 2.0

May 19, 2017

Trade-offs are a part of life, in personal matters as well as in computers. You typically cannot have something built quickly, built inexpensively, and built well. Pick two, as your grandfather would tell you. Read more…

The Real-Time Future of ETL

May 8, 2017

We’re on the cusp of a huge uptick in data generation thanks to the IoT, but most of that data will never be landed in a central repository or stored for any length of time. Read more…

Kafka ‘Massively Simplifies’ Data Infrastructure, Report Says

May 5, 2017

What’s behind the rapid rise in Apache Kafka? According to a new survey of Kafka users by Confluent, the commercial venture behind Kafka, the data pipeline’s capability to “massively simplify” Read more…

Flink Aims to Simplify Stream Processing

Apr 3, 2017

Apache Flink has emerged as a powerful framework for building real-time stream processing applications, and it has gained traction at some of the most progressive tech companies in the world, including Netflix, Uber, and Alibaba. Read more…