Tag: Spark

Databricks Puts ‘Delta’ at the Confluence of Lakes, Streams, and Warehouses

Oct 25, 2017

Databricks today launched a new managed cloud offering called Delta that seeks to combine the advantages of MPP data warehouses, Hadoop data lakes, and streaming data analytics in a unifying platform designed to let users analyze their freshest data without incurring enormous complexity and costs. Read more…

Containerized Spark Deployment Pays Dividends

Aug 7, 2017

Hadoop has emerged as a general-purpose big data operating system that can perform a range of tasks and run all kinds of processing engines. But all that power and flexibility comes at a cost, which is something that one prominent healthcare analytics firm decided it didn’t want to pay anymore. Read more…

DataRobot Reaches Out to SAS, Financial Services

Jul 24, 2017

Companies that use DataRobot’s software to automate data science tasks can now output models directly from SAS, the dominant analytics company whose software is widely deployed in enterprises around the world. Read more…

Taking the Data Scientist Out of Data Science

Jul 21, 2017

If you were a data scientist three years ago, you could pretty much write your own ticket. Everybody in the industry, it seemed, either wanted to hire a data scientist, or wanted to be one. Read more…

IBM Bolsters Spark Ties with Latest SQL Engine

Jul 18, 2017

IBM is extending its commitment to Apache Spark as a key component of in-memory analytics with the latest release of its SQL engine for Hadoop.

The new version of IBM Big SQL, released last week, also solidifies the company’s joint distribution deal with Hortonworks, announced last month, which includes Hortonworks’ Hadoop and stream processing distributions. Read more…

Hadoop Engines Compete in Comcast Query ‘Smackdown’

Jun 22, 2017

Who rules the ring when it comes to Hadoop SQL query engine performance? Can flashy newcomers like Presto and Spark take an established giant like MapReduce to the mat? Comcast recently held a competition to crown the best Hadoop engine, and the answer may surprise you. Read more…

Yahoo’s Massive Hadoop Scale on Display at Dataworks Summit

Jun 16, 2017

Yahoo put its massive Hadoop investment on display this week at Dataworks Summit, the semi-annual big data conference that it co-hosts with Hortonworks.

While Hadoop is no longer the conference headliner that it once was, the platform is still critical for the daily operations of Yahoo, which officially became part of Verizon Communications this week when the $4.5 billion acquisition finally closed. Read more…

Hortonworks Shifts Focus to Streaming Analytics

Jun 14, 2017

Hortonworks started life providing a Hadoop distribution that allowed customers to process big data at rest. But these days, the company has shifted much of its attention and resources to streaming analytics, or processing big data in motion. Read more…

Spark’s New Deep Learning Tricks

Jun 7, 2017

Imagine being able to use your Apache Spark skills to build and execute deep learning workflows to analyze images or otherwise crunch vast reams of unstructured data. That’s the gist behind Deep Learning Pipelines, a new open source package unveiled yesterday by Databricks. Read more…

Pepperdata Takes On Spark Performance Challenges

May 24, 2017

Apache Spark has revolutionized how big data applications are developed and executed since it emerged several years ago. But troubleshooting slow Spark jobs on Hadoop clusters is not an easy task. Read more…