Tag: Spark

Microsoft Invests in Databricks

Feb 5, 2019

Databricks, the high-flying analytics startup founded by the creators of Apache Spark, announced yet another venture funding haul this week as it hustles to meet what it says is growing demand for its analytics platform. Read more…

Presto Backers Bolster Its Open Source Origins

Jan 31, 2019

A new industry group will promote Presto, the popular open source distributed SQL query engine launched by Facebook engineers in 2012 as a follow-on to Apache Hive.

The Presto Software Foundation launched on Thursday (Jan. Read more…

Build on the AWS Cloud with Your Eyes Wide Open

Jan 9, 2019

Building data applications on public clouds like Amazon Web Services is a no-brainer for many organizations these days. The tools for ingesting, storing, and processing data in the cloud are rapidly maturing, and best of all, they’re largely pre-integrated, which saves data scientists and engineers time and money. Read more…

Movie Recommendations with Spark Collaborative Filtering

Nov 2, 2018

Collaborative filtering (CF)[1] based on the alternating least squares (ALS) technique[2] is another algorithm used to generate recommendations. It produces automatic predictions (filtering) about the interests of a user by collecting preferences from many other users (collaborating). Read more…
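
As a rough illustration of the alternating least squares idea mentioned in this teaser, here is a toy sketch in plain Python. It is not the article's code and not Spark MLlib's implementation: it assumes a rank-1 model (one latent number per user and per item), and the function name `als_rank1`, the `reg` parameter, and the sample ratings are all hypothetical.

```python
def als_rank1(ratings, n_users, n_items, reg=0.01, iters=50):
    """Toy rank-1 ALS. `ratings` maps (user, item) -> observed rating.

    Returns (user_factors, item_factors); a predicted rating is the
    product user_factors[u] * item_factors[i].
    """
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _step in range(iters):
        # Hold item factors fixed; each user's factor is then a
        # one-dimensional ridge regression with a closed-form solution.
        for i in range(n_users):
            seen = [(j, r) for (a, j), r in ratings.items() if a == i]
            num = sum(r * v[j] for j, r in seen)
            den = reg + sum(v[j] ** 2 for j, r in seen)
            u[i] = num / den
        # Hold user factors fixed and solve for each item the same way.
        for j in range(n_items):
            seen = [(i, r) for (i, b), r in ratings.items() if b == j]
            num = sum(r * u[i] for i, r in seen)
            den = reg + sum(u[i] ** 2 for i, r in seen)
            v[j] = num / den
    return u, v


# Hypothetical ratings: 3 users, 3 movies, two entries left unobserved.
ratings = {
    (0, 0): 2.0, (0, 1): 4.0,
    (1, 0): 1.0, (1, 1): 2.0, (1, 2): 1.5,
    (2, 0): 2.0, (2, 2): 3.0,
}
users, items = als_rank1(ratings, n_users=3, n_items=3)
predicted = users[0] * items[2]  # user 0's estimated rating for movie 2
```

The missing rating for user 0 is filled in using the preferences of users 1 and 2, which is the "collaborating" part of the description above. Production systems typically use more latent factors and a distributed solver.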

Nvidia Platform Pushes GPUs into Machine Learning, High Performance Data Analytics

Oct 10, 2018

GPU leader Nvidia, generally associated with deep learning, autonomous vehicles and other higher-end AI-related workloads (and gaming, of course), is mounting an open source end-to-end GPU acceleration platform and ecosystem directed at machine learning and data analytics, domains heretofore within the CPU realm. Read more…

Attunity Brings CDC to Google Cloud

Sep 11, 2018

Enterprises looking to push transactional data from on-premises systems into Google’s cloud environment may want to check out the latest from Attunity, which today announced support for Google Cloud Platform with its change data capture (CDC) software. Read more…

Machine Teaching Will Drive Crowdsourced Cognition into the AI Pipeline

Jun 25, 2018

Building high-quality artificial intelligence (AI) is hard work. It’s a specialized discipline that historically has required highly skilled specialists, aka data scientists.

Any time you require some highly skilled, highly paid practitioner to accomplish something of value, you’ve introduced a bottleneck into that process. Read more…

Project Hydrogen Unites Apache Spark with DL Frameworks

Jun 5, 2018

The folks behind Apache Spark today unveiled Project Hydrogen, a new endeavor that aims to eliminate barriers preventing organizations from using Spark with deep learning frameworks like TensorFlow and MXNet. Read more…

How Disney Built a Pipeline for Streaming Analytics

May 14, 2018

The explosion of on-demand video content is having a huge impact on how we watch television. You can now binge-watch an entire season’s worth of Grey’s Anatomy in one sitting, if that suits your fancy. Read more…

Presto Use Surges, Qubole Finds

Apr 18, 2018

Don’t look now, but Presto, the SQL engine developed by Facebook as a follow-on to Hive, is starting to catch on in a big way. According to a new survey of big data-as-a-service customers by Qubole, Presto logged impressive usage gains during 2017, and outgrew Hive and Spark across many metrics. Read more…