Tag: Spark

Machine Teaching Will Drive Crowdsourced Cognition into the AI Pipeline

Jun 25, 2018

Building high-quality artificial intelligence (AI) is hard work. It’s a specialized discipline that historically has required highly skilled specialists, aka data scientists.

Any time you require some highly skilled, highly paid practitioner to accomplish something of value, you’ve introduced a bottleneck into that process. Read more…

Project Hydrogen Unites Apache Spark with DL Frameworks

Jun 5, 2018

The folks behind Apache Spark today unveiled Project Hydrogen, a new endeavor that aims to eliminate barriers preventing organizations from using Spark with deep learning frameworks like TensorFlow and MXNet. Read more…

How Disney Built a Pipeline for Streaming Analytics

May 14, 2018

The explosion of on-demand video content is having a huge impact on how we watch television. You can now binge-watch an entire season’s worth of Grey’s Anatomy in one sitting, if that suits your fancy. Read more…

Presto Use Surges, Qubole Finds

Apr 18, 2018

Don’t look now, but Presto, the SQL engine developed by Facebook as a follow-on to Hive, is starting to catch on in a big way. According to a new survey of big data-as-a-service customers by Qubole, Presto logged impressive usage gains during 2017, and outgrew Hive and Spark across many metrics. Read more…

Making Hadoop Relatable Again

Mar 26, 2018

There has been much debate over the future of Hadoop in recent months. Should it work more like a cloud object store? Should it support GPUs and FPGAs, Docker or Kubernetes (or both)? Read more…

Weighing Open Source’s Worth for the Future of Big Data

Feb 26, 2018

The open source software movement began in earnest 20 years ago, when a group of technology leaders in Silicon Valley coined the term as an alternative to the repugnant “free software.” Read more…

DataTorrent Glues Open Source Componentry with ‘Apoxi’

Feb 22, 2018

Building an enterprise-grade big data application with open source components is not easy. Anybody who has worked with Apache Hadoop ecosystem technology can tell you that. But the folks at DataTorrent say they’ve found a way to accelerate the delivery of secure and scalable big data applications with Apoxi, a new framework they created to stitch together major open source components like Hadoop, Spark, and Kafka, in an extensible and pluggable fashion. Read more…

The Hybrid Database Capturing Perishable Insights at Yiguo

Feb 22, 2018

Yiguo.com is the largest B2C fresh produce online marketplace in China, serving close to 5 million users and more than 1,000 enterprise customers. We have long devoted ourselves to providing fresh food for ordinary consumers and have gained popularity since our founding in 2005. Read more…

ParallelM Aims to Close the Gap in ML Operationalization

Feb 21, 2018

A startup named ParallelM today unveiled new software aimed at relieving data scientists of the burden of manually deploying, monitoring, and managing machine learning pipelines in production.

Dubbed MLOps, ParallelM’s software helps to automate many of the operational tasks required to turn a machine learning model from a promising piece of code running on Spark, Flink, TensorFlow, or PyTorch processing engines into a secure, governed, and production-ready machine learning system. Read more…

Snowflake Taps Qubole for Deep Machine Learning in the Cloud

Feb 13, 2018

Organizations storing big data in Snowflake’s cloud data warehouse can now run machine learning and deep learning algorithms against that data thanks to a new partnership with Qubole.

The two companies today announced a partnership that will allow Qubole’s big data processing engines, including Apache Spark and TensorFlow, to read and write data to Snowflake’s data warehouse. Read more…
