Tag: databricks

Databricks Donates Delta Code to Open Source

Databricks today announced that it's open sourcing the code behind Databricks Delta, the Apache Spark-based product it designed to help keep data neat and clean as it flows from sources into its cloud-based analytics env Read more…

Apache Spark Is Great, But It’s Not Perfect

Apache Spark is one of the most widely used tools in the big data space, and will continue to be a critical piece of the technology puzzle for data scientists and data engineers for the foreseeable future. With that said Read more…

A Decade Later, Apache Spark Still Going Strong

Don't look now but Apache Spark is about to turn 10 years old. The open source project began quietly at UC Berkeley in 2009 before emerging as an open source project in 2010. For the past five years, Spark has been on an Read more…

Databricks Open Sources MLflow to Simplify Machine Learning Lifecycle

Databricks today unveiled MLflow, a new open source project that aims to provide some standardization to the complex processes that data scientists oversee during the course of building, testing, and deploying machine le Read more…

Top 3 New Features in Apache Spark 2.3

It's tough to find a big data project that's had as much impact as Apache Spark over the past five years. The folks at Databricks, who contribute heavily to Spark (along with the wider Spark community) are keeping the pr Read more…

Databricks Puts ‘Delta’ at the Confluence of Lakes, Streams, and Warehouses

Databricks today launched a new managed cloud offering called Delta that seeks to combine the advantages of MPP data warehouses, Hadoop data lakes, and streaming data analytics in a unifying platform designed to let user Read more…

The Data Science Behind Dollar Shave Club

Dollar Shave Club burst onto the men's hygiene scene in 2011 with a hilarious video and preposterous business plan: selling subscriptions for razor blades at a ridiculously low price. Six years later, the company keeps g Read more…

Now Trending: AI Washing

First there was "green washing," where companies exaggerated the environmental benefits of their products in order to boost sales. Now technology experts are warning us about "AI washing," an equally questionable tactic Read more…

Exposing AI’s 1% Problem

We see the power of artificial intelligence every day: When Netflix recommends a movie you love, when your bank detects fraud in your account, or when Google routes you around a traffic jam. But outside of examples from Read more…

Taking the Data Scientist Out of Data Science

If you were a data scientist three years ago, you could pretty much write your own ticket. Everybody in the industry, it seemed, either wanted to hire a data scientist, or wanted to be one. But today, thanks to a conflue Read more…

What’s In the Pipeline for Apache Spark?

According to Apache Spark creator Matei Zaharia, Spark will see a number of new features and enhancements to existing features in 2017, including the introduction of a standard binary data format, better integration with Read more…

How These Banking, Energy, and Pharma Firms Use Spark

Few frameworks have gained so much popularity as quickly as Apache Spark. The open source technology may not be ubiquitous yet in the analytics world, but it's fast approaching that point. Spark has certainly caught Read more…

Spark ML Runs 10x Faster on GPUs, Databricks Says

Apache Spark machine learning workloads can run up to 10x faster by moving them to a deep learning paradigm on GPUs, according to Databricks, which today announced that its hosted Spark service on Amazon's new GPU cloud. Read more…

Databricks CEO on Streaming Analytics, Deep Learning, and SQL

As Apache Spark continues to gain steam, so too does Databricks, the company behind the popular distributed processing framework. At the recent Strata + Hadoop World conference, we caught up with Databricks CEO and co-fo Read more…

Apache Spark Adoption by the Numbers

It's been about three years since Apache Spark burst onto the big data scene and became one of the hottest technologies on the planet. Judging by the numbers surrounding Spark's adoption—including things like salaries, Read more…

Spark 2.0 to Introduce New ‘Structured Streaming’ Engine

The folks at Databricks last week gave a glimpse of what's to come in Spark 2.0, and among the changes that are sure to capture the attention of Spark users is the new Structured Streaming engine that leans on the Spark Read more…

Spark Streaming: What Is It and Who’s Using It?

A recent study of over 1,400 Spark users conducted by Databricks, the company founded by the creators of Spark, showed that compared to 2014, 56 percent more Spark users globally ran Spark Streaming applications in 2015. Read more…

Apache Spark Gets IBM Mainframe Connection

IBM's recent embrace of Apache Spark is beginning to generate dividends in the form of open source contributions for a mainframe big data link to Spark. Big data software vendor Syncsort, Woodcliff Lake, N.J., said Tu Read more…

Spark 1.5 to Incorporate ‘Tungsten’ Upgrades

A preview release of the Apache Spark open source in-memory processing framework incorporates major performance upgrades, according to Databricks Inc., the big data processing company founded by Spark's creators. Data Read more…

IBM, Databricks Join Forces to Advance Spark

IBM has jumped on the Apache Spark bandwagon, revealing it would throw its considerable weight behind the open source in-memory processing framework that has been gaining momentum over the last year. Separately, Datab Read more…

Datanami