Tag: Spark SQL

Spark 3.0 Brings Big SQL Speed-Up, Better Python Hooks

Apache Spark 3.0 is now here, and it’s bringing a host of enhancements across its diverse range of capabilities. The headliner is a big bump in performance for the SQL engine and better coverage of ANSI specs, while e Read more…

RDBMS Remains Popular As Data Sources Grow

As the number and variety of data sources continue to explode, along with the proliferation of third-party APIs used to connect them, data repositories such as relational databases continue to thrive while emerging software se Read more…

Spark’s New Deep Learning Tricks

Imagine being able to use your Apache Spark skills to build and execute deep learning workflows to analyze images or otherwise crunch vast reams of unstructured data. That's the gist behind Deep Learning Pipelines, a new Read more…

Big Performance Gains Seen Across SQL-on-Hadoop Engines

You can't really go wrong these days when it comes to picking a SQL-on-Hadoop engine. As long as you stick to the mainstream open source products like Hive, Impala, Spark SQL, and Presto, your SQL queries are likely runn Read more…

Spark 2.0 to Introduce New ‘Structured Streaming’ Engine

The folks at Databricks last week gave a glimpse of what's to come in Spark 2.0, and among the changes that are sure to capture the attention of Spark users is the new Structured Streaming engine that leans on the Spark Read more…

Meet Your Friendly Neighborhood Spark Sherpa

Apache Spark is the most popular big data project at the moment, with thousands of contributors cranking out code on a weekly basis. Keeping up with Spark releases is hard, and it's why Hadoop distributor Hortonworks vie Read more…

Datanami