Tag: Spark

Baidu In-Memory Databases Add Intel Optane

Chinese search giant Baidu is building a new platform based on Intel Corp.’s Optane DC persistent memory as a means of upgrading search engine results delivered by its in-memory databases used to feed its streaming Read more…

Skills Are Critical in Data Science Job Hunt

Those planning a career in data science have a healthy job outlook, as demand for data scientists continues to grow. While an advanced data science degree can definitely help, it's becoming increasingly apparent that hav Read more…

Data Science Back to School: Accelerate Your Education

Are you looking to get a data science degree and join the workforce as a data scientist? Then you're not alone, as thousands of young people around the world are following that same path with the hope of tapping into the Read more…

Intel Turbocharges Spark Workloads with Optane DCPMM

Intel didn't wow chip lovers earlier this year with the launch of its 2nd Generation Intel Xeon Scalable "Cascade Lake" processors, which are based on the same 14nm process as the first-generation processors. Read more…

Serverless SQL Engine Targets Cloud Analytics

Qubole Inc., the cloud analytics vendor, has added a serverless engine to its platform aimed at simplifying complex tasks like creating data pipelines and server clusters used to scale analytics workloads in the cloud. Read more…

What’s Behind Lyft’s Choices in Big Data Tech

Lyft was a late entrant to the ride-sharing business model, at least compared to its competitor Uber, which pioneered the concept and remains the largest provider. That delay in starting out actually gave Lyft a bit of a Read more…

Microsoft Expands Hadoop on Azure

Microsoft has upgraded its open source analytics services running on Azure with a new version of Hadoop incorporating enhancements of Apache Hive and other open source analytics frameworks. The software giant (NASDAQ: Read more…

How Databricks Keeps Data Quality High with Delta

Data lakes have sprung up everywhere as organizations look for ways to store all their data. But the quality of data in those lakes has posed a major barrier to getting a return on data lake investments. Now Databricks i Read more…

MapR to Autoscale Spark and Drill Via Prebuilt Kubernetes Containers

MapR Technologies today announced a technology preview of pre-built containers for Kubernetes that will give customers new capabilities for dynamically scaling their containerized Spark and Drill applications based on de Read more…

A Decade Later, Apache Spark Still Going Strong

Don't look now, but Apache Spark is about to turn 10 years old. The project began quietly at UC Berkeley in 2009 before emerging as an open source project in 2010. For the past five years, Spark has been on an Read more…

Data Engineering Continues to Move the Employment Needle

Interested in a career in big data? You could do well by investing your time and effort in acquiring data science skills. But you may do even better by turning yourself into a data engineer, which is a title that continu Read more…

Microsoft Invests in Databricks

Databricks, the high-flying analytics startup founded by the creators of Apache Spark, announced yet another venture funding haul this week as it hustles to meet what it says is growing demand for its analytics platform. Read more…

Presto Backers Bolster Its Open Source Origins

A new industry group will promote Presto, the popular open source distributed SQL query engine launched by Facebook engineers in 2012 as a follow-on to Apache Hive. The Presto Software Foundation launched on Thursday Read more…

Build on the AWS Cloud with Your Eyes Wide Open

Building data applications on public clouds like Amazon Web Services is a no-brainer for many organizations these days. The tools for ingesting, storing, and processing data in the cloud are rapidly maturing, and best of Read more…

Movie Recommendations with Spark Collaborative Filtering

Collaborative filtering (CF)[1] based on the alternating least squares (ALS) technique[2] is another algorithm used to generate recommendations. It produces automatic predictions (filtering) about the interests of a user Read more…

Nvidia Platform Pushes GPUs into Machine Learning, High Performance Data Analytics

GPU leader Nvidia, generally associated with deep learning, autonomous vehicles and other higher-end AI-related workloads (and gaming, of course), is mounting an open source end-to-end GPU acceleration platform and ecosy Read more…

Attunity Brings CDC to Google Cloud

Enterprises that are looking to push transactional data from on-premises systems into Google's cloud environment may want to check out the latest from Attunity, which today announced support for Google Cloud Platform with Read more…

Machine Teaching Will Drive Crowdsourced Cognition into the AI Pipeline

Building high-quality artificial intelligence (AI) is hard work. It’s a specialized discipline that historically has required highly skilled specialists, aka data scientists. Any time you require some highly skilled Read more…

Project Hydrogen Unites Apache Spark with DL Frameworks

The folks behind Apache Spark today unveiled Project Hydrogen, a new endeavor that aims to eliminate barriers preventing organizations from using Spark with deep learning frameworks like TensorFlow and MXNet. It's tou Read more…

How Disney Built a Pipeline for Streaming Analytics

The explosion of on-demand video content is having a huge impact on how we watch television. You can now binge watch an entire season's worth of Grey's Anatomy in one sitting, if that suits your fancy. For a media giant Read more…
