Tag: Spark

IBM Bolsters Spark Ties with Latest SQL Engine

IBM is extending its commitment to Apache Spark as a key component of in-memory analytics with the latest release of its SQL engine for Hadoop. The new version of IBM Big SQL released last week also solidifies the com Read more…

Hadoop Engines Compete in Comcast Query ‘Smackdown’

Who rules the ring when it comes to Hadoop SQL query engine performance? Can flashy newcomers like Presto and Spark take an established giant like MapReduce to the mat? Comcast recently held a competition to crown the b Read more…

Yahoo’s Massive Hadoop Scale on Display at Dataworks Summit

Yahoo put its massive Hadoop investment on display this week at Dataworks Summit, the semi-annual big data conference that it co-hosts with Hortonworks. While Hadoop is no longer the conference headliner that it once Read more…

Hortonworks Shifts Focus to Streaming Analytics

Hortonworks started life providing a Hadoop distribution that allowed customers to process big data at rest. But these days, the company has shifted much of its attention and resources to streaming analytics, or proc Read more…

Spark’s New Deep Learning Tricks

Imagine being able to use your Apache Spark skills to build and execute deep learning workflows to analyze images or otherwise crunch vast reams of unstructured data. That's the gist behind Deep Learning Pipelines, a new Read more…

Pepperdata Takes On Spark Performance Challenges

Apache Spark has revolutionized how big data applications are developed and executed since it emerged several years ago. But troubleshooting slow Spark jobs on Hadoop clusters is not an easy task. In fact, it may even be Read more…

Cloudera Unveils Altus to Simplify Hadoop in the Cloud

Running Hadoop, whether on-premise or in the cloud, is neither simple nor easy. Administrators with specialized skills are needed to configure, manage, and maintain the clusters for their clients, who are data scientists Read more…

Google/ASF Tackle Big Computing Trade-Offs with Apache Beam 2.0

Trade-offs are a part of life, in personal matters as well as in computers. You typically cannot have something built quickly, built inexpensively, and built well. Pick two, as your grandfather would tell you. But appare Read more…

Masking Technical Complexity in the Security Data Lake

Today's growing cybersecurity threat demands a sophisticated response, one that increasingly involves the utilization of big data technologies like parallel file systems and machine learning. However, some security exper Read more…

Iguazio Re-Architects the Stack for Continuous Analytics

When it comes to modern big data architectures, you will typically find lots of different components, engines, and moving parts, each of which tackles part of the problem. One vendor with a bold vision of re-architecting t Read more…

Learning from Your Data: Essential Considerations

For any organization undergoing digital transformation, a primary consideration is how to find, capture, manage and analyze big data. They are looking to big data and data science to facilitate the discovery of analytics Read more…

Hortonworks Touts Hive Speedup, ACID to Prevent ‘Dirty Reads’

If you're considering using Hadoop for SQL-based analytics and BI, you'll be interested in the latest news out of Hortonworks, which today unveiled a new release of its flagship data platform that boasts a fast new relea Read more…

Meet Ray, the Real-Time Machine-Learning Replacement for Spark

Researchers at UC Berkeley's RISELab have developed a new distributed framework designed to enable Python-based machine learning and deep learning workloads to execute in real-time with MPI-like power and granularity. Ca Read more…

SAP Vora Gets Analytics, Cloud Upgrades

Building on its acquisition of Hadoop specialist Altiscale Inc., SAP is combining the latest release of its Vora in-memory distributed computing platform with its big data cloud as it extends the Apache Spark framework t Read more…

MapR Extends Its Platform to the Edge

MapR Technologies today unveiled MapR Edge, an extension of its converged data platform that lets customers install MapR nodes practically anywhere they want. The new offering runs on small portable PCs like the Intel Read more…

Hadoop Has Failed Us, Tech Experts Say

The Hadoop dream of unifying data and compute in a distributed manner has all but failed in a smoking heap of cost and complexity, according to technology experts and executives who spoke to Datanami. "I can't find a Read more…

2017 Is the Year of AI. Or Is It?

The media often likes to proclaim "The Year of This" or "The Year of That." With the greater attention given to advancing capabilities in artificial intelligence and machine learning, it seemed like a no-brainer to decla Read more…

Dr. Elephant Steps Up to Cure Hadoop Cluster Pains

Getting jobs to run on Hadoop is one thing, but getting them to run well is something else entirely. With a nod to the pain that parallelism and big data diversity bring, LinkedIn unveiled a new release of Dr. Elephant Read more…

Can Big Data Tame the Chaos of Virtualized IT?

The "software-defined" revolution is driving private data centers toward AWS-like efficiency. However, the virtualization of hardware, storage, and networking—not to mention agile coding techniques and a rapid-fire "De Read more…

Inside IBM ML: Real-Time Analytics On the Mainframe

"Bring the compute to the data" is a common refrain you hear in the big data age. Now IBM is heeding that advice with today's launch of IBM Machine Learning for z/OS, a new offering due this quarter that will bring Wats Read more…
