Tag: Spark

A Platform Approach to Data Science Operationalization

Eduardo Ariño de la Rubia has been building quantitative teams for the past two decades, and knows first-hand how hard it can be to bring data science capabilities into a production setting. He understands there are mul Read more…

How Spark Illuminates Deep Learning

Data scientists everywhere are delving more deeply into deep learning (DL). If you’re only skimming the surface of this trend, you might think that the Spark community, which focuses on broader applications of machine Read more…

Big Iron Warms Up to Big Data

Despite predictions of its demise, the mainframe keeps ticking. And thanks to its longevity, customers are finding the need to integrate mainframes with big data systems is growing. Syncsort, which develops big data solu Read more…

DataRobot Delivers an ML Automation Boost for Evariant

Companies in all industries face an acute shortage of data scientists, those digital alchemists who turn raw data into gold. But for the healthcare software company Evariant, the decision to use DataRobot to automate its Read more…

Elastic Stack Searches for Bigger Data Problems

Elastic is best known as the commercial vendor behind Elasticsearch, the open source search engine that's widely used around the world. But with this month's release of Elastic Stack V5, the company is staking a claim wi Read more…

Hortonworks Unveils New Offerings for AWS Marketplace

Hortonworks today took the wraps off new big data services that run on the Amazon Web Services (AWS) Marketplace. The Hadoop, Spark, and Hive services are pre-configured, and are designed to get users up and running quic Read more…

How PRGX Is Making Its AS/400-to-Hadoop Migration Work

Many companies are using big data technologies to build new applications that can take advantage of emerging data streams, like sensor data or social media. It's not often you see established back-office applications bei Read more…

Spark ML Runs 10x Faster on GPUs, Databricks Says

Apache Spark machine learning workloads can run up to 10x faster by moving them to a deep learning paradigm on GPUs, according to Databricks, which today announced its hosted Spark service on Amazon's new GPU cloud. Read more…

Big Performance Gains Seen Across SQL-on-Hadoop Engines

You can't really go wrong these days when it comes to picking a SQL-on-Hadoop engine. As long as you stick to the mainstream open source products like Hive, Impala, Spark SQL, and Presto, your SQL queries are likely runn Read more…

Databricks CEO on Streaming Analytics, Deep Learning, and SQL

As Apache Spark continues to gain steam, so too does Databricks, the company behind the popular distributed processing framework. At the recent Strata + Hadoop World conference, we caught up with Databricks CEO and co-fo Read more…

Yahoo Shares Algorithm for Identifying ‘NSFW’ Images

Yahoo is releasing the deep learning algorithm that it uses to detect "not safe for work" (NSFW) images to the open source community, the Web giant announced last week. Anywhere from 4% to 30% of the Internet is compo Read more…

SAP Expands Hadoop Reach With Altiscale Deal

SAP completed its acquisition of big data analytics startup Altiscale Inc. this week, saying it would fold Altiscale's data cloud and Hadoop services into its existing SAP HANA cloud and emerging analytics efforts. Rep Read more…

Converged Platform or Federated Data Plane? The Debate Heats Up

"Bring the compute to the data." That was Hadoop's calling card and solution for the problem of moving big data. However, the rise of cloud repositories and streaming technologies is causing Hadoop distributors to questi Read more…

MemSQL Delivers an ‘Exactly Once’ Real-Time Pipeline

MemSQL today unveiled a new release of its in-memory relational database that can process a real-time flow of messages from Apache Kafka using "exactly once" semantics. The NewSQL database accomplished the feat by creati Read more…

Can Hadoop Be Simple Again?

In the beginning, Hadoop had two pieces: HDFS and MapReduce. Developers knew how to use them to build applications, and IT teams knew what it took to operate them. Fast forward to 2016, and developers have a cornucopia o Read more…

Unraveling Hadoop and Spark Performance Mysteries

What do you do when your Spark or Hive job runs like molasses? If you're like most end-users who lack in-depth technical skills, the answer is "not much." Now a startup named Unravel Data is working to show you what's ac Read more…

Tracking the Ever-Shifting Big Data Bottleneck

Bottlenecks are a fact of life in IT. No matter how fast you build something, somebody will find a way to max it out. While the performance headroom has been elevated dramatically since Hadoop introduced distributed comp Read more…

Machine Learning: No Longer the ‘Fine China’ of Analytics, HPE Says

Machine learning has become a core component of companies' analytic initiatives and is no longer the "fine china" only brought out for special occasions, according to a manager with Hewlett-Packard Enterprise, which toda Read more…

Bullish Forecast for Hadoop Services

Despite major market inroads being made by Apache Spark, a new forecast estimates the global market for the Hadoop big data framework will continue to grow at a healthy clip through 2021, fueled in part by growing enterp Read more…

Talend Set for IPO This Week

There's been chatter about which big data technology company will be the next to go public after Hortonworks' (NASDAQ: HDP) initial public offering of stock in December 2014. Would Hadoop distributor MapR Technologies ma Read more…

Datanami