Tag: Spark

Big Performance Gains Seen Across SQL-on-Hadoop Engines

You can't really go wrong these days when it comes to picking a SQL-on-Hadoop engine. As long as you stick to the mainstream open source products like Hive, Impala, Spark SQL, and Presto, your SQL queries are likely runn Read more…

Databricks CEO on Streaming Analytics, Deep Learning, and SQL

As Apache Spark continues to gain steam, so too does Databricks, the company behind the popular distributed processing framework. At the recent Strata + Hadoop World conference, we caught up with Databricks CEO and co-fo Read more…

Yahoo Shares Algorithm for Identifying ‘NSFW’ Images

Yahoo is releasing the deep learning algorithm that it uses to detect "not safe for work" (NSFW) images to the open source community, the Web giant announced last week. Anywhere from 4% to 30% of the Internet is compo Read more…

SAP Expands Hadoop Reach With Altiscale Deal

SAP completed its acquisition of big data analytics startup Altiscale Inc. this week, saying it would fold Altiscale's data cloud and Hadoop services into its existing SAP HANA cloud and emerging analytics efforts. Rep Read more…

Converged Platform or Federated Data Plane? The Debate Heats Up

"Bring the compute to the data." That was Hadoop's calling card and solution for the problem of moving big data. However, the rise of cloud repositories and streaming technologies is causing Hadoop distributors to questi Read more…

MemSQL Delivers an ‘Exactly Once’ Real-Time Pipeline

MemSQL today unveiled a new release of its in-memory relational database that can process a real-time flow of messages from Apache Kafka using "exactly once" semantics. The NewSQL database accomplished the feat by creati Read more…

Can Hadoop Be Simple Again?

In the beginning, Hadoop had two pieces: HDFS and MapReduce. Developers knew how to use them to build applications, and IT teams knew what it took to operate them. Fast forward to 2016, and developers have a cornucopia o Read more…

Unraveling Hadoop and Spark Performance Mysteries

What do you do when your Spark or Hive job runs like molasses? If you're like most end-users who lack in-depth technical skills, the answer is "not much." Now a startup named Unravel Data is working to show you what's ac Read more…

Tracking the Ever-Shifting Big Data Bottleneck

Bottlenecks are a fact of life in IT. No matter how fast you build something, somebody will find a way to max it out. While the performance headroom has been elevated dramatically since Hadoop introduced distributed comp Read more…

Machine Learning: No Longer the ‘Fine China’ of Analytics, HPE Says

Machine learning has become a core component of companies' analytic initiatives and is no longer the "fine china" only brought out for special occasions, according to a manager with Hewlett-Packard Enterprise, which toda Read more…

Bullish Forecast for Hadoop Services

Despite major market inroads being made by Apache Spark, a new forecast estimates the global market for the Hadoop big data framework will continue to grow at a healthy clip through 2021, fueled in part by growing enterp Read more…

Talend Set for IPO This Week

There's been chatter about which big data technology company will be the next to go public after Hortonworks' (NASDAQ: HDP) initial public offering of stock in December 2014. Would Hadoop distributor MapR Technologies ma Read more…

How Auto Insurers Detect and Use Your Driving ‘Fingerprint’

You may not know it, but the way you drive is unique--sort of like a fingerprint. How fast you drive, how tight you turn, and how long you idle in the driveway before hitting the road all help to identify you from others Read more…

Supercharging Apache Spark with Flash and NoSQL

Apache Spark has become the de facto standard computational engine in the big data world. But as an in-memory technology, Spark has limitations. One of the ways people are getting around those limitations is by pairing Sp Read more…

Concord Claims 10x Performance Edge on Spark Streaming

Organizations that are looking for a stream processing engine upon which to build fast data applications featuring high throughput and low latency may want to check out Concord, a new framework that emerged from the ad-t Read more…

How TransUnion Maximizes Data Science Tools and Talent

You may know TransUnion as one of the credit bureaus that controls the interest rate on your new loan. But in fact the company does much more, and has solutions around fraud detection, collections, and marketing, among o Read more…

Investments in Fast Data Analytics Surge

Companies are quickly ramping up their investments in fast data analytics and real-time stream processing frameworks and lowering spending on batch technologies in an attempt to get on top of growing data volumes and veloci Read more…

Actian Reasserts Performance Claims With VectorH

The latest version of SQL-on-Hadoop specialist Actian Corp.'s Vector database tightens integration with Apache Spark to widen access to new data sources while adding enterprise features required to move Hadoop-based anal Read more…

MongoDB Struts Its NoSQL Stuff in NYC

When you think about giants of the technology world, MongoDB may not come to mind. But judging by the big strides this up-and-coming NoSQL database vendor is making, and the aggressive roadmap it put forth today at the t Read more…

What’s Hot This Summer: Data Science Bootcamps

Summer is here and temperatures are rising. While some of us take vacations or cool off at the beach, prospective data scientists are heating up their job prospects by participating in one of a growing number of data sci Read more…

Datanami