Tag: Spark

Big Data Benchmark Gauges Hadoop Platforms

In another indication of a maturing technology and growing demand, an industry group has released a big data analytics benchmark designed to gauge the performance of Hadoop-based systems. The Transaction Processing Pe Read more…

Merging Batch and Stream Processing in a Post Lambda World

It wasn't long ago that developers looked to the Lambda architecture for hints on how to design big data applications that needed elements of both batch and streaming data. But already, the Lambda architecture is falling o Read more…

How Spark and Hadoop Are Advancing Cancer Research

The combination of Spark and Hadoop has supercharged big data analysis across many industries and use cases by lowering the barrier of entry to advanced analytics and thereby enabling data scientists to create data-drive Read more…

Hadoop Past, Present, and Future

Every few years the technology industry seems to be consumed with a shiny new object that gets hyped far beyond reality. At worst, the inevitable bursting of the hype bubble leads to the disappearance of the technology f Read more…

DataRobot Looks to Cut Data Science Backlog

The data science automation specialist DataRobot Inc. is gaining traction in the big data market for its machine-learning application as new investors like Intel Capital fund its expanding operations. Boston-based Dat Read more…

SnappyData Gets Funded for Spark-GemFire Combo

SnappyData today announced it has received $3.65 million in Series A funding to build a business around its real-time analytics platform that combines Apache Spark, Pivotal's GemFire data grid, and an innovative data app Read more…

Apache Beam’s Ambitious Goal: Unify Big Data Development

If you're tired of using multiple technologies to accomplish various big data tasks, you may want to consider Apache Beam, a new distributed processing tool from Google that's now incubating at the ASF. One of the cha Read more…

LinkedIn Diagnostics Help Tune Hadoop Jobs

An open source tool released last by LinkedIn developers is intended to help Hadoop and Spark users analyze, tune and improve the performance of their workflows. The self-service performance-tuning tool for Hadoop dub Read more…

Reporter’s Notebook: 6 Key Takeaways from Strata + Hadoop World

The big data ecosystem was on full display at last week's Strata + Hadoop World conference in San Jose. At the ripe old age of 10, Hadoop is still the driving force, but newer frameworks like Spark and Kafka are gaining Read more…

Cutting On Random Digital Mutations and Peak Hadoop

In a wide-ranging Strata + Hadoop World talk on Wednesday that reminds us why we like Doug Cutting so much, the father of Hadoop riffed on the evolution of big data tech, the power of open source, the promise of Flink, a Read more…

Apache Flink Creators Get $6M to Simplify Stream Processing

Real-time stream processing is one of the hottest topics this week at Strata + Hadoop World, and one of the new frameworks turning heads is Apache Flink. Developed by the German company data Artisans, Flink is unique in Read more…

Finding Long-Term Solutions to the Data Scientist Shortage

As we learned in the first part of this series, the gap between demand for skilled data scientists and supply is driving salaries north of $200,000 in some areas of the country. If big data analytics is to be democratize Read more…

Machine-Learning Platform Certified For Cloudera

In the run-up to next week's Hadoop confab in Silicon Valley, vendors are releasing a flock of automation and other tools aimed at beefing up the mainstream data processing framework. Among them is an attempt to incorpor Read more…

Why Hadoop Must Evolve Toward Greater Simplicity

Developers have been filing the rough edges off Apache Hadoop ever since the open source project started to gain traction in the enterprise. But if Hadoop is going to take the next step and become the backbone of analyti Read more…

From Hadoop to Zeta: Inside MapR’s Convergence Conversion

If you're a regular Datanami reader, you likely know MapR Technologies as a Hadoop distributor, one of the three "pure play" providers alongside Hortonworks and Cloudera. But with its integrated NoSQL database, a modifie Read more…

How Big Data Can Empower B2B Sales

As consumers, we've grown accustomed to having Big Data look over us. We're no longer surprised when Amazon recommends a perfect pair of headphones for a 14-year-old girl, or when Target reminds us it's time to buy laundry de Read more…

See EBCDIC Run on Hadoop and Spark

Only 20,000 or so of the big beasts still exist in the wild. They're IBM mainframes, and despite the scorn of a legacy label, they continue to run critical processes companies simply don't trust to commodity Intel boxes. Read more…

MapR Joins Growing List Targeting U.S. Big Data

As federal agencies struggle to upgrade their cloud and overall IT capabilities, a leading analytics vendor is setting up shop inside the Capital Beltway in a bid to boost big data capabilities. MapR Technologies said Read more…

Hortonworks Splits ‘Core’ Hadoop from Extended Services

Hortonworks today announced a major change to the way it distributes its Hadoop software. Going forward, Hortonworks plans to update "core Hadoop" components like HDFS, MapReduce, and YARN just once a year in accordance Read more…

Spark 2.0 to Introduce New ‘Structured Streaming’ Engine

The folks at Databricks last week gave a glimpse of what's to come in Spark 2.0, and among the changes that are sure to capture the attention of Spark users is the new Structured Streaming engine that leans on the Spark Read more…
