Fall Strata 2016 Coverage

How Data Analytics Is Helping to Fight Human Trafficking

The scourge of human trafficking exists in every major city in the United States, where it ruins the lives of victims while enriching the traffickers. Local police often are ill-equipped to deal with sophisticated trafficking schemes, resulting in solicitation charges being filed against the victims while the real perpetrators go free. But thanks to the power of big data analytics, law enforcement has new tools to build criminal cases against the traffickers themselves. Read more…

Feature Articles from Fall Strata 2016

What Cloudera Did At Strata + Hadoop World This Week

(9/30/2016)

This week’s Strata + Hadoop World conference in New York City was expected to draw more than 7,000 attendees, making it the biggest big data conference on the planet. It’s also a showcase for Cloudera, which is the main sponsor of the show along with O’Reilly.

Cloudera made a range of announcements at the show this week. Read more…

Converged Platform or Federated Data Plane? The Debate Heats Up

(9/29/2016)

“Bring the compute to the data.” That was Hadoop’s calling card and solution for the problem of moving big data. However, the rise of cloud repositories and streaming technologies is causing Hadoop distributors to question whether that architecture is the best one going forward. Datanami seeks answers this week at Strata + Hadoop World. Read more…

Fresh Streaming Data: Get It While It’s Hot

(9/28/2016)

Streaming data technology that’s been simmering on the backburner for the past few years will be the main entrée at this week’s Strata + Hadoop World conference in New York City.

There’s a profound shift currently underway in the big data community, as companies look for better ways to manage the huge flows of data occurring across their networks, and find faster ways to make business decisions. Read more…

Data Engineers in Hot Demand

(9/27/2016)

The big data community has been dealing with the data scientist shortage ever since big data became a thing. Now we’re learning that there’s possibly an even bigger shortage of another type of data professional: the data engineer.

Data engineer is a relatively new position that’s a hybrid of sorts between a data analyst and a data scientist. Read more…

What If a Data Scientist Became President?

(9/26/2016)

Tonight we’ll hear from the two main candidates applying to be the next president of the United States. The debate should give voters a clearer picture of how the next president will govern. But what we’re not likely to hear is how data science will power good decision-making once they’re in the White House. Read more…

Data Science Education Evolves to Meet Surging Demand

(9/26/2016)

Here’s some good news for young data science professionals looking for that first job: your skills are in high demand and will help you land a job with an average starting salary close to $120,000. But there’s also some bad news: the field is evolving so quickly that you will continually need to refresh your skills. Read more…

News in Brief from Fall Strata 2016

Yahoo Shares Algorithm for Identifying ‘NSFW’ Images

(10/03/2016)

Yahoo is releasing the deep learning algorithm that it uses to detect “not safe for work” (NSFW) images to the open source community, the Web giant announced last week.

Anywhere from 4% to 30% of the Internet is composed of pornographic content, according to a 2011 article in Forbes. Read more…

Container Specialist Tops Strata Startup List

(9/30/2016)

A data lake startup whose platform utilizes Docker application containers to run an open source analytics engine is the winner of the startup showcase during this week’s Strata + Hadoop World in New York.

Pachyderm Inc. took first place in the biannual competition that included 11 other finalists. Read more…

SAP Expands Hadoop Reach With Altiscale Deal

(9/29/2016)

SAP completed its acquisition of big data analytics startup Altiscale Inc. this week, saying it would fold Altiscale’s data cloud and Hadoop services into its existing SAP HANA cloud and emerging analytics efforts.

Reports that SAP (NYSE: SAP) was pursuing Altiscale, based in Palo Alto, Calif., began surfacing in August. Read more…

IBM ‘DataWorks’ Leverages Watson, Spark

(9/29/2016)

Project DataWorks, a new initiative launched this week by IBM to advance its analytics push, seeks to forge a cloud-based analytics platform that combines different data types with its Watson cognitive computing technology.

The DataWorks initiative also reflects IBM’s (NYSE: IBM) embrace last year of the Apache Spark in-memory computing framework. Read more…

MapR Embraces Microservices in Big Data Platform

(9/28/2016)

The rise of microservices in recent years is one of the general IT trends that’s paralleled the emergence of big data technology. This week at the Strata + Hadoop World conference, MapR will be talking about how it plans to embrace the development and management of microservices in its converged data platform. Read more…

Splunk Doubles Down on Machine Learning Analytics

(9/28/2016)

The application of machine learning to predictive analytics continues apace as a way to improve IT operations, data security and business intelligence. Among those offering frequent platform upgrades is real-time “operational intelligence” specialist Splunk Inc., which this week rolled out the latest versions of its IT, security and analytics packages that seek to “operationalize” Read more…

Commercial Kafka Distro Gets Global Smarts

(9/28/2016)

Companies operating multiple Apache Kafka clusters in on-premises and cloud data centers will benefit from a handful of new enterprise-level features unveiled at the Strata + Hadoop World conference today by Confluent, the commercial open source company behind the popular big data message bus.

Confluent today announced that its enterprise-strength Kafka offering, called Confluent Enterprise, is getting three key new capabilities in the version 3.1 release that will ship next month: multi-datacenter replication, automated cross-cluster data balancing, and a cloud-migration facility. Read more…

ODPi Tackles Hive with Latest Hadoop Runtime Spec

(9/27/2016)

ODPi today unveiled the second major release of its Runtime Specification, which is geared toward setting a standard for Hadoop components to ensure greater interoperability among distributions and third-party products. New additions to the spec include Apache Hive and the Hadoop Compatible File System (HCFS). ODPi also announced that more ISVs have completed interoperability testing. Read more…

Survey: Spark Going ‘Mainstream’

(9/27/2016)

That rumbling sound you hear is Apache Spark entering production deployments in public clouds along with surging use of the cluster-computing framework’s streaming and machine learning capabilities, according to a new vendor survey that also found more diverse users and use cases.

Databricks Inc., the San Francisco-based startup behind Apache Spark, released survey results on Tuesday (Sept. Read more…

Yahoo Unleashes HBase Transaction Manager

(9/26/2016)

A transaction manager for the NoSQL database HBase has been approved as an open-source incubator project, according to project sponsor Yahoo.

The HBase transaction manager, dubbed “Omid” (“Hope” in Persian), is the latest in a string of Hadoop ecosystem projects backed by Yahoo that also includes Pig, Storm and YARN. Read more…

MemSQL Delivers an ‘Exactly Once’ Real-Time Pipeline

(9/26/2016)

MemSQL today unveiled a new release of its in-memory relational database that can process a real-time flow of messages from Apache Kafka using “exactly once” semantics. The NewSQL database accomplished the feat by creating a new “Create Pipeline” SQL command, and in part by bypassing Apache Spark.

Like many vendors in the big data space, MemSQL is seeking to help customers to process large amounts of streaming data arriving from Web logs, devices, and sensors on the IoT. Read more…

Inflexible Data, Analytics Fueling Failures, Survey Finds

(9/26/2016)

You would be hard pressed to find a business executive who does not believe data initiatives are critical to company growth. Still, a large number of companies say their initial data initiatives have failed due to issues like “data inflexibility.”

That’s the key finding of a data analytics study compiled Monday (Sept. Read more…
