Tag: Hive

Hortonworks Drives Stinger Home with HDP 2.1

Hortonworks today unveiled a major new release of its Hadoop distribution that puts significant new capabilities into the hands of its customers. The speed and scale of SQL processing in Apache Hive were improved with the final phase of the Stinger initiative, while the additions of Apache Storm and Apache Solr in HDP 2.1 open up new ways for customers to manipulate their data. Security and data governance were bolstered with Apache Knox and Apache Falcon, respectively, while Apache Spark is now available as a tech preview. Read more…

Microsoft’s SQL Server 2014 Adds In-Memory, Updates Columnar

Everybody wants their databases to retrieve and store information faster, and with SQL Server 2014, which is released to manufacturing today, Microsoft is tweaking its columnar data store as well as integrating its first in-memory database technology to boost data warehousing and transaction processing workloads, respectively. Read more…

Databricks Moves to Standardize Apache Spark

Databricks, the company behind open source Apache Spark, today rolled out a certification program that creates a Spark standard that big data analytic application developers can write to, and that customers can rely on. It's a smart move by Databricks, which is looking to avoid the forking that has clouded Hadoop's march into the enterprise. Read more…

Top 10 Netflix Tips on Going Cloud-Native with Hadoop

Four years ago Netflix made the decision to move all of its data processing--everything from NoSQL and Hadoop to HR and billing--into the cloud. While going "cloud native" on Amazon Web Services hasn't been without its challenges, the move has benefited Netflix in multiple and substantial ways. Here are 10 tips from Netflix on making the cloud work. Read more…

White-Glove Hadoop Cloud Service Launched by Altiscale

Hadoop users that need more handholding than Amazon can provide may want to check out the new Hadoop as a Service (HaaS) offering launched today by Altiscale. Founded by veteran technologists from Yahoo and AltaVista, the company intends to provide a high-touch experience for running and--more importantly--optimizing production Hadoop workloads in a private cloud. Read more…

The Future of Hadoop Runs on Tez, Hortonworks Says

The Hadoop community has spent much energy over the past two years trying to make Hadoop faster, simpler to program, and easier to extend to other systems. While the introduction of YARN in Hadoop version 2 helped to unhook the framework from its MapReduce roots, the folks at Hortonworks say the next step of the Hadoop journey will ride atop the Apache Tez engine. Read more…

Datanami Dishes on ‘Big Data’ Predictions for 2014

This space was going to feature a "Top 10 Big Data Predictions for 2014" story. But considering the large number of such stories currently in circulation, a different tack was in order. Instead, you'll find a selection of pertinent predictions from players in the "big data" software industry, each followed by Datanami's opinion as to whether it will be spot on or whether the soothsaying will miss the mark. Read more…

Intel Goes Graph with Hadoop Distro

Intel will be targeting big retail operations with a new graph database that it unveiled today as part of its Intel Distribution for Apache Hadoop version 3 announcement. The graph engine will enable customers to make product or customer recommendations in real time, a la Netflix or Amazon, based on existing data. The chip giant also fleshed out its Hadoop distro with a 20x speedup in encryption functions, a data tokenization option, and a handful of new machine learning algorithms aimed at solving common problems. Read more…

Reaping the Fruits of Hadoop Labor in 2014

There's been a lot of work poured into Hadoop over the last few years, culminating with the launch of Hadoop version 2 in October. As we head into 2014, commercial Hadoop vendors like Hortonworks and Cloudera will continue to invest in R&D, but you can also expect to see a stronger emphasis on converting that past investment into sales and profits. However, going forward, the business models for these top two Hadoop vendors are diverging. Read more…

Finding Big Data Treasure in the Cloud

Heading into 2014, one of the big data trends that will intensify is the transition toward end-to-end data analytic services hosted in the cloud. One of the promising big data cloud services is Treasure Data, a Silicon Valley company that offers an interesting mix of MapReduce, columnar databases, and intelligent agent technology that's aimed at helping clients get a quick return on their big data investments. Read more…

Oracle Expands Use of Cloudera Hadoop in Big Data Kit

Oracle is often heralded as the biggest purveyor of "legacy" IT gear that inevitably will be replaced in this brave new big data world. But the reality is a little more complex, and that became evident yesterday when Oracle announced that it's now preloading the entire Cloudera stack with its latest Big Data Appliance, dubbed X4-2. Read more…

Syncsort Siphons Up Legacy Workloads for Amazon EMR

Syncsort is bringing its flavor of super-charged MapReduce code generation capabilities to Amazon's Elastic MapReduce cloud, the companies announced today. The IronCluster ETL as-a-service offering will allow Amazon EMR customers to generate faster MapReduce jobs from a GUI, which the companies say will make it easier to migrate expensive data warehouse workloads from Teradata or the IBM mainframe into Amazon's incredibly inexpensive cloud. Read more…

IBM Taps Zaloni to Ride Herd on Hadoop

One of the bumps on the road to Hadoop Nirvana is the overall lack of controls built into the platform. For a large enterprise, the immaturity of the stack presents major concerns in the areas of productivity and security. One vendor hoping to capitalize on the need for better Hadoop management tools is Zaloni, which got a big boost last week when IBM agreed to OEM its software and sell it as part of InfoSphere BigInsights. Read more…

Facebook’s Super Hive-Killing Query Machine Now Yours

Move over, Hive. Facebook this week contributed Presto, its new in-memory distributed query engine that is up to 10 times faster than Hive, into the open source realm. With Presto, the social media giant gave itself a way to query its 300-petabyte data warehouse, spread across a massive distributed cluster, in a sub-second manner. And now it can be yours too. Read more…

OLTP Clearly in Hadoop’s Future, Cutting Says

Think Hadoop is just for analytics? Think again, says Hadoop creator Doug Cutting, who last week predicted that, in the future, organizations will run all sorts of workloads on their Hadoop clusters, even online transaction processing (OLTP) workloads, the last bastion of the relational legacy. Read more…

Can Microsoft Become the McDonald’s of Hadoop?

Microsoft last month officially took the wraps off its Hadoop service, dubbed Windows Azure HDInsight Service. Now that the Hortonworks-based offering is GA, Microsoft is gearing up its incredibly audacious plan to reach no fewer than one billion users with its big data platform. Has Microsoft lost it? Or is the plan just crazy enough to work? Read more…

Cloudera Articulates a ‘Data Hub’ Future for Hadoop

The evolution of Hadoop from an overflow parking lot for data into a field of analytic dreams is unfolding right before our eyes. Among the vendors trying to help the elephant along is Cloudera, which used the Strata + Hadoop World conference this week to lay out its plans to remake Hadoop as a centralized "data hub" for enterprises. The firm also launched betas for its Hadoop 2 distributions, a partnership with the company behind Apache Spark, and a new cloud program for partners. Read more…

HDP 2.0: Rise of the Hadoop Data Lake

Hortonworks became the first Hadoop distributor to ship the new Hadoop version 2 software today when it announced the general availability of Hortonworks Data Platform (HDP) 2.0. The update will enable customers with small Hadoop clusters to upgrade their big data platform into a shared Hadoop service, or a data lake, a Hortonworks executive explains. Read more…

Hortonworks Reaches Out to SAS and Storm

Hortonworks this week revealed a new partnership with SAS that will enable the analytics giant to use its tools to analyze data stored in Hortonworks' Hadoop distribution. It also announced plans to integrate the Apache Storm stream processing engine into its distribution, and to ship a preview by the end of the year. Read more…

A Tale of Two Hadoop Journeys

Hadoop brings different things to different companies. For some, the Hadoop platform provides a great starting point to begin analyzing large data sets. But for established companies, Hadoop often displaces existing investments in data warehousing and business intelligence tools. Read more…
