Tag: Hadoop

TPC Crafts More Rigorous Hadoop Benchmark From TeraSort Test

Aug 18, 2014

While Moore’s Law has made computing and storage capacity less expensive with each passing year, the amount of data that companies are storing and the number and sophistication of the algorithms that they want to employ on that data to perform analytics are growing faster than prices are dropping. And that means the bang for the buck of the underlying hardware and the analytics software that runs atop it matters. The trouble is that benchmarking systems takes far too Read more…

AMPLab’s Tachyon Promises to Solidify In-Memory Analytics

Aug 14, 2014

UC Berkeley’s AMPLab first landed on the radar screens of data scientists with Apache Spark, which promises to provide an in-memory data processing framework to replace or augment MapReduce. More recently, the tech whizzes at AMPLab have whipped up Tachyon, a new distributed file system that sits atop HDFS and aims to allow multiple Hadoop or Spark applications and jobs to access the same data at memory speeds without fear of corrupting it. The rapid rise of Apache Spark demonstrates Read more…

Here’s Another Option for Hadoop Enterprise Search

Aug 8, 2014

The software stacks of many Hadoop distributions feature Apache Lucene and Solr as the enterprise search component. But the folks at the French firm Sinequa say Hadoop customers will get more actual work done–and quickly analyze massive amounts of poly-structured data from dozens of other sources in multiple languages–by using its enterprise search solution. Hadoop, machine learning algorithms, and graph databases may get most of the headlines in our big data world, but good old search engines continue to be Read more…

Dremel Builder Gets $7M for SQL-Based Supertool

Aug 5, 2014

Big data startup Metanautix emerged from stealth mode today by announcing a $7-million round of venture funding to further development of a SQL-based power tool. Led by the former Google engineer who headed the development of Dremel, the company aims to dissolve product and technology barriers by “re-imagining” SQL at the heart of an emerging big data supply chain. SQL is enjoying a renaissance as the big data boom continues to reverberate throughout the IT and business sectors. While emerging Read more…

Are Data Lakes All Wet?

Aug 4, 2014

Enterprise data management platforms known as “data lakes” are being promoted as, among other things, a potential solution to “information silos” by combining different managed collections of data in an unmanaged data lake. The theory is that data consolidation will increase the use and sharing of information while reducing storage and server costs. However, a new market study dismisses most of those claims as a “fallacy,” arguing instead that enterprises still require secure data repositories, in other words, data warehouses. At Read more…

How Streaming Analytics Helps Telcos Overcome the Data Deluge

Jul 30, 2014

Real-time streaming analytics is all the rage these days, as organizations seek to wring value from their data as quickly as possible. While the technology is bleeding edge for many, it’s commonplace in the telecommunications industry, where vendors like Guavus are leveraging the power of Hadoop and streaming analytics to help telcos not only survive the data deluge, but thrive within it. Things are a bit different in telecommunications. While companies in other fields may experiment with new technologies, tier-one Read more…

Enforcing Hadoop SLAs in a Big YARN World

Jul 23, 2014

The Apache Hadoop community has done a truly amazing job developing a scalable and versatile platform for big data analytic workloads. And with the recent introduction of YARN in Hadoop 2, we’re now able to run multiple analytic engines on our clusters simultaneously. Unfortunately, the potential for resource contention has also gone up, and that will likely increase demand for service level agreement (SLA) enforcement. YARN made its big introduction just as companies started to move their Hadoop deployments out Read more…

Teradata Acquires Revelytix, Hadapt

Jul 22, 2014

Teradata Corp., the analytic data platform vendor, said it has expanded its big data portfolio with a pair of recent acquisitions. Teradata, based in Dayton, Ohio, said July 22 it has acquired the assets of Revelytix, an information management specialist, along with big data technologists and intellectual property from Hadapt. The Revelytix deal was completed on July 16; the Hadapt acquisition on July 17, Teradata said. Terms of the two acquisitions were not disclosed, the company said, because they are Read more…

HP Throws Trafodion Hat into OLTP Hadoop Ring

Jul 14, 2014

Hewlett-Packard last month quietly unveiled Trafodion, an ANSI-compliant relational SQL database that’s now available as an open source product. With two decades of development at HP and the new capability to run on top of HBase, Trafodion could provide a big boost to efforts to run transactional workloads on Hadoop. The database technology behind Trafodion (which is Welsh for “transaction”) has been around for a long time at Hewlett-Packard, but it was dancing perilously close to the waste bin of Read more…

Where Does Spark Go From Here?

Jul 11, 2014

The excitement behind Apache Spark reached an apex last week during the 2014 Spark Summit put on by Databricks, the company behind the in-memory analytics phenomenon. With a large community of users and growing support from software vendors, the future for Spark certainly appears bright. But there’s a large amount of work ahead to fulfill the promise of Spark, including hardening various components. Providing an easier-to-use alternative to MapReduce is the first use case for Spark, which is said to Read more…