
Tag: apache

Architect’s Guide for Selecting Scalable Data Layer Technologies

Architects and CTOs must be especially considerate in their evaluation of how a specific technology solution delivers value, and careful of any added complexity or other drawbacks a solution may bring with it. In Read more…

Cloudera Commits to 100% Open Source

The old Cloudera developed and distributed its Hadoop stack using a mix of open source and proprietary methods and licenses. But the new Cloudera will be 100% open source, just like Hortonworks, its one-time Hadoop rival Read more…

Open Source is Now a Big Data Service

Open source technologies continue to make headway across a range of industries undergoing digital conversions. The big data sector has of course led the way with a growing list of Apache Foundation projects ranging from Read more…

ODPi Tackles Hive with Latest Hadoop Runtime Spec

ODPi today unveiled the second major release of its Runtime Specification, which is aimed at setting a standard for Hadoop components to ensure greater interoperability among distributions and third-party products. New add Read more…

Apache Takes Storm Into Incubation

On Wednesday night, Doug Cutting, Director for the Apache Software Foundation (ASF), announced that the organization will add the distributed real-time computation system known as Storm as the foundation's newest Incubator podling. Read more…

Putting Some Real Time Sting into Hive

A coalition of Hive community enthusiasts reports that it has achieved a 45x performance increase for Apache Hive through an effort it has branded “The Stinger Initiative.” The group says it is aiming for a 100x improvement. Read more…

Apache Hadoop 2.0.3-Alpha Released With Future Outlook

The next generation of the Apache Hadoop open-source software framework has been given an alpha release and set free in the wild, delivering the next major milestone for the Apache Hadoop community. Read more…

BioInformatics: A Data Deluge with Hadoop to the Rescue

Apache Hadoop-based massively parallel processing is well suited to address many challenges in the growing field of BioInformatics. BioInformatics is not a “spectator sport”; this article explains how to get started via hands-on experience with the FDA Adverse Event Reporting System (FAERS). Read more…

Searching Big Data’s Open Source Roots

The face of search has changed dramatically since the first days of Google and other search engines, thanks to many widely used open source technologies that enable complex queries across vast sets of multi-structured data. This week we talk with Apache Mahout, Lucene and Solr guru Grant Ingersoll, now Chief Scientist at LucidWorks, about what has... Read more…

Pentaho Stirs Open Source Kettle

This week, open source business intelligence vendor Pentaho moved the code that powers the latest release of its Kettle offering to an Apache 2.0 license, strengthening ties to Hadoop and related projects under the same license.... Read more…
