Tag: HDFS

Does InfiniBand Have a Future on Hadoop?

Hadoop was created to run on cheap commodity computers connected by slow Ethernet networks. But as Hadoop clusters get bigger and organizations press the upper limits of performance, they're finding that specialized gear Read more…

Data Lake Showdown: Object Store or HDFS?

The explosion of data is causing people to rethink their long-term storage strategies. Most agree that distributed systems, one way or another, will be involved. But when it comes down to picking the distributed system-- Read more…

JethroData Indexes Its Way to $8.1M SQL Payday

There's no shortage of SQL-on-Hadoop projects in the works. Hive, Impala, Presto, HAWQ—they all tout a certain advantage. But they all look mostly the same to Eli Singer, the CEO of JethroData, which today landed $8.1 Read more…

Accelerating Hadoop® Workflows to Yield Greater Application Efficiency

As enterprise-critical decision support fully embraces big data, confusion has grown on how to best satisfy increasing demand for ever larger data analytics. Some have questioned whether Hadoop will continue to reliably Read more…

MapR Says Its Hadoop Tweaks Scale to Meet IoT Volumes

Hadoop is naturally positioned as a key architectural platform that organizations will turn to for analyzing Internet of Things (IoT) data. But according to the folks at MapR Technologies, limitations in plain vanilla Ap Read more…

Facebook Adds Another 9 to HBase Availability

As one of the largest users of Hadoop in the world, Facebook knows a thing or two about running the big data platform for a high degree of availability. Last week, the social media giant's engineering team explained how Read more…

Top 10 Netflix Tips on Going Cloud-Native with Hadoop

Four years ago Netflix made the decision to move all of its data processing--everything from NoSQL and Hadoop to HR and billing--into the cloud. While going "cloud native" on Amazon Web Services hasn't been without its challenges, the move has benefited Netflix in multiple and substantial ways. Here are 10 tips from Netflix on making the cloud work. Read more…

What Can GPFS on Hadoop Do For You?

The Hadoop Distributed File System (HDFS) is considered a core component of Hadoop, but it’s not an essential one. Lately, IBM has been talking up the benefits of hooking Hadoop up to the General Parallel File System (GPFS). IBM has done the work of integrating GPFS with Hadoop. The big question is, What can GPFS on Hadoop do for you? Read more…

Has Dirty Data Met Its Match?

One of the dirty little secrets about big data is the amount of manual effort it takes to clean the data before it can be analyzed. You may have the best and brightest data scientists on your team, but unless you liberate them from the drudgeries of digital janitorial work, you aren't getting their best work. Today, the data cleansing startup Trifacta launched its first product aimed at relieving data professionals of the burden posed by traditional data cleansing processes. Read more…

Facebook Molds HDFS to Achieve Storage Savings

The Hadoop file system, HDFS, has come under a lot of fire over the last year as various corners of the industry have maligned it for some of its perceived limitations. Big data pioneer and social giant Facebook has seen fit to tackle its big data growth with HDFS and says it is reaping the financial rewards in the form of storage savings. Read more…

ScaleOut Building Real-Time Bridge to Hadoop with hServer

The real-time operational world can benefit from Hadoop’s power to analyze large data sets in parallel, says in-memory data grid company ScaleOut Software, which today launched a new platform called hServer that the company says bridges real-time analytics with Hadoop. Read more…

Shared Infrastructure: Using Proven HPC Products for Big Data

Big Compute (HPC) and Big Data share common architectural constructs: for example, both commonly use commodity hardware that is tied together in a cluster and shared among users. Leveraging the capabilities that are proven in HPC to create a shared Big Data infrastructure is not only possible, it is becoming a requirement and is on the community’s roadmap. Why wait? Avoid the cost of purchasing an expensive stand-alone Hadoop cluster and save money with Big Data shared infrastructure from Univa today. Read more…

Intel Hitches Xeon to Hadoop Wagon

No longer content to sit on the sidelines of major Hadoop events, Intel has unveiled its strategy to tap into the momentum around the open source platform with its own purpose-built distribution. Read more…

Apache Hadoop 2.0.3-Alpha Released With Future Outlook

The next generation of the Apache Hadoop open-source software framework has been given an alpha release and set free in the wild, delivering the next major milestone for the Apache Hadoop community. Read more…

Panasas Gets Real About Hadoop

The realms of HPC and enterprise big data have been thrust together via the tectonic force of the Hadoop push, but according to Panasas CTO, Brent Welch, it leaves a deep chasm for many use cases that can only be filled by rethinking storage approaches. More specifically, Welch believes high performance network attached storage (NAS) can Read more…

Cloudera Runs Real-Time with Impala

This week at Strata Hadoop World we sat down with Cloudera CEO Mike Olson to talk about the company's recently open-sourced effort to add a real-time aspect to Hadoop, thus taking it beyond its batch-only roots. We also hit on emerging issues in the ecosystem, including how their competitive advantages stack up against others in the... Read more…

Quantcast Opens Exabyte-Ready File System

When ingesting data volumes that tip the multi-exabyte-per-year scale, even the most robust systems for managing scale can start to crumble under their own weight. When the cost and complexity of this volume mount, what’s a growing company to do but look beyond the buzz for its own solutions? ... We speak with Quantcast CEO and VP of R&D about the.... Read more…

Fujitsu Puts Proprietary Twist on Hadoop

Fujitsu has released a new twist on its ever-expanding big data theme. With the release of Interstage Big Data Parallel Processing Server V1.0 today, they claim the ability to simplify massive Hadoop deployments via the injection... Read more…

StackIQ Update Rocks Big Data

Cluster management company StackIQ has released the latest version of Rocks+ with a keen eye on big data systems. StackIQ included support for major Hadoop distributions as well as refinements to handle big infrastructure. Read more…