Tag: Hadoop

Crypto Tools Target Hadoop Security Gaps

Jul 31, 2015

Growing concerns about the lack of built-in security for open source databases such as Hadoop have created a need for tighter data security as these databases are scaled up to perform big data analytics. A data encryption and key management tool released this week by cyber-security specialist Thales e-Security Inc. and big data security vendor Zettaset targets open source big data distributions of Hadoop and NoSQL. The standards-based key management appliance and accompanying encryption software are also intended to help Read more…

Big Data, Big Misnomer

Jul 22, 2015

Outside of the data warehouse profession, the phrase “Big Data” is still widely misunderstood. Even Google is confused: a Google search for “define ‘big data’” returns the definition: “extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.” But on a modern laptop with Excel you can easily analyze hundreds of millions of rows of data. Google (a “Big Data” pioneer) should know better. This article looks at Read more…

Univa Gives ‘Pause’ to Big Data Apps

Jul 14, 2015

Scheduling workloads on today’s big analytic clusters can be a big challenge. Your team may have everything carefully lined up, only to have a last-minute change leave your schedule in shambles. One company that’s close to a solution is Univa, which today announced the addition of its “preemption” feature that allows admins to momentarily “pause” workloads so they can run a higher-priority application. Historically, admins were loath to stop HPC or big data jobs before they ended, because it could Read more…

Microsoft Unites Analytics Under ‘Cortana,’ Adds Spark Support

Jul 13, 2015

Microsoft today unveiled a new suite of hosted analytic services named after Cortana, the software giant’s personal assistant software for smartphones. Meanwhile, the company is prepping to begin support for Apache Spark on Azure, its public cloud platform. If you have a Windows-based smartphone, you’re probably familiar with Cortana, the smooth-talking female voice that does various digital chores on your behalf. The new Cortana Analytics Suite takes the personal assistant role and injects big data into it. The new suite Read more…

Python Versus R in Apache Spark

Jul 13, 2015

The June update to Apache Spark brought support for R, a significant enhancement that opens the big data platform to a large audience of new potential users. Support for R in Spark 1.4 also gives users an alternative to Python. But which language will emerge as the winner for doing data science in Spark? We spoke to Databricks’ Ali Ghodsi for answers. According to Ghodsi, who is Databricks’ vice president of engineering and product management, the company has been bombarded Read more…

Teradata Supports CDH and HDP with New Hadoop Appliance

Jul 9, 2015

Teradata today announced that customers can get its Hadoop Appliance pre-loaded with a distribution from either Cloudera or Hortonworks. The fifth generation of the analytics giant’s appliance also features more configuration options, including different types of nodes designed to run different workloads. Even though Hadoop was designed to run on low-cost, commodity Lintel servers that most IT folks are familiar with, some customers still don’t want to deal with the cost and hassle of procuring their own cluster to get Read more…

Solving Hadoop Problems, For Fun and Profit

Jul 6, 2015

Things move quickly in the Hadoop world, and keeping up can be hard to do. Just ask Chris Wensel, the creator of the popular open source development tool Cascading and CTO at Concurrent. While Wensel spends many hours keeping Cascading current with every Hadoop release as a service to the community, he’s got bigger fish to fry solving production Hadoop problems in enterprise accounts. “I spend a lot of CPU cycles and dollars on Amazon testing Cascading on every vendor Read more…

Big Data’s Dirty Little Secret

Jul 2, 2015

The twin phenomena of big data and machine learning are combining to give organizations previously unheard-of predictive power to drive their businesses in new ways. But behind the big data headlines that tease us with tales of amazing insight and business optimization lurks an inconvenient truth: raw data is very dirty and requires an enormous amount of effort to clean. Data scientists are undoubtedly the rock stars of the big data movement, as they use their keen understanding of Read more…

Inside WebTrends’ Big Data Analytics Pipeline

Jul 1, 2015

WebTrends has been collecting and analyzing Web data on behalf of its customers since it was founded way back in 1993. Considering the exponential growth of the Net since then, it’s not a stretch to say WebTrends was doing big data before big data was a “thing.” But following the recent creation of a data analytics pipeline built with technologies like Hadoop, Spark, and Kafka, the company is taking its big data analytic services to a whole new level. WebTrends Read more…

Kyvos Debuts OLAP for Hadoop

Jun 30, 2015

Many technology pros view OLAP as a legacy technology, a holdover from the days of data warehousing that doesn’t have a place in today’s big data world. But several startups are fighting to change that perception, including Kyvos Insights, which today unveiled its OLAP-on-Hadoop solution. Twenty years ago, online analytical processing (OLAP) was the center of many enterprise data warehouse (EDW) initiatives. The technology, which is largely synonymous with the term “multi-dimensional database,” gave organizations a way to pre-index and Read more…