Big Data • Big Analytics • Big Insight

Tag: Hadoop

Hadoop Hits the Big Time with Hortonworks IPO

Dec 12, 2014 |

The founders and investors in Hortonworks got an early Christmas present today when the company raised about $100 million in its first day of trading on the NASDAQ exchange. As the first Hadoop distributor to go public, Hortonworks’ IPO will shine the spotlight on the fast-growing Hadoop software stack and ecosystem. While Hadoop is a household word for those working in the big data bubble, it’s a foreign concept to people living in the real world. That could change in Read more…

Apache Spark and Java 8: The Big Data Team for 2015

Dec 11, 2014 |

Apache Spark with Java 8 is proving to be the perfect match for Big Data. Spark 1.0 was just released this May, and it’s already surpassed Hadoop in popularity on the Web. Java 8, the latest version, came out in March and is spreading fast: As of October, a survey from Typesafe showed that two-thirds of developers had switched to Java 8 or were planning to switch soon, faster adoption than for earlier versions. Ten years ago, Hadoop set the Read more…

How Big Data Is Remaking Customer Loyalty Programs

Dec 8, 2014 |

Retailers spend about $2 billion every year to build and run loyalty card programs in the hopes of creating lifelong, devoted customers. However, those loyalty programs often fail to deliver as advertised. But now, advanced analytic techniques running on big data platforms like Hadoop promise to help retailers get closer than ever to realizing their “one-to-one” marketing dreams. Part of the problem with traditional loyalty programs is the lack of good, clean data. When people sign up for programs, they Read more…

Transforming PostgreSQL into a Distributed, Scale-Out Database

Dec 4, 2014 |

PostgreSQL users who were considering adopting a distributed NoSQL database like MongoDB or Cassandra to gain scalability benefits for big data may want to think twice about that approach following today’s launch of new software that allows PostgreSQL to scale out horizontally, just like the NoSQL databases do. PostgreSQL is one of the most popular relational databases, with millions of implementations over its 28-year history. Backers of the open source software have kept it relevant in the big data era Read more…

Stomping Out Criminal Scams with Hadoop

Dec 3, 2014 |

The growing technical sophistication of criminals is leading to an arms race to see who can scale more quickly to outmaneuver the other side. Cybercriminals are increasingly adopting hyperscale techniques to help them perpetrate fraud faster and more efficiently than ever before. That’s led the good guys to seek new capabilities of their own, including using Hadoop. One of the promising startups that’s looking to use the power of Hadoop and big data to put the kibosh on fraudsters and Read more…

Why Kafka Should Run Natively on Hadoop

Dec 2, 2014 |

Apache Kafka has become an instrumental part of the big data stack at many organizations, particularly those looking to harness fast-moving data. But Kafka doesn’t run on Hadoop, which is becoming the de facto standard for big data processing. Now a group of developers led by DataTorrent is addressing that concern with a new project called KOYA, or Kafka on YARN. Getting Kafka into Hadoop would seem to be a no-brainer. After all, the open source message broker software already plays Read more…

The Land of a Thousand Big Data Lakes

Nov 25, 2014 |

The prospect of storing and processing all of one’s data in an enterprise data lake running on Hadoop is gaining momentum, particularly when it comes to today’s massive unstructured data flows. However, given what we know of technological evolution and human nature itself, the chance of eliminating data silos and centralizing storage and compute is slim in this big-data age. Data lakes make a lot of sense conceptually. Instead of allowing silos to perpetuate, an organization pools all of its resources Read more…

The Aspirational Data Lake Value Proposition

Nov 24, 2014 |

The industry hype around Hadoop and the concept of the Enterprise Data Lake has generated enormous expectations reaching all the way to the executive suite. Yet, when trying to establish a Modern Data Architecture, organizations are vastly unprepared to house, analyze and manipulate the massive quantities of data available. All too often, they believe the only requirement is to download the Hadoop software, install it on a bunch of servers, commence loading the Data Lake, unplug the enterprise Data Read more…

How Big Data Analytics Is Shining a Light on Anonymous Web Traffic

Nov 20, 2014 |

They arrive suddenly at your website with no identification or cookies, browse products for minutes or hours on end, and then leave abruptly without a word. They’re anonymous Web visitors, and they’re the bane of the data-driven marketer. Attempts to catalog these shadowy creatures using traditional techniques often fail. The Web’s Wild West ways have led many people to crank up their privacy and security settings in an attempt to create a bubble of anonymity and protection from peering companies, Read more…

Teradata Has Hadoop Covered with MapR Partnership

Nov 19, 2014 |

Teradata completed a trifecta of sorts today when it announced a strategic partnership with MapR Technologies. The partnership comes on the heels of similar deals Teradata formed with the two other pure-play Hadoop providers, Cloudera and Hortonworks. But this deal is different insofar as what MapR can bring to the table, the new partners say. As part of the deal, Teradata will become a reseller for all of MapR’s products and services, which is similar to the reseller deals it Read more…