Features

Building a Better (Google) Earth

Dec 16, 2014

About 10 years ago, the folks at Google finished indexing the Internet and turned their attention to indexing planet Earth. The resulting product, Google Earth, amazed nearly everybody who used it. But for individuals with geospatial backgrounds, the bubblegum and baling wire holding the product together signaled there had to be a better way. Andrew Rogers was one of the Google engineers who worked on the first iterations of Google Earth. He helped look for efficient ways to add “layers” Read more…

Hadoop Hits the Big Time with Hortonworks IPO

Dec 12, 2014

The founders and investors in Hortonworks got an early Christmas present today when the company raised about $100 million in its first day of trading on the NASDAQ exchange. As the first Hadoop distributor to go public, Hortonworks’ IPO will shine the spotlight on the fast-growing Hadoop software stack and ecosystem. While Hadoop is a household word for those working in the big data bubble, it’s a foreign concept to people living in the real world. That could change in Read more…

Apache Spark and Java 8: The Big Data Team for 2015

Dec 11, 2014

Apache Spark with Java 8 is proving to be the perfect match for Big Data. Spark 1.0 was just released this May, and it’s already surpassed Hadoop in popularity on the Web. Java 8, the latest version, came out in March and is spreading fast: As of October, a survey from Typesafe showed that two-thirds of developers had switched to Java 8 or were planning to switch soon, faster adoption than for earlier versions. Ten years ago, Hadoop set the Read more…

NoSQL Database Scales to New Heights

Dec 10, 2014

Distributed NoSQL databases were designed to scale beyond what relational databases can do. That’s nothing new. But when NoSQL database vendor FoundationDB today announced that its latest version 3 database was clocked handling almost 15 million random writes per second, it makes you wonder just how much scalability we might need. The first two releases of the FoundationDB database, which is based on a key-value store engine, maxed out at about 400,000 random writes per second, according to FoundationDB co-founder and Read more…

Unlocking Business Insights in Timecards

Dec 9, 2014

When it comes to big data analytics in the world of human capital management, most of the attention has focused on the hiring process. But companies are now starting to use advanced analytics to inspect the performance of the workforce after it’s been set. The source of the data? The humble timecard. It turns out there’s a wealth of potentially valuable data contained in timecards. The good stuff is not the obvious line item that shows Jane Smith worked 37 Read more…

How Big Data Is Remaking Customer Loyalty Programs

Dec 8, 2014

Retailers spend about $2 billion every year to build and run loyalty card programs in the hopes of creating lifelong, devoted customers. However, those loyalty programs often fail to deliver as advertised. But now, advanced analytic techniques running on big data platforms like Hadoop promise to help retailers get closer than ever to realizing their “one-to-one” marketing dreams. Part of the problem with traditional loyalty programs is the lack of good, clean data. When people sign up for programs, they Read more…

Embedding Data Quality and Stewardship Into Your Information Management Processes

Dec 5, 2014

Although established as a best practice in information management, data quality is a relatively young discipline. Unsurprisingly, its roots lie in situations where the data needing to be managed was very heterogeneous. Take the case of business intelligence and data warehousing: the goal is to derive new perspectives and insights by sourcing and integrating information from multiple systems. This is often a more painful exercise than initially expected because the business is attempting to use data for an entirely different Read more…

Transforming PostgreSQL into a Distributed, Scale-Out Database

Dec 4, 2014

PostgreSQL users who were considering adopting a distributed NoSQL database like MongoDB or Cassandra to gain scalability benefits for big data may want to think twice about that approach following today’s launch of new software that allows PostgreSQL to scale out horizontally, just like the NoSQL databases do. PostgreSQL is one of the most popular relational databases, with millions of implementations over its 28-year history. Backers of the open source software have kept it relevant in the big data era Read more…

Stomping Out Criminal Scams with Hadoop

Dec 3, 2014

The growing technical sophistication of criminals is leading to an arms race to see who can scale more quickly to outmaneuver the other side. Cybercriminals are increasingly adopting hyperscale techniques to help them perpetrate fraud faster and more efficiently than ever before. That’s led the good guys to seek new capabilities of their own, including using Hadoop. One of the promising startups that’s looking to use the power of Hadoop and big data to put the kibosh on fraudsters and Read more…

Why Kafka Should Run Natively on Hadoop

Dec 2, 2014

Apache Kafka has become an instrumental part of the big data stack at many organizations, particularly those looking to harness fast-moving data. But Kafka doesn’t run on Hadoop, which is becoming the de facto standard for big data processing. Now a group of developers led by DataTorrent is addressing that concern with a new project called KOYA, or Kafka on YARN. Getting Kafka into Hadoop would seem to be a no-brainer. After all, the open source message broker software already plays Read more…