Big Data • Big Analytics • Big Insight

Features

Apache Spark and Java 8: The Big Data Team for 2015

Dec 11, 2014

Apache Spark with Java 8 is proving to be the perfect match for Big Data. Spark 1.0 was just released this May, and it’s already surpassed Hadoop in popularity on the Web. Java 8, the latest version, came out in March and is spreading fast: As of October, a survey from Typesafe showed that two-thirds of developers had switched to Java 8 or were planning to switch soon, faster adoption than for earlier versions. Ten years ago, Hadoop set the Read more…

How Big Data Is Remaking Customer Loyalty Programs

Dec 8, 2014

Retailers spend about $2 billion every year to build and run loyalty card programs in the hopes of creating lifelong, devoted customers. However, those loyalty programs often fail to deliver as advertised. But now, advanced analytic techniques running on big data platforms like Hadoop promise to help retailers get closer than ever to realizing their “one-to-one” marketing dreams. Part of the problem with traditional loyalty programs is the lack of good, clean data. When people sign up for programs, they Read more…

Embedding Data Quality and Stewardship Into Your Information Management Processes

Dec 5, 2014

Although established as a best practice in information management, data quality is a relatively young discipline. Unsurprisingly, its roots lie in situations where the data needing to be managed was very heterogeneous. Take the case of business intelligence and data warehousing: the goal is to derive new perspectives and insights by sourcing and integrating information from multiple systems. This is often a more painful exercise than initially expected because the business is attempting to use data for an entirely different Read more…

Stomping Out Criminal Scams with Hadoop

Dec 3, 2014

The growing technical sophistication of criminals is leading to an arms race to see who can scale more quickly to outmaneuver the other side. Cybercriminals are increasingly adopting hyperscale techniques to help them perpetrate fraud faster and more efficiently than ever before. That’s led the good guys to seek new capabilities of their own, including using Hadoop. One of the promising startups that’s looking to use the power of Hadoop and big data to put the kibosh on fraudsters and Read more…

Why Kafka Should Run Natively on Hadoop

Dec 2, 2014

Apache Kafka has become an instrumental part of the big data stack at many organizations, particularly those looking to harness fast-moving data. But Kafka doesn’t run on Hadoop, which is becoming the de facto standard for big data processing. Now a group of developers led by DataTorrent is addressing that concern with a new project called KOYA, or Kafka on YARN. Getting Kafka into Hadoop would seem to be a no-brainer. After all, the open source message broker software already plays Read more…

News In Brief

Can Big Data Help Dispense Justice?

Dec 12, 2014

A debate within the American judicial system is focusing on the growing use of data-driven predictions about future crime risks in shaping sentencing guidelines. According to reports this summer, at least 20 states have adopted so-called “evidence-based sentencing.” Outgoing U.S. Attorney General Eric Holder raised concerns about the trend during the summer, calling on the U.S. Sentencing Commission, which establishes sentencing guidelines for the federal courts, to “study the use of data-driven analysis in front-end sentencing,” then issue policy recommendations. Read more…

Data Companies Work With Citizen Scientists on Climate

Dec 11, 2014

Storage vendor EMC Corp. is joining forces with big data and cloud specialist Pivotal and the EarthWatch Institute in an effort to apply analytical tools to the study of the impact of climate change. The partners, along with Schoodic Institute at Acadia National Park, will study the interactions between nature and climate as part of a broader effort to promote citizen science using big data lakes, analytical tools and visualizations, the partners said. The “Big Data vs. Climate Change” initiative was Read more…

Mayo Clinic Eyes Data to Improve Health Care

Dec 10, 2014

Optum Labs, a health services and technology venture formed by the Mayo Clinic along with one of the nation’s largest health care insurers, is analyzing patient data to identify the best treatments, understand variations in health care and determine the effectiveness of patient care. The collaboration with the Mayo Clinic’s Kern Center for the Science of Health Care Delivery will scour the records of about 149 million patients covered by insurance giant UnitedHealth Group in an attempt to improve patient care Read more…

MapR Claims Momentum as Hadoop Subs Grow

Dec 9, 2014

MapR Technologies said this week that paid subscriptions to its distribution of Apache Hadoop exceeded 700 in November while existing customers increased current subscriptions at an expansion rate of more than 200 percent during its most recent quarter ending Sept. 30. San Jose-based MapR has cashed in on the popularity of the open source framework for handling large-scale, data-intensive deployments with its top-ranked distribution for Apache Hadoop. Along with new subscribers, the company said it measures the growth of its Read more…

Data Tools Aid Open Source Intelligence

Dec 9, 2014

U.S. intelligence agencies and the military are increasingly leveraging analytics platforms based on machine learning to sift through data sources like social media. In the vernacular of the Pentagon, these efforts are generally referred to as open source intelligence initiatives. While the U.S. intelligence community is spending billions of dollars on geospatial intelligence—the analysis and exploitation of imagery and geospatial information—open source efforts focusing on unstructured data like web pages, emails, instant messaging and social media are augmenting those efforts. Read more…

This Just In

Apache Unveils Hadoop 2

Oct 17, 2013

The Apache Software Foundation, which oversees the 150 or so open source projects under the famous Apache umbrella, this week announced Hadoop 2 – the latest version of the popular software framework for distributed computing.