
Features

Big Data’s Dirty Little Secret

Jul 2, 2015

The twin phenomena of big data and machine learning are combining to give organizations previously unheard-of predictive power to drive their businesses in new ways. But behind the big data headlines that tease us with tales of amazing insight and business optimization lurks an inconvenient truth: raw data is very dirty and requires an enormous amount of effort to clean. Data scientists are undoubtedly the rock stars of the big data movement, as they use their keen understanding of Read more…

DDN Tackles Enterprise Storage Needs as ‘Wolfcreek’ Looms

Jun 30, 2015

When it comes to keeping supercomputers fed with data, there are few storage makers that can keep up with DataDirect Networks. But increasingly, DDN is feeling pressure from enterprises that are struggling to keep up with the ongoing data explosion and mixed I/O workloads. That’s where DDN’s forthcoming high-end storage array for the broader enterprise market, codenamed “Wolfcreek,” comes into play. Wolfcreek is DDN’s next-generation converged architecture for enterprise customers. The system borrows technology from DDN’s SFA12k line of Read more…

One on One with LinkedIn’s VP of Engineering

Jun 29, 2015

Why are data scientists tripping over themselves to get their hands on LinkedIn’s data? What’s it like to run one of the world’s biggest social media sites, and how can machine learning algorithms contribute to the creation of economic opportunity for a global workforce? We recently posed those questions (and more!) to Igor Perisic, Vice President of Engineering at LinkedIn. Alex Woodie: Igor, thank you for agreeing to this interview. First, please tell us about yourself and your role at Read more…

Data Lake Showdown: Object Store or HDFS?

Jun 23, 2015

The explosion of data is causing people to rethink their long-term storage strategies. Most agree that distributed systems, one way or another, will be involved. But when it comes down to picking the distributed system–be it a file-based system like HDFS or an object-based file store such as Amazon S3–the agreement ends and the debate begins. The Hadoop Distributed File System (HDFS) has emerged as a top contender for building a data lake. The scalability, reliability, and cost-effectiveness of Hadoop Read more…

Hortonworks Tightens Up Its Distro for Enterprise Adoption

Jun 11, 2015

Earlier this week, Hortonworks unveiled the first new release of its Hadoop distribution in six months. With Hortonworks Data Platform (HDP) 2.3, the company is focusing on strengthening security, governance, and operations, and just generally making Hadoop easier and more visual to use. A lot has transpired since Hortonworks shipped HDP 2.2 last December, including the company’s IPO in December, the formation of the controversial Open Data Platform (ODP) industry consortium in February, the opening of a new headquarters in Read more…

News In Brief

Kyvos Debuts OLAP for Hadoop

Jun 30, 2015

Many technology pros view OLAP as a legacy technology, a holdover from the days of data warehousing that doesn’t have a place in today’s big data world. But several startups are fighting to change that perception, including Kyvos Insights, which today unveiled its OLAP-on-Hadoop solution. Twenty years ago, online analytical processing (OLAP) was the center of many enterprise data warehouse (EDW) initiatives. The technology, which is largely synonymous with the term “multi-dimensional database,” gave organizations a way to pre-index and Read more…

Zettaset Patents Data Access Approach

Jun 26, 2015

Big data security specialist Zettaset said it has been awarded a U.S. patent for a technique designed to boost data access and performance in distributed computing frameworks like Hadoop and NoSQL. Zettaset said this week the U.S. Patent and Trademark Office issued a patent for its DiamondLane technology on June 23. The U.S. patent covers “distributed storage medium management for heterogeneous storage media in high availability clusters” (U.S. Patent No. 9,063,939-B2). Zettaset, Mountain View, Calif., said DiamondLane could be used Read more…

Lockheed Martin, Data Vendors Team on Secure Spy Database

Jun 25, 2015

Geospatial intelligence is among the hottest and most data-intensive tools being used by U.S. military analysts to sweep up huge amounts of satellite and other sensor imagery. This highly classified data is often combined with other emerging intelligence sources like social media. Much of the satellite imagery is highly classified to keep capabilities like image resolution, and operational details like the spectrum frequencies being used, away from prying eyes. Hence, there is a growing need for advanced databases with multiple levels of Read more…

Software AG Platform Targets Predictive Analytics

Jun 23, 2015

Software AG is adding high-end predictive analytics tools to its Digital Business Platform as it looks to extend the reach of the flagship analytics platform across retail, manufacturing, financial services and other industry sectors. The German software vendor said Tuesday (June 23) it has embedded Adaptive Decision and Predictive Analytics (Adapa) from San Diego-based Zementis Inc. into its Apama Streaming Analytics platform in a bid to offer enterprises a “one-stop shop” for business intelligence and analytics. Adapa is billed as Read more…

Analytics Arms Race: Cards Accused of Hacking Astros’ Database

Jun 16, 2015

Baseball has thrived for decades on an early form of data analytics known as sabermetrics. The reliance on data analysis has only grown in the years since Billy Beane of the Oakland Athletics pioneered the use of statistical analysis, or what popularly came to be known as Moneyball. Now it appears that professional baseball’s arms race to stay one step ahead of the competition via analytics has taken a troubling turn: The National League’s St. Louis Cardinals, among Read more…

This Just In

SAS Factory Miner Launched

Jun 17, 2015

CARY, N.C., June 17 — With a well-documented shortage of skilled data scientists, a logical response is to boost the productivity of the ones we have. Analytics leader SAS now offers data scientists fast, automated creation of analytics models. These models can help a retailer identify the best customers for a marketing campaign, for example, or a health insurer uncover fraudulent claims. Part of a broad update to SAS Analytics, SAS Factory Miner software maximizes productivity of data science teams. Users work Read more…

Novetta Releases Entity Analytics v2.5

Jun 9, 2015

MCLEAN, Va., June 9 — Novetta today released Novetta Entity Analytics version 2.5, an enhanced entity resolution and data analysis application that delivers enhanced customer analytics, security threat assessments, and fraud detection capabilities. Novetta Entity Analytics version 2.5 streamlines data integration, entity resolution, and relationship analysis processes to make it easy for even novice users to combine and resolve disparate data sets and explore the underlying content and context. “Novetta Entity Analytics v2.5 has new visualization tools and other features designed Read more…

Dataguise Introduces Big Data Security Suite

Jun 9, 2015

FREMONT, Calif., June 9 — Dataguise, the leading provider of data-centric discovery and data protection for Hadoop and other Big Data environments, today announced the release of DgSecure version 5.0. The latest generation DgSecure platform allows businesses to scale sensitive data discovery, automate data protection, and achieve a 360-degree view of their sensitive data assets across Big Data and traditional data repositories both on-premise and in popular cloud platforms. DgSecure version 5.0 includes significant enhancements, including the addition of industry-first security Read more…

BlueTalon Policy Engine 2.0 Now Available

Jun 9, 2015

SAN JOSE, Calif., June 9 — BlueTalon, provider of unmatched data security solutions for Hadoop, today announced availability of the BlueTalon Policy Engine 2.0. The technology ushers in a new era of truly secure Hadoop clusters by solving the single biggest source of data breaches – lack of fine-grained data access controls – to make Hadoop security on par with or better than traditional enterprise data warehouse security. Forty-two percent of all data breaches are traced back to inadequate data Read more…

DataTorrent Joins the Open Data Platform Initiative, Brings Real-Time Processing of Data-In-Motion to ODP

May 21, 2015

SANTA CLARA, Calif., May 21 – DataTorrent, the leader in real-time big data analytics and creator of DataTorrent RTS, the world’s first enterprise-grade unified platform for both stream and batch processing on Hadoop, today announced it joined the Open Data Platform (ODP) as a Silver member. In addition, DataTorrent RTS is certified and immediately available on the ODP Core. DataTorrent is focused on reducing time to market, increasing ROI and enabling success of big data projects. Joining ODP brings DataTorrent’s Read more…