Big Data • Big Analytics • Big Insight

Features

The Rise of Predictive Modeling Factories

Feb 9, 2015

So you installed Hadoop and built a data lake that can store petabytes of data. Now what? According to leaders in predictive analytics, the best thing you can do is to build a predictive model factory that takes much of the drudgery out of running machine learning algorithms at scale. “Every data lake needs a predictive modeling factory,” says SriSatish Ambati, the co-founder and CEO of H2O, a developer of in-memory machine learning technology. “Predictive analytics as a whole is Read more…

Three Ways Big Data and HPC Are Converging

Jan 27, 2015

Big data is becoming much more than just widespread distribution of cheap storage and cheap computation on commodity hardware. Big data analytics may soon become the new “killer app” for high performance computing (HPC). There is more to big data than large amounts of information. It also pertains to massive distributed activities such as complex queries and computations (a.k.a. analytics). In other words, deriving value through computation is just as “big” as the size of the data sets themselves. In Read more…

Rethinking Hadoop for HPC

Jan 26, 2015

Hadoop’s momentum has caught the eye of those in the high performance computing (HPC) community, who want to participate in and benefit from the fast pace of development. However, the relatively poor performance and high latency of Hadoop applications are a real concern. To address the problem and make Hadoop a better fit for HPC resources, some are exploring how they can rewrite certain components of Hadoop in a more HPC-like manner. Those in the HPC world look at what’s happening Read more…

What 2015 Will Bring for Big Data

Jan 5, 2015

There’s no denying that 2014 was a big year in big data. The rapid maturation of technologies like Hadoop and Spark, along with the continual explosion of all types of data, led to an awakening of the potential of distributed data analytics among organizations, and also fueled a fresh eruption of venture money into startups. With that backdrop, the prospects for the new year certainly look promising, but what new analytic technologies and techniques will resonate in 2015? Datanami contacted Read more…

‘What Is Big Data’ Question Finally Settled?

Oct 29, 2014

People have been debating what “big data” means ever since the term appeared in the lexicon. Researchers at Cal Berkeley recently peppered dozens of prominent data scientists and industry leaders with the question in hopes of settling the big question once and for all. You’ve undoubtedly heard many definitions for big data over the years. For some, you have big data when it can’t be stored on a single computer, or it’s the combination of volume, velocity, and variety (some Read more…

News In Brief

Project Myriad Brings Hadoop Closer to Mesos

Feb 12, 2015

One of the challenges of running Hadoop is resource management. The process of spinning up and managing hundreds, if not tens of thousands, of server nodes in a Hadoop cluster (and spinning them down, moving them, and so on) is way too hard to do manually. Automation must come to the table to help Hadoop take the next step forward in its evolution. The big question is how it will unfold. One answer to that question came to the forefront yesterday when a Read more…

Oracle Rejiggers Exadata for Emerging In-Memory Workloads

Jan 22, 2015

Organizations that adopt the latest generation of Oracle’s engineered systems will have more flexible configuration and licensing options available to them than with previous generations. That will make it more cost effective to run emerging in-memory workloads, such as operational analytics, the company says. At a live launch event in Redwood City, California, Oracle chairman and CTO Larry Ellison unveiled the X5 generation of its various all-in-one server platforms. The Exadata Database Machine is headlining this new generation, and backed up Read more…

Hadoop RDBMS Ready for Primetime

Nov 19, 2014

Targeting database architects and application developers, Splice Machine Inc. announced general availability this week of its Hadoop database that had been in beta testing since May. The Hadoop relational database management system is designed to reduce the costs of developing real-time applications by, for example, eliminating the need for expensive proprietary hardware. The relational database for operational applications is being positioned as a replacement for Oracle and MySQL databases that might be running out of steam or are becoming too Read more…

Data Science Helps Troubleshoot the Datacenter

Oct 29, 2014

Data science is being enlisted to help troubleshoot scaled-up IT infrastructure. A data science platform unveiled this week by startup BigPanda seeks to use analytics to automate the increasingly complicated task of IT incident management. BigPanda said its platform analyzes the steady stream of daily IT alerts and essentially triages them into high-level alerts. The startup based in Mountain View, Calif., also announced a $7 million Series A funding round that will be used to accelerate data science product development. Read more…

Machine Learning Gets a Boost from Google

Oct 23, 2014

Search giant Google announced a partnership with Oxford University researchers that will target artificial intelligence applications such as image recognition and natural language understanding. Google said U.K. researchers with its recently acquired DeepMind initiative would work with Oxford AI specialists who earlier this year cofounded Dark Blue Labs. The cofounders, Nando de Freitas, Phil Blunsom, Edward Grefenstette and Karl Moritz Hermann, are considered leading experts in the use of “deep learning” for machine understanding of natural language. The researchers “will Read more…

This Just In

Penguin Computing Announces Scyld ClusterWare for Hadoop

Feb 25, 2015

FREMONT, Calif., Feb. 25 — Penguin Computing, a provider of high performance, enterprise data center and cloud solutions, today announced Scyld ClusterWare for Hadoop, adding greater capability to the company’s existing Scyld ClusterWare high performance computing cluster management solution. “Scyld ClusterWare is ideal for managing HPC and Hadoop workloads for big data customers,” said Victor Gregorio, Vice President and General Manager of Cloud Services, Penguin Computing. “ClusterWare is the genesis of Linux-based supercomputing and represents the evolution of HPC using Hadoop Read more…

PHG Utilizing Tokutek’s TokuMX

Dec 2, 2014

LEXINGTON, Mass., Dec. 2 — Tokutek, delivering database performance at scale, today announced that Performance Horizon Group (PHG), a leading provider of partner marketing solutions for enterprises, has achieved significant capacity and performance improvements with TokuMX, the Tokutek distribution of MongoDB. Tokutek also has published a full case study “PHG Meets Capacity and Performance Challenges with TokuMX.” Performance Horizon Group provides a world-class affiliate marketing and partner management platform that enables large enterprises to connect directly with their online and mobile publishers at scale, Read more…

Cray Adds Cloudera Enterprise to Urika-XA System

Nov 19, 2014

SEATTLE, Wash. and BARCELONA, Spain, Nov. 19 — At the 2014 Strata + Hadoop World Barcelona conference, global supercomputer leader Cray Inc. today announced that Cloudera Enterprise is now pre-integrated into Cray’s new big data analytics appliance — the Cray Urika-XA system. Cloudera Enterprise, the foundation for Cloudera’s enterprise data hub offering, includes Cloudera’s complete and tested distribution of Apache Hadoop. It is the most widely adopted Hadoop-based platform in the world, with Cloudera being the most prolific contributor to the open source Read more…

Seagate Announces ClusterStor Hadoop Workflow Accelerator

Nov 17, 2014

CUPERTINO, Calif., Nov. 17 — Seagate Technology plc, a world leader in storage solutions, today announced availability of the ClusterStor Hadoop Workflow Accelerator, a new solution providing the tools, services, and support for High Performance Computing (HPC) customers who need the best performing storage systems for Big Data Analytics. The Hadoop Workflow Accelerator is a set of Hadoop optimization tools, services and support that leverages and enhances the performance of ClusterStor™, the market leading scale-out storage system, designed for Big Data Read more…

X-ISS Announces New Release of DecisionHPC Analytics Software

Nov 11, 2014

HOUSTON, Tex., Nov. 11 – eXcellence in IS Solutions Inc. (X-ISS), a provider of High Performance Computing and Big Data solutions, today announced Version 14.2 of its DecisionHPC business analytics software. New functionality gives DecisionHPC clients greater customization, improved usability, and easier access to critical cluster performance information. X-ISS will officially release DecisionHPC 14.2 by year end and will be demonstrating several of the new capabilities in booth #3760 at SC14, an international conference of the HPC industry being held November 16-21, Read more…