August 31, 2012

This Week’s Big Data Big Five

Datanami Staff

This week’s Friday Big Data Top Five delivers news from some of the industry’s top newsmakers and, for this edition, brings items from newer companies that are beginning to make waves. We touch on a new big data management module for HPC and cloud deployments, fresh technology that speeds the design of high-performance hardware for big data processing, new ways to accelerate Hadoop and more…

Without further delay, let’s look first to the clouds…

CloudFuzion adds Big Data Management to its HPC, Cloud Solutions

The CloudFuzion team, which focuses on high performance cloud computing solutions for computational grids and private and public cloud environments, announced that it has released its CloudFuzion Rapid Transit module, designed to significantly accelerate Big Data feeds to and from cloud-based compute clusters and render farms.

Big Data is one of the obstacles to the widespread deployment of cloud-based render farms and computational clusters. It takes time to move the gigabytes of data required to execute a scene render in 3D animation, or to complete the analysis of a section of the electric grid for power management. As business demand for compute power steadily increases, so does the pressure to improve the productivity, reliability and usability of these cloud-based solutions while at the same time reducing costs. CloudFuzion aims to show that the demands external factors like Big Data place on cloud providers can not only be met but far exceeded.

“Moving massive scenes, also known as Big Data, to and from these remote on-line render farms is a basic requirement and can be very challenging. Our CloudFuzion Rapid Transit module handles all movement and encryption of Big Data automatically, and we move a typical 1.5 Gigabyte scene plus assets and media to a remote cloud based render farm in 11 minutes, versus in excess of 1.5 hours using normal communications methods,” commented Mr. Duffy.
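
As a rough, back-of-the-envelope reading of the figures quoted above (not a benchmark, and assuming “1.5 Gigabyte” means roughly 1.5 billion bytes of payload with protocol overhead ignored), the stated times work out to about an eight-fold improvement in effective throughput:

```python
# Back-of-the-envelope check of the transfer figures quoted above.
# Assumes "1.5 Gigabyte" means 1.5 * 10**9 bytes; the actual payload mix
# and protocol overhead are not specified in the announcement.

scene_bytes = 1.5e9          # scene plus assets and media
rapid_transit_s = 11 * 60    # 11 minutes with CloudFuzion Rapid Transit
baseline_s = 1.5 * 3600      # "in excess of 1.5 hours" via normal methods

def mbit_per_s(num_bytes, seconds):
    """Effective throughput in megabits per second."""
    return num_bytes * 8 / seconds / 1e6

print(f"Rapid Transit: ~{mbit_per_s(scene_bytes, rapid_transit_s):.1f} Mbit/s")  # ~18.2
print(f"Normal method: ~{mbit_per_s(scene_bytes, baseline_s):.1f} Mbit/s")       # ~2.2
print(f"Speed-up:      ~{baseline_s / rapid_transit_s:.1f}x")                    # ~8.2x
```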

CloudFuzion provides mission-critical, fail-safe, fully redundant HPC clusters in private and public clouds for the 3D animation and power/energy industries. Cloud cluster deployment and workflow/pipeline integration services are available from the CloudFuzion team.

NEXT — NEC Technology Enables Rapid Design of High-Speed Big Data Processing Hardware >


NEC Technology Enables Rapid Design of High-Speed Big Data Processing Hardware

NEC Corporation has developed technology that it says enables users to quickly and easily design hardware for the high-speed, real-time analytical processing of big data, even if they have no expert knowledge of hardware design.

In recent years, expectations have grown for the analytical processing of big data, which provides added value by processing and analyzing large quantities of time-series data on a real-time basis. This type of processing is already in use in areas such as automated (algorithmic) trading at securities companies and traffic analysis by network providers, and its use is being considered in fields such as healthcare and public security. These industries require high-speed analytical processing for accurate information analysis even as data volumes increase, and the processing must also respond flexibly to rapidly changing analysis requests.

The use of dedicated hardware is expected to accelerate the processing of big data by 10 to 50 times compared with software-based processing. However, developing and implementing new hardware has historically taken a significant amount of time, often several months, and hardware has also lacked flexibility, since its processing content could not be changed during operation.

“NEC’s newly developed technologies permit the easy design of hardware dedicated to high-speed processing. In order to carry out data analysis, users simply input the required analytical processing content using SQL, a programming language used as an open interface,” said Naoki Nishi, General Manager, Green Platform Research Laboratories, NEC Corporation. “This reduces the time for developing hardware from as much as several months to approximately 1/50 the time, or several hours. The technology also reduces the time required for rewriting the processing content to approximately 1/1,000,000 of the previous time, or from milliseconds to nanoseconds, allowing the dynamic operation of the system without shutting it down.”

NEC has developed design technology that uses CyberWorkBench, NEC’s circuit synthesis technology for Field Programmable Gate Arrays (FPGAs), to automatically convert analytical processing written in SQL, the language widely used for the analytical processing of big data, into dedicated hardware on an FPGA. This permits the users in charge of data analysis to design FPGA circuits directly in SQL, a language already familiar to them. Circuit design work by hardware engineers is no longer necessary, and the time required to develop the hardware, which conventionally takes several months, is reduced to approximately 1/50 of that time, or several hours.
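
To make the workflow concrete, the sketch below shows the kind of aggregation query a data analyst might supply as the design input. NEC has not published the exact SQL dialect or submission interface its flow accepts, so the query, the trade_stream table and the submit_for_synthesis helper are hypothetical stand-ins used only for illustration:

```python
# Illustrative only: NEC has not published the SQL dialect or the interface
# its CyberWorkBench-based flow accepts. This sketch simply shows the kind of
# time-series aggregation an analyst might write, with the query text (rather
# than a hand-written circuit) serving as the hardware design input.

ANALYSIS_SQL = """
    SELECT symbol,
           AVG(price) AS avg_price,
           COUNT(*)   AS trades
    FROM   trade_stream
    GROUP  BY symbol
"""

def submit_for_synthesis(sql_text: str) -> None:
    """Hypothetical stand-in for handing the query to the synthesis flow,
    which would turn it into an FPGA circuit description."""
    print("Submitting query for hardware synthesis:")
    print(sql_text)

submit_for_synthesis(ANALYSIS_SQL)
```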

NEC has also developed a hardware mechanism that switches processing content instantaneously by operating the circuit for the old processing and the circuit for the new processing in parallel. Previously, processing had to be halted for several milliseconds when changing processing content, since the circuit for the old processing had to be reconfigured for the new processing. The newly developed mechanism reduces the shutdown time to nanoseconds, approximately 1/1,000,000 of the previous time, enabling dynamic corrections and processing changes without shutting down the system.
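
NEC’s mechanism operates on FPGA circuits rather than software, but the general pattern, keeping both the old and the new processing path live so that the changeover is a single instantaneous switch, can be loosely illustrated in software. The sketch below is an analogy only, not a description of NEC’s implementation:

```python
# Loose software analogy for the switching mechanism described above: keep the
# old and new processing paths both loaded and flip one reference, so the
# changeover itself is effectively instantaneous and processing never pauses.
# Illustrative only; the actual mechanism swaps FPGA circuits, not functions.

import threading

class HotSwappableProcessor:
    def __init__(self, initial_fn):
        self._fn = initial_fn
        self._lock = threading.Lock()

    def process(self, record):
        # Callers always see either the old or the new path, never a gap.
        return self._fn(record)

    def swap(self, new_fn):
        # The new path is fully prepared before the flip; the flip itself is a
        # single reference assignment, so no in-flight work has to be halted.
        with self._lock:
            self._fn = new_fn

proc = HotSwappableProcessor(lambda x: x * 2)   # "old circuit"
print(proc.process(21))                         # 42
proc.swap(lambda x: x + 100)                    # switch to the "new circuit"
print(proc.process(21))                         # 121
```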

NEC will continue to carry out research and development with the aim of providing the technology as a real-time big data processing solution using hardware by the close of the fiscal year ending March 2015.

NEXT — GridIron Systems and Zettaset to Accelerate High Performance Hadoop Installations >


GridIron Systems and Zettaset to Accelerate High Performance Hadoop Installations

GridIron Systems announced that it is collaborating with Big Data software innovator, Zettaset, to develop a reference architecture for virtual Hadoop clusters utilizing Flash to improve the efficiency of inter-node communications. The reference architecture provides Hadoop users with an efficient, pre-configured alternative to building and configuring racks of servers, storage, and software for large-scale deployments.

“Hadoop is the great enabler for complex analytics that require the aggregation of huge amounts of data from many sources,” said Jim Vogt, CEO at Zettaset. “However, as Hadoop clusters continue to grow, they can suffer from performance degradation. Zettaset has worked closely with GridIron to build a powerful infrastructure platform that combines high density and high performance in a single powerful package. Users can now concentrate on building business value from their data and not worry about the underlying platform.”

The virtual Hadoop cluster architecture integrates the Zettaset Orchestrator software with the GridIron OneAppliance iNode. Orchestrator optimizes the performance, security, and health of Hadoop. iNode provides the high-performance compute, storage, and high-bandwidth connectivity Hadoop requires in a Big Data appliance. Both iNode and Orchestrator support virtualization, so users can quickly and easily adapt the platform to meet their specific business needs.

The GridIron OneAppliance iNode integrates high-performance computing with engineered high-bandwidth Flash data storage to deliver what GridIron bills as the industry’s first all-Flash, ultra-high-bandwidth Big Data appliance and the highest-performance server Flash system available today. The approach addresses the imbalance of current server-Flash solutions by providing the scale and bandwidth that multi-core servers demand. The iNode is a single 10 rack-unit (10U) system that can simplify and consolidate sprawling clusters of storage and servers without sacrificing performance.

The Zettaset Orchestrator Big Data platform accelerates time to value with an enterprise-ready solution which has been hardened to achieve higher levels of performance, security, and availability. Orchestrator’s automated approach enables Hadoop clusters to be operational within hours, reduces unnecessary dependencies on professional services, and dramatically lowers IT expenses. Orchestrator also has the flexibility to work with any of the major Hadoop distributions currently available.

“This new reference architecture solves key Big Data challenges, such as time pressure, resource availability and cost-efficiency,” said Som Sikdar, CTO at GridIron Systems. “Together with Zettaset, we can provide a pre-configured Hadoop-cluster-in-a-box that is a better value option than a DIY build. Because the high-bandwidth iNode uses Flash in a way that is optimized for Big Data environments and virtualization, there is an immediate reduction in power and data center footprint.”

NEXT — Fusion-io Delivers Open Virtualization Systems with ioTurbine Caching Software >


Fusion-io Delivers Open Virtualization Systems with ioTurbine Caching Software

Fusion-io announced this week that its Fusion ioTurbine virtualization caching software now offers intelligent application caching for the Linux guest operating system (OS), in addition to support for the Microsoft Windows guest operating system. Fusion ioTurbine software delivers improved VM density and cost savings in VMware environments, including those that require vMotion, enabling customers to virtualize even data-intensive applications with significant performance improvements.

With this update, Fusion ioTurbine uniquely offers complete cross-platform operating system support while providing full VMware vMotion compatibility for virtualized servers. Fusion ioTurbine offers dynamic rebalancing of flash capacity as VMs come and go to optimize use of resources, while supporting vMotion and the movement of VMs from host to host. By tightly coupling with the file system I/O routines in the guest operating system, ioTurbine transparently redirects I/O patterns so flash storage is shared across all hosted virtual machines (VMs). Fusion ioTurbine also works with Fusion ION Data Accelerator software to enable customers to architect shared software defined virtualization platforms using the servers best suited to their enterprise needs.
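
The dynamic rebalancing idea can be pictured with a small sketch: a fixed pool of server-side flash is re-divided among the guests whenever the VM population changes, so cache freed by a departing VM is absorbed by the remaining ones. The weights, capacities and function below are illustrative assumptions, not how ioTurbine itself allocates cache:

```python
# Minimal sketch of rebalancing a fixed pool of flash cache as VMs come and
# go: each guest receives a weighted share of the pool, recomputed whenever
# the VM population changes. Illustrative only; not ioTurbine's implementation.

def rebalance(total_flash_gb, vms):
    """vms maps VM name -> weight (e.g. based on priority or observed I/O load)."""
    total_weight = sum(vms.values()) or 1
    return {name: total_flash_gb * w / total_weight for name, w in vms.items()}

pool_gb = 800
vms = {"db-vm": 3, "web-vm": 1, "analytics-vm": 2}
print(rebalance(pool_gb, vms))    # db-vm gets half the pool

vms.pop("analytics-vm")           # a guest is moved to another host via vMotion
print(rebalance(pool_gb, vms))    # remaining guests absorb the freed cache
```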

“Fusion ioTurbine is the only open server-side caching software on the market today that supports both Linux and Windows guests simultaneously while maintaining seamless vMotion operation,” said Neil Carson, Chief Technology Officer, Fusion-io. “Software defined virtualization solutions like Fusion ioTurbine and Fusion ION Data Acceleration are designed to allow IT experts to affordably build virtualization platforms customized to their enterprise, making it possible to scale infrastructure without compromising on performance.”

Fusion ioTurbine now also features a new plug-in for VMware vCenter Server, the scalable and extensible VMware virtualization management platform, enabling tighter integration and ease of use within VMware environments. With the VMware vCenter Server plug-in, IT administrators can centrally and transparently manage ioMemory in VMware environments for dramatically improved control over the virtual environment.

Compatible with Fusion ioTurbine, Fusion ION Data Accelerator software transforms industry-leading server platforms into powerful network shared flash data acceleration appliances. With ioTurbine and ION integration, data cached on ioTurbine can be supported by a Fusion ION Data Accelerator enabled server to provide an open virtualization platform built on the systems customers know and trust. These combined software solutions enable enterprises to efficiently virtualize even data-intensive applications in VMware environments, ushering in a new era of software defined storage.

NEXT — ScaleXtreme Advances Big Data Systems Management >


ScaleXtreme Advances Big Data Systems Management

ScaleXtreme, which provides cloud-based monitoring and systems management products, has made two major advances in its Big Data-enabled systems management product that it says will simplify and accelerate IT administration and operations.

As first announced in June, ScaleXtreme is building brand-new “big data” insights systems management functionality that takes unique advantage of the cloud-based delivery system. The new functionality helps IT operators put systems events into perspective and prioritize their work.

This new functionality is available to select customers in early access, with broader availability to follow soon. The company’s advances include system configuration recommendations and intelligent monitoring thresholds.

“These new features represent a quantum shift in systems management,” said ScaleXtreme CTO Balaji Srinivasa. “This powerful new functionality simply isn’t possible from an older, on-premise software product.”

Process-driven IT management is quickly being replaced by “event-driven” IT management. The proliferation of servers and the variety of stacks make it impossible to work through a prescribed process in a resource-constrained environment. When emergencies drive the work, qualitative validation, triage and prioritization of those events become important.

ScaleXtreme gives customers unprecedented insight and data-driven prioritization tools. The company allows customers to opt-in to an anonymous data-sharing network that gives them the benefits of a many-to-many information exchange. The company uses powerful new Big Data analytics to surface the most useful IT management practices that apply directly to each user. Customers can spot emerging trends and make data-driven decisions about the actions they should take to respond to infrastructure issues.

“Customers are beginning to demand this type of collective intelligence and decision-making capability,” said ScaleXtreme CEO Nand Mulchandani. “We’re going to continue working on this next-generation systems management, so expect more announcements in the coming months.” These features are already available for select customers and will be coming to the wider user base soon:

Configuration Recommendations: ScaleXtreme identifies the most used combinations of operating systems and applications and can advise which packages and updates work best together. This gives systems administrators an unprecedented source of objective information to help drive decisions about endpoint setup, configuration and stack compatibility.
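
As a rough illustration of how such recommendations could be derived, the sketch below counts which operating-system and package combinations appear together most often across a set of hypothetical, anonymized customer stacks, then suggests the most common companions for a given package. ScaleXtreme has not published its actual algorithm, so the data model and helper names here are assumptions:

```python
# Hypothetical sketch of a co-occurrence approach to configuration
# recommendations; not ScaleXtreme's published algorithm.

from collections import Counter
from itertools import combinations

# Anonymized, made-up stacks observed across the customer base.
observed_stacks = [
    {"ubuntu-12.04", "nginx", "mysql"},
    {"ubuntu-12.04", "nginx", "postgresql"},
    {"centos-6", "httpd", "mysql"},
    {"ubuntu-12.04", "nginx", "mysql"},
]

pair_counts = Counter()
for stack in observed_stacks:
    for pair in combinations(sorted(stack), 2):
        pair_counts[pair] += 1

def recommend(package, top_n=3):
    """Most frequent companions of `package` across the observed stacks."""
    companions = Counter()
    for (a, b), n in pair_counts.items():
        if package == a:
            companions[b] += n
        elif package == b:
            companions[a] += n
    return companions.most_common(top_n)

print(recommend("nginx"))   # e.g. [('ubuntu-12.04', 3), ('mysql', 2), ('postgresql', 1)]
```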

Intelligent Monitoring Thresholds: ScaleXtreme helps IT operators and administrators quickly optimize their monitoring and alerting by baselining performance against shared historical norms. The product reviews machine configurations and capabilities and presets alerts to the levels accepted by others running similar setups. This can dramatically accelerate the deployment of ScaleXtreme’s monitoring product.
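
A minimal sketch of the baselining idea follows, assuming a simple percentile over metrics pooled from machines with similar configurations; the percentile choice, metric and data model are illustrative assumptions, not ScaleXtreme’s published method:

```python
# Hypothetical sketch of threshold baselining: derive an alert threshold from
# the historical norms of similarly configured machines instead of asking the
# administrator to guess one. Not ScaleXtreme's actual product logic.

import statistics

def suggested_threshold(peer_samples, percentile=0.95):
    """peer_samples: CPU-utilization readings (0-100) pooled from similar machines."""
    ordered = sorted(peer_samples)
    index = min(int(len(ordered) * percentile), len(ordered) - 1)
    return ordered[index]

# Pooled readings from peers running a comparable OS/application stack.
peers = [35, 42, 38, 55, 61, 47, 52, 44, 68, 73, 49, 58]
print(f"Median peer load:   {statistics.median(peers):.0f}%")
print(f"Suggested alert at: {suggested_threshold(peers)}%")
```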
