June 1, 2012

This Week’s Big Data Big Seven

Datanami Staff

Thanks for joining us as we round out the month of May and plunge into summer with this week’s big seven stories.

To kick things off, we look to Dell, which is showing off its ARM strength for Hadoop and web-scale customers, then turn to Cloudera’s capacity to handle the unending streams of social data, test the transaction processing speed waters and find new developments from the visualization side of the fence.

Without further delay, let’s start with Dell’s forward-looking foray into the high-efficiency server market.

Dell Flexes ARM for Big Data

This week Dell announced that strong customer demand and the belief that the ARM ecosystem is at an inflection point have prompted it to deliver new ARM-based servers. The company says that the efficiency angle is key to attracting web-scale customers with big data problems to solve.

Dell announced that it would deliver servers to key ecosystem partners to serve these needs. These partners include Cloudera, the company behind one of the more popular supported distributions of Hadoop.

Dell says it believes ARM infrastructures demonstrate promise for web front-end and Hadoop environments, where advantages in performance per dollar and performance per watt are critical. It notes, however, that the ARM server ecosystem is still developing, with software largely available in open-source, non-production versions, so the current focus is on supporting the development of that ecosystem.

According to Amr Awadallah, cofounder and CTO of Cloudera, “As the leader in Apache Hadoop and Big Data systems, we are continuously seeking new technologies that can help our big data platforms operate at the next level of efficiency. We are very excited about the ARM-based server line from Dell, as this technology will allow our customers to pack more processing heft into a smaller data center footprint and do so with a significantly lower energy consumption profile.”

The company has been testing its ARM server technology since 2010 in the wake of increased demand for more power-efficient server offerings for its hyperscale customers. Dell also says it will continue delivering the Dell “Copper” ARM server to select customers and partners. Remote-accessible Copper server clusters are already deployed in Dell Solution Centers and, through a partnership, at the Texas Advanced Computing Center (TACC).

Dell says it plans to deliver an ARM-supported version of Crowbar, their open-source management infrastructure software, in the foreseeable future.



Cloudera Powers Social Data Sifter

This week Hadoop ecosystem vendor Cloudera announced a partnership with social data platform startup DataSift to mine insights from the unending streams of social media data.

DataSift is powering its Hadoop clusters with Cloudera’s Distribution Including Apache Hadoop (CDH), which performs the big data heavy lifting to help deliver DataSift’s Historics, a cloud-computing platform that enables entrepreneurs and enterprises to extract business insights from historical public tweets.

With Cloudera’s brand of Hadoop at its core, DataSift’s platform evaluates each social interaction from multiple dimensions, applying natural language processing to turn unstructured data into structured, digestible information ready for analysis to identify sentiment, topics, web-links, location and social media influence.

According to a release this week, Cloudera’s CDH is supporting over half a petabyte of relevant data for DataSift.

According to Nick Halstead, founder and CTO at DataSift, “The integration of Cloudera’s technology into DataSift provides us with a robust, enterprise platform that enables our customers to ask and answer these questions in minutes, whether they are analyzing data from last week or last year.”



New Data “Hero” on the Block

Startup Datahero garnered headlines this week with the announcement that it secured $1 million in seed funding. The company claims it has created the first web application that allows anyone to visualize and understand their data.

Datahero, which was founded in 2011 by big data star Chris Neumann and consumer experience expert Jeff Zabel, recently started alpha testing its data analytics platform, with plans to open it up later this year.

“We believe that Datahero has taken a radically different approach to data analysis by combining a guided user experience with powerful analytics in a way that will bring sophisticated data analysis capabilities to a much broader audience than today’s tools allow,” said Foundry Group Managing Partner Ryan McIntyre.

In what sounds a lot like Tableau Public, the company says users only need a web browser to import information from a variety of popular online services, load reports from corporate databases and even upload spreadsheets and other files from their computer directly into Datahero. Datahero’s algorithms automatically categorize and tag the information, driving an interface that enables users to focus on questions and answers instead of data formats and SQL.

“Understanding data is becoming an increasingly crucial part of our daily professional lives, yet recent reports show fewer than 30 percent of potential users of organizations’ BI tools use them today,” said Chris Neumann, CEO and cofounder of Datahero. “Until now, the ‘last mile’ problem in analytics, enabling the non-expert to do their own data analysis, has remained unsolved. With Datahero, users of any ability can quickly and easily visualize the data that matters the most to them, without needing a data scientist or an IT department. We literally enable any user to be their own data hero.”



Microgen Tests the Transaction Processing Speed Waters

For a company founded in the early 1970s, British company Microgen has had to adapt to almost every major technology shift since the dawn of widescale enterprise IT.

This week the company targeted the newest rush: big data. It released Microgen Aptitude, an enterprise application platform that, in the company’s tests, processed 7 billion transactions per hour using in-memory processing on IBM System x.

Further tests, involving Oracle database-to-database processing running across a combination of IBM System x and IBM Power Systems, resulted in Microgen Aptitude processing over 800 million transactions per hour. The tests were conducted over a three-week period at the IBM Innovation Centre at Hursley in Hampshire, UK.

“Performance is at the heart of our software because we’re aware that for many enterprises, delivering new products and operational processes requires IT systems to process, manage and manipulate an exponentially increasing volume of data,” said the company’s CTO, Neil Thomson.

The test results demonstrate that Microgen Aptitude and the Microgen Accounting Hub (“MAH”) are able to exploit the large number of processors, sophisticated storage systems and the high performance of the IBM platforms. According to Microgen, the results of these test runs hold value for those with big data needs in finance, telecom, utilities and digital media.

According to Thomson, “There are products now addressing ‘big unstructured data,’ but the majority of a company’s critical data consists of structured transactions.” He says that Microgen Aptitude is “designed to process these at speeds which make ‘impossible things possible’ – intra-day P&Ls, dynamic pricing treatments, daily product and customer profitability analyses.”



Kognitio Visualizes the Future

Kognitio, which taps in-memory analytics for big data and cloud environments, announced a partnership with Advanced Visual Systems (AVS), the developer of data visualization software that presents diverse forms of data as innovative and easily understandable graphic representations.

The two companies will focus the new partnership on vertical industries such as advertising, consumer behavior and social media. Kognitio’s in-memory analytical platform will be paired with AVS’ advanced API allowing users to exploit extremely complex data and visualize the results in ways that reveal previously undetectable patterns, trends and anomalies.

Kognitio says that its ability to analyze multiple terabytes of data in a fraction of the time of competing solutions, coupled with AVS’ visualization capabilities, enables departmental, enterprise and ISV product teams to achieve new levels of decision quality from large and complex data warehouses, real-time streams, unstructured data and complex networks.

“Large and complex data have become critical components of many analytic projects. Most data visualization tools, however, tend to over-aggregate data in order to present it,” said Steve Sukman, chief executive officer of AVS.

Sukman said that his company’s approach is to “provide thousands of techniques for visualizing granular, filtered and analytically-enhanced data to produce faster and more confident decisions from large and complex data. Kognitio’s superior performance allows granular data to be utilized effectively and opens up a world of new possibilities for our clients to capitalize on high impact data visualization.”



Tapping into SAP with Tableau

Business intelligence and visualization company Tableau Software announced numerous improvements to its platform designed to help users facing the challenges of large deployments. Among the updates in 7.0.4 is a native connector to SAP BW.

Tableau says that its native SAP BW connector takes full advantage of the metadata and security stored in BW, which gives SAP and Tableau users cleaner, more secure access to their data for quicker decision-making and insights.

The companies tout the high availability angle to the update, noting that a Tableau Server installation can now withstand the failure of a single machine with no need for human intervention. We hope you never have to put that claim to the test.

Tableau has also updated its free application, Tableau Public, with all of the same features. Tableau Public lets anyone create and share data visualizations on the web for free.



IBM’s Geo-Enabled Future

At the IBM Pulse Conference in Sydney, Australia this past week, IBM Vice President Steve Mills talked about data and how his company will not only help users manage it, but create it as well.

One way to enrich data is through geolocation tools, and Mills said this year that geospatial integration will be a core part of the software giant’s future strategy.

As he stated, “IBM doesn’t follow international business trends, it drives them – and their position reflects a growing trend among the world’s biggest companies towards geo-enabling their business systems by integrating their asset management and Geographic Information System (GIS) technology platforms.”

According to Esri Australia Business Manager Francisco Urbina, “there has been such a significant uptake of this approach globally that IBM have developed new software platforms that seamlessly integrate with Esri GIS technology.”

Urbina said that “Locally, we’re seeing organisations, such as Northern Territory Power and Water, embrace these integrated technologies to geo-enable their systems.” Geo-enabling business systems involves using GIS technology to translate an organisation’s data into the visual format of a map. Mr. Urbina said geo-enabling business systems challenged enterprises to completely re-examine how they viewed their assets.

“GIS technology plots an organization’s asset data on a map and enables managers to visualize relationships among assets and other mapped features, such as roads, buildings and pipelines,” Mr. Urbina said.
