Big Data • Big Analytics • Big Insight


Training Day: CrowdFlower Sets Human-Generated Data Free

Mar 4, 2015

Data scientists looking for high-quality sets of curated data on which to train their machine learning models may want to check out CrowdFlower, which today unleashed a veritable treasure trove of free human-generated data. CrowdFlower released about 40 data sets as part of its Data for Everyone campaign. But over the coming weeks, the San Francisco company expects to make thousands of data sets available for download from its website, covering millions of records. Read more…

How to Get a ‘Network Effect’ from Your Big Data Lake

Mar 3, 2015

One of the hidden benefits of being a data-driven organization is a so-called “network effect” that occurs around data and analytics. When an organization has several successful big data analytics projects under its belt, it often becomes easier to see how data can be used to benefit the organization in profound new ways. Creating a Hadoop-based data lake is often the first step in going down the big data analytics road. Without data and a place to put it—often a Read more…

Novetta Throws Entity Analytics Hat Into Hadoop Ring

Mar 2, 2015

One of the new big data analytic vendors exhibiting at the recent Strata + Hadoop World conference was Novetta, a firm that’s well-known in the Washington D.C. area for its cyber analytic offerings. But now the company is widening its reach into the commercial market with a Hadoop-based solution called Novetta Entity Analytics. One of Novetta’s first customers in the big data space was an unnamed government security agency that was having trouble pulling useful information out of an 8-billion Read more…

Rating the Advanced Analytics Vendors

Feb 27, 2015

There are several ways you can go about obtaining the advanced analytic capabilities needed to extract insights from large amounts of data. You can outsource the whole thing to a services firm, you can buy pre-built applications for a specific industry, or you can buy tools that will let you build what you need. Last week, Gartner rated the top 16 such build-it-yourself tools in the advanced analytics category. The “Magic Quadrant for Advanced Analytics Platforms” that Gartner delivered last Read more…

Spark Steals the Show at Strata

Feb 25, 2015

There was a lot of good stuff on display at last week’s Strata + Hadoop World conference. But if there was one product or technology that stood out from the pack, that would have to be Apache Spark, the versatile in-memory framework that is taking the big data world by storm. At Strata, Spark creator Matei Zaharia showed how the technology will get even more powerful in the months to come. Spark has garnered an incredible amount of momentum, largely running Read more…

News In Brief

Where Does InfiniDB Go From Here?

Mar 5, 2015

Last September, the company behind InfiniDB, Calpont, went out of business. Up stepped MariaDB, the company behind the open source relational database, to serve as a steward for the product and provide support to customers. The big question on everybody’s mind is, where does the product go from here? InfiniDB is a columnar database management system designed to power analytic applications. Originally debuting in the year 2000, the software was built upon the MySQL database and includes its own SQL Read more…

Fujitsu Adding Column-Oriented Processing Engine to PostgreSQL

Mar 4, 2015

Fujitsu Laboratories last week announced that it’s developed a column-oriented data storage and processing engine that can quickly analyze large amounts of data stored on a PostgreSQL database. The technology, which utilizes vector processing, is being showcased this week at a conference in Japan. Fujitsu has a long history developing big systems designed to handle heavy transactional loads. It was a close development partner of Sun Microsystems for 64-bit Sparc servers until Sun was acquired by Oracle. Today the $4.5 Read more…

Actian Claims ‘Permanent Performance Advantage’ with SQL-on-Hadoop Tool

Mar 2, 2015

The SQL-on-Hadoop sweepstakes are by no means over. What’s been dubbed the “gateway drug” for Hadoop is just starting to gain traction. But according to Actian, its SQL-on-Hadoop offering, dubbed Vortex, is out to an early–and permanent–lead in the performance department. At the recent Strata + Hadoop World show, Actian pitted Vortex against Cloudera’s Impala right in the booth, where it largely re-created the results of a 2014 TPC Decision Support (TPC-DS) benchmark test that showed Vortex completing a job Read more…

U.S. Names First Chief Data Scientist

Feb 24, 2015

An industry veteran and college math professor who is partially credited with coining the title “data scientist” has been named the nation’s first chief data scientist. The White House announced the appointment of DJ Patil to the new post last week. Patil also will serve as the Obama administration’s deputy chief technology officer for data policy, the White House said. Patil most recently served as a vice president at RelateIQ, a customer relationship management specialist acquired by Salesforce in July Read more…

Snowflake Differentiates Itself in Strata Startup Showcase

Feb 23, 2015

Snowflake Computing, a big data warehousing-as-a-service provider, took home top honors at the Startup Showcase event held during last week’s Strata + Hadoop World conference. The award is a boost to the Silicon Valley company, which aims to be a one-stop shop for analyzing data generated on the cloud. Snowflake emerged from stealth mode in October with $26 million in cash and a vision to create an “elastic data warehouse” that lives in the cloud. The company, Read more…

This Just In

ASF Unveils Apache HBase v1.0

Feb 24, 2015

FOREST HILL, Md., Feb. 24 — The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache HBase v1.0, the distributed, scalable, database for Apache Hadoop and HDFS. “Apache HBase v1.0 marks a major milestone in the project’s development,” said Michael Stack, Vice President of Apache HBase. “It is a monumental moment that the army of contributors who have made this possible should all be proud Read more…

Informatica Becomes Part of Capgemini and Pivotal’s Business Data Lake Ecosystem

Feb 24, 2015

Feb. 24 — Capgemini, one of the world’s foremost providers of consulting, technology and outsourcing services, announced today that Informatica is now part of the Business Data Lake ecosystem developed by Capgemini and Pivotal. Customers worldwide will now be able to leverage Informatica’s data integration software in addition to Pivotal’s advanced big data, analytics and application software, and Capgemini’s industry and implementation expertise. Informatica will deliver certified technologies for Data Integration, Data Quality and Master Data Management (MDM) to help enterprises distill raw data into actionable insights. Read more…

GainInsights Joins Birst Partner Ecosystem

Feb 24, 2015

SAN FRANCISCO, Calif. and BANGALORE, India, Feb. 24 — Birst, the global leader in Cloud BI and Analytics, today announced a partnership with GainInsights, a business intelligence consulting firm, to deliver Birst’s comprehensive business intelligence (BI) platform to organizations in India, helping executives drive better business execution. GainInsights will resell, implement and deploy Birst’s Cloud BI, bringing a new approach to BI and analytics that enables organizations to operationalize more data and make it available at every decision point. GainInsights’ experienced BI Read more…

Databricks and Intel Collaborate

Feb 20, 2015

SAN JOSE, Calif., Feb. 20 – Databricks, the company founded by the creators of the popular open-source Big Data processing engine Apache Spark with its flagship product, Databricks Cloud, today announced plans to collaborate with Intel to optimize Spark real-time analytic capabilities for Intel architecture. Enterprises are increasingly developing applications to extract real-time insights from large data sets. The necessity for real-time analytics across Intel architecture is a vital piece of the Big Data puzzle to enable the extraction of prompt, actionable Read more…

Cloudera and Deloitte Announce Alliance

Feb 20, 2015

PALO ALTO, Calif., Feb. 20 — Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop, today announced a formal alliance with Deloitte, a recognized leader in analytic services, to jointly enhance the ability of Deloitte clients to derive actionable insights from their data. The alliance will leverage the Cloudera enterprise data hub to accelerate customer time-to-value using Deloitte’s analytics services, industry-specific solutions and extensive portfolio of intellectual property accelerators. “For most organizations, the analysis of stored structured data is not new,” said Read more…