Kafka Creators Tackle Consistency Problem in Data Pipelines

May 24, 2016

One of the big questions surrounding the rise of real-time stream processing applications is consistency. When you have a distributed application involving thousands of data sources and data consumers, how can you be sure that the data going in one side comes out the other unchanged? That’s the challenge that Confluent is addressing with today’s launch of new software for Apache Kafka. If you’re moving big data today, you’re probably using Apache Kafka, or at least looking at it. The Read more…

How Spark and Hadoop Are Advancing Cancer Research

May 23, 2016

The combination of Spark and Hadoop has supercharged big data analysis across many industries and use cases by lowering the barrier to entry for advanced analytics, thereby enabling data scientists to create data-driven products that weren’t previously possible. But one area where Spark and Hadoop are having an especially strong impact is cancer research. Cancer killed about 590,000 Americans last year, according to the Centers for Disease Control. That makes it the second leading cause of death in Read more…

Apache Foundation Keeps Eyes Wide Open with ODPi

May 20, 2016

If you’re looking for controversy in the Apache Hadoop community, you need look no further than the 2015 launch of the Open Data Platform Initiative (ODPi), which some perceived as an attempt to wrest control of Apache Hadoop from its open source roots. In fact, some Apache Software Foundation (ASF) leaders see potential good coming out of the ODPi, although there are valid concerns about negatives too. Jim Jagielski, a founding member of the ASF and a member of its Read more…

Hadoop 3 Poised to Boost Storage Capacity, Resilience with Erasure Coding

May 18, 2016

The next major version of Apache Hadoop could effectively double storage capacity while increasing data resiliency by 50 percent through the addition of erasure coding, according to a presentation at the Apache Big Data conference last week. Apache Hadoop version 3 is currently being developed by members of the Apache Hadoop team at the Apache Software Foundation. Akira Ajisaka, who is an Apache Hadoop committer and a PMC member, shared information about the next major release at last week’s Apache Read more…

Apache’s Wacky But Winning Recipe for Big Data Development

May 12, 2016

When Doug Cutting set out to develop an open source Web search engine in the late 1990s, he initially chose the GPL license to distribute his wares. When that failed, he decided to give the Apache Software Foundation a shot–and in the process may have changed the course of open source software development for the next 20 years. Cutting initially developed the Lucene search engine with the idea of building a business around it, but later decided to give the Read more…

News In Brief

Survey: Data Analytics Falls Short in Detecting Fraud

May 26, 2016

The growing list of technologies being used in the cat-and-mouse game of detecting financial fraud apparently does not include data analytics, according to a new study. KPMG, the corporate consulting firm, reported this week that its global survey of about 750 “fraudster” investigations revealed that perpetrators actually hold the upper hand when it comes to leveraging technology. The survey found that 29 percent of the 110 fraudsters analyzed in North America and 24 percent of the survey’s universe of 750 were Read more…

Chorus Upgrade Shifts Machine Learning Emphasis

May 25, 2016

The latest version of Alpine Data’s analytics platform seeks to combine data with machine learning for business users by shifting the focus from algorithms while adding human collaboration and governance capabilities to machine learning projects. San Francisco-based Alpine Data said Wednesday (May 25) its Chorus 6.0 analytics platform targets a wider swathe of the business analytics spectrum by accelerating “the delivery of data into action and to create a clear, repeatable process for continuous business improvement.” Among the reasons that Read more…

Cray Bakes Big Data Software Framework Into Urika-GX Analytics Platform

May 24, 2016

Cray continued its courtship of the advanced scale enterprise market with today’s launch of the Urika-GX, a system that integrates Cray supercomputing technologies with an agile big data platform designed to run multiple analytics workloads concurrently. While Cray has pre-installed analytics software in previous systems, the new system takes pre-configuration to a new level with an open software framework designed to eliminate installation, integration and update headaches that stymie big data implementations. Available in the third quarter of this year, the Read more…

NASA Helps Launch Data Science Grad Program

May 23, 2016

Another U.S. university is adding a data science specialization to its curriculum, this one as part of an online Master of Science degree in engineering. The University of California at Riverside said the data science track was developed in collaboration with NASA’s Jet Propulsion Laboratory (JPL) science staff. The partners said the data science program is aimed at engineers and scientists, along with medical and social media professionals, seeking to expand their training in data mining, data visualization, machine learning and Read more…

ThoughtSpot, Search Analytics Startup, Raises $50M

May 19, 2016

Citing the growing need for speed in accessing data for business intelligence applications, search-driven analytics specialist ThoughtSpot said it has more than doubled its equity investment total with the closing this week of a $50 million funding round. ThoughtSpot said Thursday (May 19) lead investor General Catalyst Partners was joined by Geodesic Capital. Existing investors include Lightspeed Ventures and Khosla Ventures. The startup based in Palo Alto, Calif., has so far raised more than $90 million. The company’s cofounders have Read more…

This Just In

Clarabridge Acquires Engagor

May 21, 2015

RESTON, VA., May 21 — Clarabridge, Inc., the leading provider of Customer Experience Management (CEM) solutions for the world’s top brands, today announced the acquisition of Engagor, the most comprehensive platform for real-time social customer service and engagement. The combined offering provides a complete, end-to-end technology solution for marketers, customer care organizations and operations teams to create more profitable customer relationships. Founded by Folke Lemaitre in 2011, Belgium-based Engagor offers a robust social listening and engagement platform for marketers and customer Read more…

MemSQL Launches Community Edition: World’s Fastest In-Memory Database Now Available to All

May 20, 2015

SAN FRANCISCO, CA – May 20 – MemSQL, the leader in real-time databases for transactions and analytics, today announced the most significant release of MemSQL to date. MemSQL 4 – which includes a new Community Edition – empowers interconnected enterprises to aggregate and report on real-time data, accelerating the growth trajectories of their digital businesses. This release brings to market groundbreaking capabilities such as the industry’s first real-time, distributed geospatial intelligence and the MemSQL Spark Connector to operationalize Apache Spark. Read more…

Apache Unveils Hadoop 2

Oct 17, 2013

Apache Software Foundation, which oversees the 150 or so open source projects under the famous Apache umbrella, this week announced Hadoop 2 – the latest version of the popular software framework for distributed computing.