Tag: Spark

Google/ASF Tackle Big Computing Trade-Offs with Apache Beam 2.0

May 19, 2017

Trade-offs are a part of life, in personal matters as well as in computers. You typically cannot have something built quickly, built inexpensively, and built well. Pick two, as your grandfather would tell you. Read more…

Masking Technical Complexity in the Security Data Lake

May 2, 2017

Today’s growing cybersecurity threat demands a sophisticated response, one that increasingly involves the use of big data technologies like parallel file systems and machine learning. However, some security experts warn that the growing number and complexity of big data security tools could be hindering the cause. Read more…

Iguazio Re-Architects the Stack for Continuous Analytics

Apr 18, 2017

When it comes to modern big data architectures, you will typically find lots of different components, engines, and moving parts, each of which tackles part of the problem. One vendor with a bold vision of re-architecting the stack with a more streamlined approach is Iguazio, which is building a single product based on flash that delivers continuous analytics on big and fast data. Read more…

Learning from Your Data: Essential Considerations

Apr 13, 2017

For any organization undergoing digital transformation, a primary consideration is how to find, capture, manage, and analyze big data. Such organizations are looking to big data and data science to surface the analytics that will enable informed decision-making. Read more…

Hortonworks Touts Hive Speedup, ACID to Prevent ‘Dirty Reads’

Apr 4, 2017

If you’re considering using Hadoop for SQL-based analytics and BI, you’ll be interested in the latest news out of Hortonworks, which today unveiled a new release of its flagship data platform that boasts a fast new release of Apache Hive, as well as a new ACID merge function that can prevent “dirty reads.” Read more…

Meet Ray, the Real-Time Machine-Learning Replacement for Spark

Mar 28, 2017

Researchers at UC Berkeley’s RISELab have developed a new distributed framework designed to enable Python-based machine learning and deep learning workloads to execute in real time with MPI-like power and granularity. Read more…

SAP Vora Gets Analytics, Cloud Upgrades

Mar 15, 2017

Building on its acquisition of Hadoop specialist Altiscale Inc., SAP is combining the latest release of its Vora in-memory distributed computing platform with its big data cloud as it extends the Apache Spark framework to deliver interactive analytics on Hadoop. Read more…

MapR Extends Its Platform to the Edge

Mar 14, 2017

MapR Technologies today unveiled MapR Edge, an extension of its converged data platform that lets customers install MapR nodes practically anywhere they want.

The new offering runs on small portable PCs like the Intel NUC, and delivers the full breadth of MapR’s capabilities, including Hadoop, NoSQL, and data streaming functionality, anywhere customers want, from autonomous cars driving rural highways to wellheads in the oil field. Read more…

Hadoop Has Failed Us, Tech Experts Say

Mar 13, 2017

The Hadoop dream of unifying data and compute in a distributed manner has all but failed in a smoking heap of cost and complexity, according to technology experts and executives who spoke to Datanami. Read more…

2017 Is the Year of AI. Or Is It?

Mar 8, 2017

The media often likes to proclaim “The Year of This” or “The Year of That.” With the greater attention given to advancing capabilities in artificial intelligence and machine learning, it seemed like a no-brainer to declare 2017 “The Year of AI.” Read more…