Tag: ETL

Embattled Redshift Gets Analytics Backing

Nov 22, 2016

Amazon Web Services’ Redshift data warehouse service got some much-needed support this week through a partnership between a specialist in data management for analytics and a tool developer, aimed at helping Redshift users automate their analytics infrastructure. Read more…

Hortonworks Unveils New Offerings for AWS Marketplace

Nov 15, 2016

Hortonworks today took the wraps off new big data services that run on the Amazon Web Services (AWS) Marketplace. The Hadoop, Spark, and Hive services are pre-configured, and are designed to get users up and running quickly and easily. Read more…

Über File System from Alluxio Gaining Enterprise Traction

Oct 26, 2016

It took several years, but now we’re starting to see multi-hundred-node deployments of Alluxio, the distributed in-memory file system that was developed alongside Spark and Mesos at UC Berkeley’s AMPLab. Read more…

Data Engineers in Hot Demand

Sep 27, 2016

The big data community has been dealing with the data scientist shortage ever since big data became a thing. Now we’re learning that there’s possibly an even bigger shortage of another type of data professional: Read more…

The Last Hadoop Data Management Tool You’ll Ever Buy?

Mar 21, 2016

The rise of big data has shaken up the data warehousing market, and one of the established vendors still looking to regain its footing is Informatica, which last year was taken private in a $5.3 billion leveraged buyout. Read more…

See EBCDIC Run on Hadoop and Spark

Mar 3, 2016

Only 20,000 or so of the big beasts still exist in the wild. They’re IBM mainframes, and despite the scorn of a legacy label, they continue to run critical processes companies simply don’t trust to commodity Intel boxes. Read more…

Taming Unstructured Data with Cognitive Computing

Jan 15, 2016

Contending with unstructured data is no longer a priority reserved for the most well-financed, IT-savvy organizations, like Google and Facebook. As the world’s data continues to grow at nearly exponential rates, the reality is that most of that data is unstructured and, in its native form, incongruent with time-honored tables and SQL-based modeling. Read more…

Five Steps to Fix the Data Feedback Loop and Rescue Analysis from ‘Bad’ Data

Aug 17, 2015

Despite enterprises’ best intentions in enforcing top-down standardization of data sets, non-compliant data can easily seep in and, through aggregations, transformations, and standardizations, spread throughout the organization. In a typical enterprise, inventory data from multiple regions and divisions across product lines could easily result in dozens of data sources being used for one analysis. Read more…

How Hadoop Solved BT’s Data Velocity Problem

May 8, 2015

Like most large corporations with millions of customers, BT (British Telecom) has an extensive collection of databases, and is constantly moving data in and out of them. But when data growth maxed out a critical ETL server, it found a solution in a distributed Hadoop system. Read more…