Tag: HDFS

AWS Delivers ‘Lightning’ Fast LLM Checkpointing for PyTorch

AWS customers who are training large language models (LLMs) will be able to complete their model checkpoints up to 40% faster thanks to improvements AWS has made with its Amazon S3 PyTorch Lightning Connector. The compan Read more…

Inside AWS’s Plans to Make S3 Faster and Better

As far as big data storage goes, Amazon S3 has won the war. Even among storage vendors whose initials are not A.W.S., S3 is the de facto standard for storing lots of data. But AWS isn’t resting on its laurels with S3, a Read more…

How Acceldata Helped T-Mobile’s Data Modernization Strategy

When T-Mobile started migrating some of its data estate from an on-prem Hadoop system to cloud-based data platforms, it found the move liberating. But as it settled into a hybrid-cloud world, T-Mobile realized costs were Read more…

Alluxio Nabs $50M, Preps for Growth in Data Orchestration

Data orchestration software provider Alluxio today announced the close of an oversubscribed $50-million Series C round, which its CEO plans to spend on a global expansion. It also launched version 2.7 of its software, wh Read more…

How Facebook Accelerates SQL at Extreme Scale

Serving SQL queries on a petabyte of data is one thing, but delivering it at Facebook’s scale is something else entirely. Earlier this year, the social media giant implemented the Alluxio distributed file system into i Read more…

LinkedIn Open Sources Kube2Hadoop

Hadoop and Kubernetes have fundamentally different ways of authenticating users, exposing a security gap for organizations that want to access HDFS data from Kubernetes-based applications. Thanks to the new Kube2Hadoop t Read more…

Rob Bearden Returns to Lead Cloudera’s Second Act

When Cloudera ran into trouble last June following poor financial results, the board jettisoned senior leadership, including CEO Tom Reilly and Mike Olson, its chief strategy officer. Those moves would open up a path for Read more…

Big Data Predictions: What 2020 Will Bring

With just over a week left on the 2019 calendar, it’s now time for predictions. We’ll run several stories featuring the 2020 predictions of industry experts and observers in the field. It all starts today with what i Read more…

2019: A Big Data Year in Review – Part One

At the beginning of the year, we set out 10 big data trends to watch in 2019. We correctly called some of what unfolded, including a renewed focus on data management and continued rise of Kubernetes (that wasn’t hard t Read more…

Cloudera Begins New Cloud Era with CDP Launch

Eleven years after its founding, Cloudera fulfilled its name in a big way today with the launch of Cloudera Data Platform (CDP), its new flagship data platform that allows customers to securely manage and govern their da Read more…

Seeing the Big Picture on Big Data Market Shift

Hidden from view in the "I want to be data-driven" conversation are the nitty-gritty details of how actually to become a data-driven organization. The grand hope is that artificial intelligence, in the guise of machine l Read more…

Re-Imagining Big Data in a Post-Hadoop World

In the big data battle for architectural supremacy, the cloud is clearly winning and Hadoop is clearly losing. Customers are shying away from investing in monolithic Hadoop clusters in favor of more nimble (if not less e Read more…

Databricks Donates Delta Code to Open Source

Databricks today announced that it's open sourcing the code behind Databricks Delta, the Apache Spark-based product it designed to help keep data neat and clean as it flows from sources into its cloud-based analytics env Read more…

Intel Builds Analytics, Database Use Cases for Optane

Intel offered a list of use cases for its Optane DC persistent memory technology during a company event this week, including Twitter’s effort to scale its Hadoop clusters using Optane and SAP HANA’s database improvem Read more…

Can On-Prem S3 Compete with HDFS for Analytic Workloads?

In the battle for big data storage supremacy, Hadoop is still in the running. It may no longer be the 800-lb gorilla, but the demonstrated scalability of the Hadoop Distributed File System (HDFS) makes it a potent conten Read more…

Hadoop Gets Improved Hooks to Cloud, Deep Learning

Organizations that adopt the latest version 3.2 release of Apache Hadoop will get new integration hooks into the AWS and Azure clouds, as well as access to a new deep learning project called Hadoop Submarine. Hadoop m Read more…

Hadoop 3.0 Likely to Arrive Before Christmas

It's looking like big data developers will get an early holiday present as work on Hadoop version 3.0 nears completion. And while Hadoop 3.0 brings compelling new features, including a 50% increase in capacity and upward Read more…

Committers Talk Hadoop 3 at Apache Big Data

The upcoming delivery of Apache Hadoop 3 later this year will bring big changes to how customers store and process data on clusters. Here at the annual Apache Big Data show in Miami, Florida, a pair of Hadoop project com Read more…

ODPi Tackles Hive with Latest Hadoop Runtime Spec

ODPi today unveiled the second major release of its Runtime Specification, which is aimed at setting a standard for Hadoop components to ensure greater interoperability among distributions and third-party products. New add Read more…

Investments in Fast Data Analytics Surge

Companies are quickly ramping up their investments in fast data analytics and real-time stream processing frameworks and lowering spending on batch technologies in an attempt to get on top of growing data volumes and veloci Read more…
