April 4, 2019

Intel Builds Analytics, Database Use Cases for Optane

George Leopold

Intel offered a list of use cases for its Optane DC persistent memory technology during a company event this week, including Twitter’s effort to scale its Hadoop clusters using Optane and SAP HANA’s database improvements focused on applications like machine learning and predictive analytics.

The chip maker (NASDAQ: INTC) highlighted those and other datacenter implementations to bolster its assertion that persistent memory is a game-changer. “We’re going to break through memory economics bottlenecks,” said Navin Shenoy, general manager of Intel’s Data Center Group. Noting that DRAM currently accounts for up to 60 percent of total system cost, Shenoy added: “Optane is going to change the economics of memory.”

Optane is promoted as addressing two major shortfalls in the era of big data: volatile memory’s inability to retain stored data when power is lost and NAND’s failure to keep pace with “data-centric” analytics applications.

Optane addresses those stumbling blocks with brute force, most notably a 512-GB memory module with up to four times the capacity of current DRAM modules, along with clever software and firmware designs.

For early adopters like SAP (NYSE: SAP), the practical effect of the transition to persistent memory technology is new HANA database capabilities centered on improved performance, disaster recovery and consolidation of multiple database systems into a single platform.

Dirk Basenach, SAP’s database chief, said the rollout of persistent memory technology on the company’s flagship HANA database has so far yielded reductions in data loading times from 50 to 4 minutes on a 6-terabyte S/4 HANA system. The database vendor also said the memory technology has helped it achieve 9.1 billion records on a single HANA database.
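To put the quoted load times in perspective, the short calculation below works out the implied speedup and percentage reduction. It is only back-of-the-envelope arithmetic on the two figures cited above (50 minutes and 4 minutes); the function names are illustrative and not drawn from SAP or Intel.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_minutes / after_minutes


def reduction_percent(before_minutes: float, after_minutes: float) -> float:
    """Return the percentage of the original time that was eliminated."""
    return (before_minutes - after_minutes) * 100 / before_minutes


# Figures quoted in the article for a 6-terabyte S/4 HANA system.
load_speedup = speedup(50, 4)
load_reduction = reduction_percent(50, 4)

print(f"Speedup: {load_speedup}x, reduction: {load_reduction}%")
```

On those figures, data loading is 12.5 times faster, a 92 percent cut in load time.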

SAP expects to expand the use of Optane persistent memory in an upcoming release of the HANA data hub, Basenach said.

For its part, Twitter (NYSE: TWTR) relies heavily on Hadoop clusters to store posts, “Likes” and retweets. “Hadoop is an important part of storing all those events and performing analytics on that data,” said Matt Singer, Twitter’s senior staff hardware engineer.

A typical Hadoop cluster used by Twitter includes more than 100,000 hard-disk drives, translating into about 100 petabytes of storage in a cluster.

Affordable hard drives have become the “work horses” of Twitter’s Hadoop clusters, Singer noted. “Hard-drive capacity has increased over time, but their I/O [per second] has remained essentially flat, and that has resulted in a storage bottleneck.”

Hence, Twitter leveraged persistent memory and data caching to overcome the bottleneck caused by simultaneous HDFS and YARN data flows “contending” for access to hard drives on a Hadoop server.

The solution was selective caching of YARN temporary data via “smart” caching software integrated with an Intel “fast” SSD to eliminate competition for CPU resources.

“So hard-drive utilization dropped, and Hadoop could process data faster,” Singer said.
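The idea of selective caching can be illustrated with a toy sketch: requests for temporary YARN data are steered to a fast SSD, while HDFS traffic continues to go to the hard drives, so the two flows no longer contend for the same disks. This is not Twitter's or Intel's software. The `Device` class, the `route` function and the `/yarn/local/` path prefix are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A storage device that simply records the requests routed to it."""
    name: str
    requests: list = field(default_factory=list)


def route(path: str, hdd: Device, ssd: Device,
          temp_prefixes: tuple = ("/yarn/local/",)) -> str:
    """Send temporary-data paths to the SSD cache and everything else to the HDD.

    The prefix tuple is an assumed convention for spotting YARN temporary data.
    """
    target = ssd if path.startswith(temp_prefixes) else hdd
    target.requests.append(path)
    return target.name


hdd = Device("hdd")
ssd = Device("ssd")

# Hypothetical request paths: one HDFS block read and two YARN temporary files.
for p in ["/hdfs/data/blk_1", "/yarn/local/shuffle_0", "/yarn/local/spill_7"]:
    route(p, hdd, ssd)

print(len(hdd.requests), len(ssd.requests))
```

In this sketch the hard drive ends up serving only the HDFS request, which mirrors the drop in hard-drive utilization Singer describes.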

“Removing that storage I/O bottleneck enables us to greatly reduce the number of racks in our [Hadoop] cluster, which gives us a much smaller datacenter,” he continued. Twitter’s Hadoop cluster shrank from 12 smaller hard drives to eight larger ones without impacting performance. Removing the hard-drive performance bottleneck also meant Twitter could leverage CPU horsepower, upgrading from four to 24 second-generation Intel Xeon Scalable cores.

The size of the Hadoop cluster was reduced, resulting in a much smaller datacenter, 75 percent less energy usage and 50-percent faster run times, Singer added.

Intel also this week announced an Optane-based datacenter SSD with dual-port capability targeting enterprise storage along with an EDSFF (Enterprise and Datacenter SSD Form Factor)-compliant QLC NAND SSD with 1 petabyte of storage in a “ruler” form factor.

That, the chip maker claimed, would yield a 20-fold storage rack consolidation versus hard drives.

Recent items:

Aerospike Adds Optane Persistent Memory

Redis Speeds Towards a Multi-Model Future

Applications: Artificial Intelligence, Predictive Analytics

Technologies: Frameworks, Processors, Storage

Sectors: Biosciences, Financial Services, Healthcare, Manufacturing, Other, Retail

Vendors: Intel, SAP, Twitter

Tags: database, Hadoop, HANA, HDFS, i/o, Optane DC, Optane DC persistent memory, persistent data, predictive analytics, smart caching, storage, yarn
