February 16, 2016

‘Stale,’ ‘Orphan’ Data Hogging Storage, Index Finds

Companies are spending a lot of dough to store and archive data while trying to stay within increasingly strict data governance rules. Nevertheless, a new report on the “composition of enterprise data” concludes that more than 40 percent of corporate data has not been modified in three years.

In its Data Genomics Index released last week, data management vendor Veritas Technologies said it treats data as going from “potentially relevant to stale” after three years. “Incredibly, 41 percent of the average environment is stale, or unmodified in the past three years,” the company found. That works out to an estimated $20.5 million in additional data management costs, the company reckons.
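For readers who want a rough sense of how their own file shares compare, the three-year "stale" rule can be approximated with a short script. The following is a minimal sketch, not Veritas's methodology: it assumes last-modified time as reported by the file system is a fair proxy, and the names `is_stale` and `stale_share` are hypothetical helpers invented for illustration.

```python
import os
import time

# Assumption: "three years" is taken as 3 x 365 days, ignoring leap days.
STALE_AFTER_SECONDS = 3 * 365 * 24 * 60 * 60


def is_stale(mtime, now, threshold=STALE_AFTER_SECONDS):
    """Return True if a file last modified at `mtime` is older than the threshold."""
    return (now - mtime) > threshold


def stale_share(root, now=None):
    """Walk `root` and return (stale_bytes, total_bytes) based on modification time."""
    now = time.time() if now is None else now
    stale_bytes = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_bytes += info.st_size
            if is_stale(info.st_mtime, now):
                stale_bytes += info.st_size
    return stale_bytes, total_bytes
```

Dividing the first returned value by the second gives the stale share by capacity for one directory tree, which can then be set against the 41 percent figure the index reports for the average environment.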

Veritas, Mountain View, Calif., said its data index seeks to benchmark enterprise data environments as a way of reducing costs associated with managing data. In one example that contributes to the growing problem of “stale” data, the study noted an increase in “orphaned data.” Promotions or employee departures are creating large amounts of “data without an associated owner,” the index found.

“Orphan data tends to be content rich file types like videos, images and presentations—risky stuff to leave unattended,” the report warned. “It also is taking up more than its fair share of disk space based on file count distribution,” more than 200 percent.

The report also found that “orphaned information is disproportionately overweight and extra stale.”

Traditional office file types such as spreadsheets and presentations also were found to be storage hogs, prompting the study’s authors to recommend that enterprises prioritize archiving and deletion of stale data to free up storage space and reduce storage costs. The authors recommended focusing “remediation” efforts on formats like virtual machine file types to obtain the highest return on 1 Gb of storage per file. Security file types also were high on its list.

The need to manage data and storage costs will only grow as data growth continues to surge. Data at the file level is estimated to be growing at a 39 percent clip annually. Moreover, the study found that average file size has ballooned from 0.24 MB a decade ago to 0.53 MB for files modified in the past year.
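To put the 39 percent figure in perspective, compound growth at that rate means file-level data roughly doubles in a little over two years. The sketch below is a back-of-the-envelope illustration only: it assumes the rate holds constant from year to year, which the report does not claim, and the function names are hypothetical.

```python
import math

# Annual file-level growth rate cited in the index.
ANNUAL_GROWTH = 0.39


def projected_size(current, years, rate=ANNUAL_GROWTH):
    """Project a data volume forward, assuming constant compound annual growth."""
    return current * (1 + rate) ** years


def doubling_time_years(rate=ANNUAL_GROWTH):
    """Years needed for a volume to double at a constant compound growth rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    # Example: a hypothetical 1 PB environment projected over five years.
    for year in range(6):
        print(year, round(projected_size(1.0, year), 2), "PB")
    print("doubling time (years):", round(doubling_time_years(), 2))
```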

While storage capacity requirements are growing, so too is the need for storage and data management, Veritas asserts. “The storage environment is cluttered, where the average [petabyte] of information contains” more than 2.3 billion files.

Steady corporate data growth is forcing enterprises to prioritize what they hang onto, what to archive and what files to delete. Systems archiving of spreadsheets, presentations, documents and files that account for about 20 percent of “stale” data could help cut storage costs in half, the index found. That translates into savings of about $2 million annually.

Veritas said its Data Genomics Project is designed to promote “better understanding [of] the true nature of the unstructured data that we are creating, storing, and managing on a daily basis.” The company said its index was based on analysis of tens of billions of files and their attributes from customers’ unstructured data environments. More than 8,000 file type extensions were considered.
