5 Critical Steps for Identifying the Value in Your Unstructured Information
I have spent many years helping organizations gain control over and manage their unstructured data, or information. The days when we looked for one content management solution to capture and manage our valued unstructured data (information) are long gone.
Organizations now face the challenge of implementing multiple content management solutions across several tiers of the organization:
- Local business unit applications – these applications manage data and content localized to a single business unit
- Cross business unit applications – these applications manage data and content that is used by multiple business units
- Enterprise applications – these applications manage data and content that is used across the enterprise
Implementing a diverse set of content management solutions requires a consistent approach that reuses a proven methodology. All successful content management implementations address these five basic steps:
- Discover – how can we discover and categorize the valuable data contained in the file shares and many repositories?
- Organize – how can we organize information so the business users can identify, manage, and control it?
- Govern – what are the policies required to make the data available, accessible and reliable?
- Manage – how do we manage the data lifecycle to meet compliance/retention requirements?
- Analyze – how can users access the data and turn it into assets that can be used to make business decisions?
Unstructured data or information is routinely found in many repositories, including email, file shares, Google Drive, Dropbox and SharePoint, to name just a few. The ever-growing number of files has resulted in a problem I call digital hoarding. In many cases, digital hoarding has been caused by information that was mismanaged from the very beginning, while in other cases the availability of cheap storage has quickly led to the out-of-control proliferation of various repositories.
Manually reviewing and organizing the vast quantity of information that we’ve stored is a daunting and often impossible task. It is critical in the cleanup process that we identify and separate the files that require management from those that can be either deleted or just left alone. Using an automated tool that scans and groups your large volume of information is the best approach to accomplishing this time-consuming task. A good example is the ability of such tools to differentiate files that contain PII from those that are related to contracts. Understanding the universe of files and their associated groupings is a critical task when designing and implementing a content management solution.
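To make the idea of automated grouping concrete, here is a minimal Python sketch that sorts files into coarse groups based on their text. It is an illustration only: the patterns, keyword list, labels, and function names are all assumptions made for this example, and real discovery tools rely on far richer detection than a handful of rules.

```python
import re
from collections import defaultdict

# Hypothetical detection rules, chosen only for illustration.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # US SSN-like number
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")  # email address
CONTRACT_TERMS = ("agreement", "hereinafter", "terms and conditions")


def classify_text(text):
    """Return a coarse group label for a file's text content.

    PII is checked first, so a contract that also contains PII
    lands in the "pii" group (an assumption of this sketch).
    """
    if SSN_PATTERN.search(text) or EMAIL_PATTERN.search(text):
        return "pii"
    lowered = text.lower()
    if any(term in lowered for term in CONTRACT_TERMS):
        return "contract"
    return "unclassified"


def group_files(files):
    """Group a mapping of {filename: text} into {label: [filenames]}."""
    groups = defaultdict(list)
    for name, text in files.items():
        groups[classify_text(text)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = {
        "hr_record.txt": "Employee SSN: 123-45-6789",
        "vendor.txt": "This Agreement is made between the parties.",
        "menu.txt": "Lunch menu for Friday",
    }
    for label, names in group_files(sample).items():
        print(label, names)
```

Files that land in the "unclassified" group are the natural candidates for the delete-or-leave-alone decision described above.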
Once we have discovered and sorted the digital information into stacks that I call file groupings, we now need to add metadata that will make the information more valuable to the organization. This step in the process involves the organization of the information into document types that align with the organization’s business structure or taxonomy. It is critical that the end-users understand how to access the information. End-users want to work in a familiar environment and should not be forced to think differently when trying to access information.
Developing the document types and the metadata framework for each document type is often a long and tedious effort. Associating information with document types in a predefined taxonomy simplifies the assignment of metadata where it is needed. The maturing of automated tools, such as AI and machine learning, has improved the identification and assignment of metadata, resulting in more accurate, higher-quality information describing the content.
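A document type and its metadata framework can be pictured as a small schema. The Python sketch below is one possible shape, not a prescribed model: the `DocumentType` class, the `Legal/Contracts` taxonomy path, and the metadata field names are hypothetical examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentType:
    """A document type tied to a place in the business taxonomy."""
    name: str
    taxonomy_path: str          # hypothetical path, e.g. "Legal/Contracts"
    required_metadata: tuple    # metadata keys every document of this type needs


def missing_metadata(doc_type, metadata):
    """List required metadata keys that are absent or empty."""
    return [key for key in doc_type.required_metadata if not metadata.get(key)]


# Illustrative document type; the field names are assumptions.
CONTRACT = DocumentType(
    name="Contract",
    taxonomy_path="Legal/Contracts",
    required_metadata=("counterparty", "effective_date", "owner"),
)

if __name__ == "__main__":
    tagged = {"counterparty": "Acme", "owner": ""}
    print(missing_metadata(CONTRACT, tagged))
```

Checking for missing keys in this way gives a simple measure of how complete the metadata is before the information moves on to governance.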
Once you have your unstructured data organized, the next step will be to develop a governance program that defines and enforces security, consistency and retention policies. Information governance ensures that the unstructured data is available, accessible and reliable when needed for analysis.
Timely access to the right information is critical when making strategic decisions. Too many times the wrong information is used in making decisions or communicating with interested parties. Applying consistent policies to information provides users with the assurance that they can access and trust the information.
An effective information governance program will not only ensure that you have reliable information but will also ensure that you have implemented the right policies to keep your information protected and your executives out of jail.
The management step in the methodology refers to the information lifecycle that spans the creation to final deletion of information. The lifecycle phases cover creation, revision, approval, promotion, retention, and destruction.
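The lifecycle phases listed above can be modeled as a simple state machine. In this Python sketch the phase names come from the text, but the allowed transitions between them are an assumption made for illustration; each organization would define its own.

```python
from enum import Enum


class Phase(Enum):
    CREATION = 1
    REVISION = 2
    APPROVAL = 3
    PROMOTION = 4
    RETENTION = 5
    DESTRUCTION = 6


# Assumed transitions: drafts cycle through revision and approval,
# then move one way through promotion, retention, and destruction.
ALLOWED = {
    Phase.CREATION: {Phase.REVISION, Phase.APPROVAL},
    Phase.REVISION: {Phase.REVISION, Phase.APPROVAL},
    Phase.APPROVAL: {Phase.REVISION, Phase.PROMOTION},
    Phase.PROMOTION: {Phase.RETENTION},
    Phase.RETENTION: {Phase.DESTRUCTION},
    Phase.DESTRUCTION: set(),
}


def advance(current, target):
    """Return the new phase, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target


if __name__ == "__main__":
    phase = Phase.CREATION
    for step in (Phase.REVISION, Phase.APPROVAL, Phase.PROMOTION):
        phase = advance(phase, step)
    print(phase.name)
```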
Management will establish the information, security, and retention architecture. Defining an effective information architecture will ensure the unstructured data is secure and meets compliance and records management requirements.
Effective organization and management of your information creates an environment that gives your users better access to and control over their information. Applying retention policies to information will help users follow the rules for retaining and ultimately deleting information within a defined timeframe.
A good rule of thumb is that if there is no compliance or regulatory need to manage the information, and you never need to go back to the content for analysis or reporting, there is no need to keep the information.
The final step in the methodology is the ability to access and use the information for analysis and reporting. Having a well-defined information architecture enables fast, reliable access to the content. New content analytics tools have emerged that help uncover value buried in the information.
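The rule of thumb above, together with a time-based retention check, can be written down in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and measuring the retention period in whole years from the creation date is a simplification, since real policies often start the clock at a later event.

```python
from datetime import date


def should_keep(has_compliance_need, needed_for_analysis):
    """Keep information only if there is a compliance/regulatory need
    or it will be revisited for analysis or reporting."""
    return has_compliance_need or needed_for_analysis


def is_due_for_deletion(created, retention_years, today):
    """True once the retention period, counted from creation, has elapsed."""
    try:
        expires = created.replace(year=created.year + retention_years)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        expires = created.replace(year=created.year + retention_years, day=28)
    return today >= expires


if __name__ == "__main__":
    print(should_keep(False, False))
    print(is_due_for_deletion(date(2014, 9, 23), 7, date(2021, 9, 23)))
```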
About the Author: Alan Weintraub is a senior information management leader and evangelist at DocAuthority. As an AIIM Fellow, he is focused on helping organizations maximize the value of their information. A former industry analyst at Forrester and Gartner, Alan is a recognized expert on multiple aspects of enterprise information management (EIM) including information governance (both data and content governance), enterprise content management, data management, digital rights management, and digital asset management. Get in touch with Alan on LinkedIn and Twitter.