October 8, 2012

Helping the Government Survive the Data Tsunami

Ian Armas Foster

If all the data that exists in the world were represented as 2-hour high-definition movies, it would take a human 47 million years to watch all of them. Statements like this sound ridiculous, but they emphasize the flood of data that is engulfing the world. Government agencies will soon have to figure out how to deal with it comprehensively.

A report from the TechAmerica Foundation’s Federal Big Data Commission aims to help government agencies identify the points of emphasis in the evolving big data world and to suggest a plan of action. The commission is headed by Steve Mills of IBM, Steve Lucas of SAP, and Michael Rappa of NC State, and includes representatives from industry-leading vendors such as Cloudera, NetApp, Amazon, and EMC, among others.

The report repeatedly emphasizes two things: education as a future driver of big data advancement, and guiding agencies toward viable solutions for dealing with the data overload.

What is frequently lost in the national job creation debate is exactly which areas of the economy can create the most jobs. Technology is currently one area where the available workforce does not meet industry demand, especially with regard to managing big data. The point is that there are plenty of data scientist jobs to be had if enough is invested in training and education.

Healthcare is a hot-button topic in this country due to its rising cost and apparent inefficiency. “Big data can help with that,” the report optimistically states. It should be no surprise that the digitization of health records has left the healthcare industry awash in data. According to the report, the industry produced 150 exabytes of data in 2009. It is safe to assume that number has increased significantly over the last three years.

The use cases for improving the efficiency of government agencies through big data are seemingly endless. From education to transportation to energy, big data can eventually be applied to almost anything. What is more interesting is what is actually being done to transform the data into insight.

While vendor representatives helped produce this report, the large number of participants ensures a relative lack of bias. Hadoop, for example, was identified as a good research-oriented tool that leaves a little to be desired for real-time analysis and streaming. According to the report, “Hadoop is good for finding a needle in the haystack among data that may or may not be ‘core data’ for an agency today. Hadoop and Hadoop-like technologies tend to work with a batch paradigm that is good for many workloads, but is frequently not sufficient for streaming analysis or fast interactive queries.”

The report notes some “big data accelerators,” such as text extraction tools, that can help meet faster or more varied demands.

Interestingly, the report recommends treating each of the three V’s (volume, velocity, and variety) as a separate entry point rather than tackling them all at once. “Some initiatives do indeed leverage a combination of these entry points, but experience shows these are the exception.”

This means taking a divide-and-conquer approach within the agency to attack each individual use case. For example, a use case requiring real-time streaming and decision making should focus on velocity, while a more research-intensive, time-independent use case can focus on greater variety or volume.

The report is lengthy, but it succeeds in giving the government some viable big data guidelines going forward.

Applications: Data Mining, Research Analytics, Visualization

Technologies: Cloud, Network, Storage, Systems

Sectors: Academia, Government

Vendors: IBM

Tags: amazon, big data, cloudera, data science, emc, IBM, netapp, SAP, TechAmerica, US Government
