Three Ways Open Data Could Make California Golden
While California is widely regarded as the high-technology hub of the world, the state lacks a cohesive open data policy. Last week, a non-profit think tank in the Golden State released a report detailing why the state government should adopt an open data policy.
According to the Milken Institute’s “Open Data in California,” the state has the potential to unleash powerful economic forces if it follows the lead set by the Federal Government and 10 other states that have implemented open data policies over the past four years, including Texas, Oklahoma, New York, and Maryland.
“California is hardly a pioneer on the open-data frontier,” write the report’s authors, Jason Barrett and Kevin Klowden. “More than any other state, California needs a unified open-data strategy for at least three major reasons.”
Despite being the home of Web giants like Google and Yahoo, enterprise tech giants like Hewlett-Packard and Cisco, software megavendors like Oracle and Adobe, and dozens upon dozens of big data startups, there is no unifying strategy in Sacramento to effectively harvest today’s digital commodity: data.
Barrett and Klowden point out that California has more companies (132) on the Open Data 500, a list of companies that make use of government data, than any other state. Open data flag bearers include Esri, the GIS software giant based in Redlands that utilizes data gathered by the United States Geological Survey (among other federal agencies); Crimespotting, a San Francisco organization that develops an app that lets people know about police activity in their neighborhoods; and OpenGov, a Mountain View company whose data visualization software transforms budgetary data from government clients into more readable charts and graphs.
Standardization and Transparency:
While Sacramento has not yet set a standard for how all of the state’s agencies should collect, store, and share data, that hasn’t stopped some of them, such as the Controller’s office, from adopting their own open data strategies in an ad hoc manner. The cities of San Francisco, San Jose, and San Diego have all launched their own open data portals, each of which ranked highly in the Milken Institute’s rankings.
Setting open data access standards would make it easier for journalists to track the actions of the government agencies they’re tasked with monitoring. Currently, the Fourth Estate labors under a patchwork of local Sunshine Acts, which often leaves the public with “vague or outdated information about how their tax dollars are spent,” the Milken report says.
California has a poor reputation in the business community as an overly regulated state that’s unfriendly to private enterprise. By providing a single reference point for agency cooperation and applicant communication, state officials can take a big step in battling that image, the authors write.
But the possibilities of open data extend far beyond permitting, the authors write. “Imagine if the Department of Transportation wanted to build a road through a wooded area with dense wildlife populations,” they say. “Architects could incorporate conservation efforts into their design by cross-referencing their plans with migratory patterns collected by the Department of Wildlife without having to submit official requests.”
If California were to adopt an open data initiative, what should it look like? According to the authors, it should look a lot like New York’s. Officials in that state created the New York State Open Data Handbook, a veritable “one-stop shop” for how to set up an open data initiative.
The New York handbook “details best practices for executing an open-data policy—website development recommendations, data standardization, and guidelines for participating agencies, among others—and also serves as a resource to help policymakers ask the right questions when crafting their own policies,” the authors write.
However, California’s open data initiative would look different, the authors state. For starters, there would be high demand for data generated under the California Environmental Quality Act (CEQA), as well as for seismological and oil and gas data. A well-crafted open data policy would anticipate public demand for this data, along with more traditional transparency-related data such as revenue and expenditure figures, the authors write.
Besides New York, there are other guides that policymakers in California could use to usher the Golden State into the open data promised land. The Milken Institute points to the California Economic Summit Open Data SOAR (Streamline Our Agency Regulations) Team, which described what an open data policy might look like in the state:
- Quality — For starters, the data should be high quality and vetted for accuracy whenever possible. (The New York data guidebook also provides numerous pointers for how data should be cleaned, the authors point out.)
- Security — The data should also be respectful of privacy and security concerns. That would mean there’s no personally identifiable information (PII) contained in the data.
- Well-Documented — Metadata should accompany the raw data whenever possible, providing a trail and a lineage of where that data came from.
- Up to Date — The data should be refreshed on a regular schedule. The authors cite Hawaii as a good model to follow here.
- Permanent — Public data would never die; instead, it would go into an ever-growing archive documenting the historical record.
- Searchable — All data should be searchable. That means no PDFs or image files, the authors say. CSV and JSON files would be good, though, they say.
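Several of these principles — scrubbing PII, attaching metadata that records a dataset’s lineage, and publishing in machine-readable formats like CSV and JSON rather than PDFs — are mechanical enough to sketch in code. The following Python snippet is a minimal illustration, not anything prescribed by the report; the agency records, field names, and metadata keys are all made up for the example.

```python
import csv
import io
import json

# Hypothetical raw records from an agency database; field names are
# illustrative only, not drawn from any real California dataset.
RAW_ROWS = [
    {"department": "Parks", "fiscal_year": "2014",
     "expenditure": "1200000", "contact_email": "jane.doe@example.gov"},
    {"department": "Transportation", "fiscal_year": "2014",
     "expenditure": "8500000", "contact_email": "j.smith@example.gov"},
]

# Columns that would count as personally identifiable information (PII).
PII_FIELDS = {"contact_email"}

def scrub_pii(row):
    """Drop PII columns before publication (the 'Security' principle)."""
    return {k: v for k, v in row.items() if k not in PII_FIELDS}

def to_csv(rows):
    """Serialize to CSV, a searchable format favored over PDFs or images."""
    clean = [scrub_pii(r) for r in rows]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=sorted(clean[0]))
    writer.writeheader()
    writer.writerows(clean)
    return buf.getvalue()

def to_json(rows, source, refreshed):
    """Serialize to JSON with a metadata envelope documenting the data's
    source and refresh date (the 'Well-Documented' and 'Up to Date'
    principles)."""
    return json.dumps({
        "metadata": {"source": source, "refreshed": refreshed},
        "records": [scrub_pii(r) for r in rows],
    }, indent=2)

if __name__ == "__main__":
    print(to_csv(RAW_ROWS))
    print(to_json(RAW_ROWS, source="Example Agency", refreshed="2014-07-01"))
```

A real portal would of course do far more (data validation, versioned archives, an API), but even this small sketch shows why flat, well-labeled formats beat scanned PDFs: the output can be parsed, filtered, and cross-referenced by any downstream program.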
Sounds great, right? But how much would this cost? Not as much as you think. Based on open-data labor statistics from New York, it would cost California just $4 million to $5 million to pay for the staffing required to develop, implement, and manage a fully functioning open-data policy, the authors state. And of course, there should be a chief data officer for the state, too.