The Open Data movement is about many things – transparency, accountability, even democracy – but at one level, it’s also about value for money. City, state and national governments spend taxpayers’ money to pay for data collection, whether they’re conducting a countrywide census, launching weather satellites, or tracking the movements of city buses. By simple logic, Open Data advocates have argued that taxpayers should have free, open access to the data they’ve paid for, with exceptions made for data that needs to be protected for privacy or security reasons.
Over the last several years, the U.S. and UK governments, the European Union, many American cities and states, and about 60 countries in the Open Government Partnership have developed programs to release government data as Open Data. Too often, though, the data they release doesn’t meet the needs of the people who want to use it – like businesses and investors who analyze corporate and financial data, watchdog groups and journalists studying government safety data, or entrepreneurs using data to build Web and mobile apps. A core problem is that government agencies have taken what I’d call a supply-side approach to Open Data: They’ve chosen what data to release without much input from the people it’s supposed to serve.
As a result, Open Data is often far less useful than it could be. Agencies may release too little data, missing the key information that is most important to data-driven organizations and businesses. Or they may release too much in too confusing a way. Daniel Kaufmann of Revenue Watch has called this the problem of “Zombie Data” – data that exists without purpose or any real use. Regardless of quantity, government Open Data may also be released with gaps, errors, and inconsistencies that bedevil anyone trying to use it.
At the GovLab at NYU, I’m now leading the Open Data 500 to identify and study five hundred companies that use government Open Data for their business. One of the most important parts of our work is asking these companies exactly which government datasets they’re using and how useful they find them to be. Our early results show that those questions are hitting a nerve. The comments we’re receiving – many of which describe data issues in detail – range from high praise for government data to high frustration.
We hope the Open Data 500 will provide a basis for a much-needed dialogue between data providers and data users in the U.S. While this country has been a world leader in Open Data, we’ve been behind the curve in establishing this kind of critical feedback loop. Several countries are already building user engagement into their Open Data policies. For example:
- In the UK, the government set up an Open Data User Group early on to provide ongoing feedback, and often constructive criticism, on the usefulness of government data.
- The French government, which already provides Open Data through Data.gouv.fr, has set up a task force that will hold a series of “debates” with key stakeholders on health, housing, education, and other major kinds of Open Data. The government is also setting up an expert network to engage citizens, researchers, and civil society organizations in setting Open Data policy.
- At the Open Government Partnership conference, I learned how Mexico is now using a website they call a “datatron” to gather input on what Open Data the public would most like to see. Ania Calderon, who directs digital innovation for the office of the President of Mexico, told me that this polling system sparked a national online discussion within a week of launch.
- In a recent conversation, Bunmi Okunowu of Nigeria’s Ministry of Communication Technology told me that his country is planning a public forum to gather input on Open Data early in 2014.
The new Open Data Policy in the U.S., which was announced last May, explicitly calls for government agencies to get feedback from data users to improve their Open Data programs. That’s a welcome and important step. The policy also calls for agencies to release inventories of all the data they have so that people who want to use specific datasets can ask for them to be made available in especially useful ways. That strategy can work locally as well as nationally: The city of Chicago has become the first U.S. city to publish such an inventory, under the direction of former Chicago CTO John Tolva, who recently visited the GovLab.
The U.S. Open Data Policy will make effective user feedback more important than ever. The federal government maintains thousands of different information systems, structured in different ways. All that data can’t be turned into Open Data overnight. The only way to release more Open Data effectively, efficiently, and in a way that returns value to taxpayers will be to prioritize the datasets that will provide the most public good.
We need what I’ve called a system of Demand-Driven Data Disclosure that engages Open Data’s stakeholders as shown in the diagram above: A new system where experts, businesses, public-interest groups, and other data users are key players in determining our national Open Data priorities. The Open Data Policy sets the stage, and our Open Data 500 project will provide initial findings for discussion. A good dialogue now can make Open Data more useful, valuable, and impactful in the United States.
- Joel Gurin, Founder and Editor, OpenDataNow.com