Is the new open data directive transformative, or will bureaucratic inertia win out?
Making federal government data available to the public in useful formats is a daunting knowledge management challenge. But federal CIO Steve VanRoekel and CTO Todd Park remain steadfast in their efforts to make open and interoperable the default setting for federal data. Doing so will require a cultural transformation as well as a change in IT procurement.
In May 2013, the White House released an executive order titled "Making Open and Machine Readable the New Default for Government Information." The order was accompanied by a new open data policy document from the Office of Management and Budget and an online repository of tools and case studies called Project Open Data.
President Obama's order states that "government information shall be managed as an asset throughout its life cycle to promote interoperability and openness, and, wherever possible and legally permissible, to ensure that data are released to the public in ways that make the data easy to find, accessible and usable."
The policy requires federal agencies to create an enterprise data inventory and a public listing of that inventory, and to make data sets available to the public whenever possible.
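To make the distinction between the two lists concrete, here is a minimal sketch of how an internal inventory might be filtered into a public listing. It is an illustration only, not the format the policy prescribes: the record fields, the access-level labels and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class DatasetRecord:
    """One entry in a (hypothetical) enterprise data inventory."""
    title: str
    description: str
    access_level: str  # assumed labels: "public", "restricted", "non-public"


def public_listing(inventory):
    """Return only the inventory entries marked as releasable to the public."""
    return [asdict(rec) for rec in inventory if rec.access_level == "public"]


# Hypothetical sample inventory; none of these entries are real data sets.
INVENTORY = [
    DatasetRecord("Example data set A", "Placeholder public data", "public"),
    DatasetRecord("Example data set B", "Contains personal identifiers", "non-public"),
    DatasetRecord("Example data set C", "Shared under agreement only", "restricted"),
]

# Serialize the public subset as JSON, one plausible machine-readable format.
print(json.dumps(public_listing(INVENTORY), indent=2))
```

The full inventory stays internal and complete, while the public listing is derived from it, so a data set withheld for privacy or security reasons is still accounted for inside the agency.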
People whose work focuses on government transparency responded enthusiastically. The consensus seems to be that this could bring a sea change in how the public accesses government information if agencies are required to comply. The White House released positive comments from several open government advocates. "For too long, valuable public information has been locked away in file cabinets and poorly designed IT systems," said Sean Moulton, director of Open Government Policy at the Center for Effective Government. "Today's policy points a new way forward and takes concrete steps to make public information open by default."
The compliance issue
But whether the new policy proves transformative depends on implementation, a huge KM challenge that requires both the will and the resources to make it happen at the agency level.
"Looking at it broadly, this is a big deal," says Daniel Schuman, policy director for Citizens for Responsibility and Ethics in Washington (CREW). "However, there is a lot of wiggle room in terms of compliance."
In particular, observers will be watching how agencies decide which data sets to make public and which to restrict.
"An agency might have an Excel spreadsheet with one field that has a ZIP code or Social Security number and it could use that as an excuse to not disclose data sets," Schuman says. Other agencies and their staff members may decide that they are going to be around longer than this administration and just slow-walk the effort and wait them out. Theoretically, OMB and the White House control agency budgets and can penalize them for not cooperating, Schuman says, but "on a practical level, there is not a whole lot they can do."
Amy Bennett, assistant director at Open the Government, a coalition of organizations, notes that this is not the first time agencies have been required to create data inventories. The administration's earlier open government directive required that, too. The new policy provides additional guidance. "It just shows that it is really hard to do," she says. "They have so many business units that creating the catalogs is challenging."
"It is not that people are opposed to complying, but if it isn't part of their core mission, they tend to put it on the back burner because they don't see the value of doing it," says Michael Daconta, VP of advanced technology at InCadence Strategic Solutions and the former metadata program manager for the U.S. Department of Homeland Security (dhs.gov). "That is why the visibility of the program is important and how much pressure OMB and the White House puts on agencies to make it a priority."
Resources for the job
Since 2005 or 2006, Daconta adds, large federal agencies have been working on metadata catalogs, but different agencies are at different levels of maturity. "Something like this executive order puts it in the limelight for a while, but smaller agencies may be at lower levels of maturity and when budget cuts hit, often these positions are among the first to be cut," he says. "Sometimes they will have a project team of five to seven people whittled down to just one. So the interest and emphasis waxes and wanes."
Noel Dickover, an independent consultant working on technology and development projects within the federal government, wrote on his blog that creating and maintaining an enterprise data inventory is a massive undertaking. "It has been pointed out that to effectively execute this policy, an agency must devote real resources to create an enterprise data repository, start the public data listing, create a process for engaging with customers, develop communication strategies, engage with entrepreneurs and innovators in the private and nonprofit sectors, all while ensuring privacy and national security interests are protected. All of this during an era of sequester with no new funding for this task," Dickover wrote.
Daconta agrees that what OMB is asking is a heavy lift for agencies. Government is good at creating councils and committees and holding meetings to study how to get at this data, he says, adding, "But the question is, who is at that meeting: someone with the authority to get business sub-units in line or just a placeholder with no authority?"
Useful goals
Departments will start to work on data asset inventories and public data listings, as the executive order requires, but what is tougher is measuring the completeness of those data sets. Daconta says, "In some large agencies, they may not know all the data sets they have, and getting business units to share can be difficult."
CREW's Schuman says agencies don't have sound reasons for not creating data inventories. "Putting aside the benefits to the public, I would argue that if they don't have a data inventory, they aren't serving the needs of the agency. If you don't know what you have, you may be duplicating efforts and wasting resources," he says.
Another challenge is measuring the impact of making data public. What would you measure and how? The Office of Management and Budget is creating a cross-agency team to look at performance goals. "It can't just be how many data sets are made public," Bennett says. "Some agencies broke it down by year to make the number of data sets sound impressive but not any more helpful to the public. They will study how to set up performance goals that are actually useful."
Bennett notes that among the agencies that have made the most progress so far are Health & Human Services, Transportation, the General Services Administration and Education. "Some agencies are just used to being outward facing and used to working with the public," she says.
Change at HHS
Bryan Sivak, CTO at the Department of Health & Human Services, has made it his focus to leverage the department's underutilized assets, and the first one he focused on is data. For the last several years, Sivak and his predecessor Todd Park, now the federal CTO, have looked at all the data HHS has, curates or collects with an eye toward making it available to people outside government. An effort started three years ago to give 45 private-sector programmers access to 10 HHS data sets led to the development of 21 different applications, Sivak said at a recent Converge health IT summit meeting in Philadelphia. That effort has ballooned into an annual Health Datapalooza conference in Washington that draws more than 2,000 people. You can read more about those efforts at HealthData.gov.
Sivak referenced the presidential executive order mandating open data. "This movement has visibility and drive from the highest levels of government," he said. "We have been successful in liberating data that has been locked up in silos." The Centers for Medicare and Medicaid Services (CMS) has traditionally been protective of its data, and there are still some recalcitrant holdouts. But, he added, HHS is releasing new data sets all the time, such as hospital pricing data, which had 200,000 downloads in the first week.
In an attempt to help outside developers take better advantage of the data, HHS is partnering with Codecademy, which will hold online classes focused on how to use HHS data sets. "We have changed the default setting in HHS," he said, "so that people think of releasing data as what they should do."