Government collects enormous amounts of data — about spending, property, public health, service delivery, compliance — and most of it stays locked up in formats that only an agency’s own systems can read, if it’s accessible at all. Open data is the practice of making that information actually usable: machine-readable, documented, free to download and build on.
I’ve been working on this problem in New Jersey for years, through OPRAmachine and through my work at DCA. Here’s what I’ve learned about what actually matters.
The gap between “released” and “useful”
The easiest form of open data compliance is publishing a PDF. It’s technically public, it’s theoretically accessible, and it’s nearly useless for anyone trying to do anything analytical with it. A PDF of a budget spreadsheet is not open data. A PDF of a meeting agenda is not open data. These count toward open data metrics and satisfy nobody.
The formats that matter are the ones you can do something with: CSV, JSON, GeoJSON, REST APIs. What they share is structure: consistent schemas, plus metadata that explains what each field means and when it was last updated.
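To make the distinction concrete, here's a minimal sketch of what "structured data with metadata" can look like in practice: a CSV export paired with a JSON sidecar documenting each field and the update date. The dataset, field names, and records are all hypothetical, purely for illustration.

```python
import csv
import io
import json
from datetime import date

# Hypothetical permit records -- field names are illustrative, not a real schema.
rows = [
    {"permit_id": "2024-0001", "municipality": "Trenton", "issued": "2024-03-01", "type": "electrical"},
    {"permit_id": "2024-0002", "municipality": "Newark", "issued": "2024-03-04", "type": "plumbing"},
]

# The machine-readable payload: plain CSV anyone can load into a script or spreadsheet.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()

# The sidecar metadata: what each field means, and when the file was last updated.
metadata = {
    "title": "Construction permits (sample)",
    "last_updated": date(2024, 3, 5).isoformat(),
    "fields": {
        "permit_id": "Unique permit identifier",
        "municipality": "Issuing municipality",
        "issued": "Issue date, ISO 8601",
        "type": "Permit category",
    },
}
metadata_json = json.dumps(metadata, indent=2)
```

A PDF of the same table carries none of this: no parseable rows, no field definitions, no freshness signal.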
Even then, data quality is the real problem. Inconsistent formats across agencies, missing records, information that’s months or years stale, no documentation on methodology — I’ve seen all of this in practice. Data quality is unglamorous and expensive to maintain, which is why it’s often where open data programs quietly fail.
What’s worth publishing
Not all data is equally valuable. The datasets that people actually use are the ones that connect to decisions they’re making: property records and permits, police incident reports, budget and spending data, contracts and vendors, election results, health inspection records. These are high-demand, high-impact, and usually politically uncomfortable — which is part of why they’re often the hardest to get published in useful form.
The agencies that have done this well tend to have someone internally who uses the data themselves and understands why format and freshness matter. The ones that do it badly tend to treat open data as a compliance checkbox.
OPRAmachine as an example
OPRAmachine has processed over 75,000 public records requests in New Jersey. One thing that became clear early on is that demand for government information is not the problem — people want this data, and they’re actively trying to get it. The friction is in the process. When you make requesting records easier and responses publicly visible, you don’t just help the person who filed the request; you create a searchable archive that helps everyone who comes after them.
That’s the compounding effect of open data done right. Transparency builds on itself.
For government officials
The practical starting point is a data inventory — what do you have, what format is it in, who else might want it. From there: prioritize by impact, not by ease. The data that’s easiest to publish is often the least useful. Start with what people are actually asking for, even if it requires more work to format correctly.
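An inventory doesn't need special tooling to start; even a flat list with a demand signal lets you sort by impact rather than ease. The datasets, request counts, and effort labels below are hypothetical:

```python
# Toy data inventory -- every entry here is invented for illustration.
inventory = [
    {"dataset": "meeting agendas", "format": "pdf", "requests_last_year": 12, "effort": "low"},
    {"dataset": "property records", "format": "mainframe export", "requests_last_year": 410, "effort": "high"},
    {"dataset": "health inspections", "format": "csv", "requests_last_year": 180, "effort": "medium"},
]

# Prioritize by demonstrated demand, not by how easy the dataset is to export.
queue = sorted(inventory, key=lambda d: d["requests_last_year"], reverse=True)
```

Sorted this way, the hardest-to-publish dataset lands at the top of the queue, which is exactly the point: the easy-to-export PDF agendas are the least requested.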
APIs over static downloads where you can. An API that stays current is worth more than a CSV that was accurate eighteen months ago.
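The core of a minimal read-only API is nothing exotic: parse the current source data on each request and serialize it as JSON, so responses always reflect what the agency actually holds. A sketch, with sample rows that are purely illustrative:

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse a CSV export into the list-of-objects shape a JSON API would return."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# A static download freezes at export time; an endpoint that re-reads the
# source on each request stays current. Sample data is hypothetical.
sample = "permit_id,municipality\n2024-0001,Trenton\n2024-0002,Newark\n"
records = csv_to_records(sample)
body = json.dumps(records)
```

Wrap that function in any web framework and you have an endpoint that is never older than its source, which is the whole advantage over a file uploaded once and forgotten.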
Resources
Gavin Rozzi is a civic technologist and government digital transformation leader based in New Jersey. His work on open data includes creating OPRAmachine, which has facilitated 75,000+ public records requests. Explore his New Jersey work or book him as a speaker on open data and government transparency.