Five years ago, I released zipcodeR to solve a problem I kept running into: working with ZIP code data in R was harder than it should be. The package started as a tool for my own projects—analyzing public records requests, mapping service areas, connecting demographic data to geographic boundaries.
Today, zipcodeR has been cited in 53+ peer-reviewed publications across medicine, economics, environmental science, and information systems. It’s been downloaded over 115,000 times from CRAN, averaging roughly 1,500 downloads per month. And this month, it was cited in MIS Quarterly—one of the Financial Times Top 50 journals.
From Utility Package to Research Infrastructure
The growth caught me off guard. What started as a solution for my own workflow is now enabling research I never anticipated.
In healthcare, researchers have used zipcodeR to study telemedicine access patterns, the reach of pediatric teledermatology into underserved communities, and cardiovascular care disparities. Environmental scientists have applied it to air pollution studies and wastewater disease surveillance. Economists have used it for everything from COVID-19 behavioral response analysis to environmental valuation research.
The MIS Quarterly citation is from a field experiment on charitable crowdfunding—researchers needed ZIP code infrastructure to test whether geographic personalization affects donor behavior. They found it does, and zipcodeR provided the geographic matching they needed at scale.
What These Citations Represent
When I look at the full citation list on Google Scholar, a few things stand out:
Range. The citations span Nature, Cancer Discovery, European Economic Review, Ecological Economics, Journal of the American Heart Association, and dozens of others. Researchers across disciplines needed the same thing: a reliable way to work with ZIP code data.
Practical research questions. These aren’t theoretical papers. They’re studies on telemedicine access in rural areas, health disparities in underserved communities, and environmental impacts on public health. The kind of research that informs policy.
Infrastructure role. Most citations use zipcodeR the way you’d use any infrastructure—it’s a component that makes the research possible, not the focus of the research itself. That’s exactly what I wanted when I built it.
Building Tools That Last
For those of us working in government technology, civic tech, or public sector data science, there’s a lesson here. The tools we build for our own work can have reach far beyond our immediate context—if we make them available.
I built zipcodeR because I needed it. I open-sourced it because that’s how I think software should work. Five years later, it’s being used in cancer research, public health surveillance, and economic policy analysis. That’s not a path I planned, but it’s one I’m glad exists.