Five years ago, I released zipcodeR to solve a problem I kept running into: working with ZIP code data in R was harder than it should be. The package started as a tool for my own projects—analyzing public records requests, mapping service areas, connecting demographic data to geographic boundaries.
Today, zipcodeR has been cited in 53+ peer-reviewed publications across medicine, economics, environmental science, and information systems. It’s been downloaded over 115,000 times from CRAN, averaging roughly 1,500 downloads per month. And this month, it was cited in MIS Quarterly—one of the Financial Times Top 50 journals.
The citation growth tells its own story: 4 citations in 2022, 7 in 2023, then 18 each in 2024 and 2025. What started as a niche utility has become infrastructure for research I never anticipated.
A Sample of the Research
Looking through the full citation list on Google Scholar, the breadth is striking. Here’s a tour through what researchers are doing with zipcodeR.
Healthcare & Clinical Research
Nature Communications (2025): D’Antonio et al. built a multiethnic skin cancer risk prediction model using XGBoost, with zipcodeR providing the geographic data infrastructure for their analysis.
Cancer Discovery (2025): Farooq, Sharon, and Takebe analyzed demographic patterns across NCI-sponsored early-phase clinical trials from 2000 to 2023, using ZIP code data to understand who was—and wasn’t—being reached.
Surgery (2024): Finn et al. studied same-day versus overnight thyroidectomy delivery models, with ZIP code analysis informing their understanding of access patterns.
Journal of the American Heart Association (2024): Zaidi et al. examined the relationship between social determinants and continuity of care for patients with congenital heart disease, using ZIP code demographics to quantify care gaps.
Telemedicine & Healthcare Access
JAMIA Open (2024): Cummins et al. conducted a comparative effectiveness study and found that telemedicine appointments had 64% higher odds of completion than in-person appointments. They used zipcodeR to extract population density, land area, and median income for each ZIP code in their analysis.
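For readers curious what that kind of lookup looks like, here is a minimal sketch. It is not the authors' code: the `appointments` table and its ZIP codes are made up for illustration, and it assumes a recent zipcodeR release in which `reverse_zipcode()` accepts a vector of ZIP codes and returns the `population_density`, `land_area_in_sqmi`, and `median_household_income` columns.

```r
library(zipcodeR)

# Hypothetical appointment records (illustrative only).
# ZIP codes are stored as character so leading zeros are preserved.
appointments <- data.frame(
  appointment_id = 1:3,
  zipcode = c("08731", "10001", "66044"),
  stringsAsFactors = FALSE
)

# Look up ZIP-level attributes from the data bundled with the package.
zip_features <- reverse_zipcode(unique(appointments$zipcode))

# Keep only the three covariates named in the study description.
zip_features <- zip_features[, c(
  "zipcode",
  "population_density",
  "land_area_in_sqmi",
  "median_household_income"
)]

# Attach the ZIP-level covariates to each appointment record.
analysis_data <- merge(appointments, zip_features, by = "zipcode", all.x = TRUE)
```

Because the lookup runs against data shipped with the package, a join like this needs no API key or network call at analysis time.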
Other telemedicine researchers have used the package to study the reach of pediatric teledermatology in underserved communities and access to mammography screening in rural Kansas (using the `zip_distance()` function to compute distances to screening facilities).
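A distance calculation of that kind might look like the sketch below. The patient and facility ZIP codes are hypothetical picks in Kansas, not data from the study. The sketch assumes `zip_distance()` returns a data frame with `zipcode_a`, `zipcode_b`, and `distance` columns, with distance measured between ZIP code centroids in miles by default. It calls the function one pair at a time so it does not rely on vectorized input.

```r
library(zipcodeR)

# Hypothetical inputs (illustrative only): three patient ZIP codes
# and one screening facility ZIP code, all in Kansas.
patient_zips <- c("66044", "67501", "67801")
facility_zip <- "66603"

# Compute the distance from each patient ZIP code to the facility,
# one pair per call, then stack the results into a single data frame.
distances <- do.call(
  rbind,
  lapply(patient_zips, function(z) zip_distance(z, facility_zip))
)

# Order from nearest to farthest.
distances <- distances[order(distances$distance), ]
```

These are straight-line distances between centroids, so they understate real travel distance, which matters most in rural areas with sparse road networks.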
Public Health & Epidemiology
Substance Use & Misuse (2024): Deo et al. studied harm reduction drug policy effectiveness in Cuyahoga County, Ohio, using ZIP code data to map overdose prevention strategies.
Virus Evolution (2025): Veytsel et al. traced raccoon rabies molecular epidemiology across Connecticut, with geographic analysis at the ZIP code level.
ACS ES&T Water (2024): Rosengart et al. studied spatiotemporal variability in wastewater disease surveillance biomarkers, using ZIP code boundaries to structure their analysis.
Economics & Information Systems
MIS Quarterly (2026): This one deserves special attention. Rhue, Avery, and Clark conducted a large-scale randomized field experiment with nearly 160,000 donors on a major charitable crowdfunding platform. They tested whether geographic personalization—highlighting projects in the donor’s billing location—affects giving behavior.
The finding: local personalization increased engagement and donations overall, even among donors without prior evidence of home bias. But it also disproportionately directed funds toward affluent communities, reinforcing a “rich-get-richer” dynamic. zipcodeR provided the geographic matching infrastructure that made this experiment possible at scale.
European Economic Review (2023): Researchers analyzed COVID-19 behavioral response patterns using ZIP code-level demographic and economic data.
Ecological Economics (2025): Environmental valuation research used ZIP code data to connect economic analysis to geographic context.
Environmental & Urban Science
Social Networks (2025): Livas et al. mapped collaborative networks among environmental stewardship organizations in Baltimore, using ZIP code data to understand geographic patterns in urban environmental work.
And Even Paleontology
Palaeobiodiversity and Palaeoenvironments (2024): Smith, Rabenstein, and O’Keefe studied fossil preservation conditions at Palaeolake Messel. Yes, paleontology—demonstrating that when you build good geographic infrastructure, it finds uses you never imagined.
What These Citations Represent
A few patterns stand out across these 53+ publications:
Practical research questions. These aren’t theoretical papers. They’re studies on telemedicine access in rural areas, clinical trial representation gaps, health disparities in underserved communities, and environmental impacts on public health. The kind of research that informs policy.
Infrastructure role. Most citations use zipcodeR the way you’d use any infrastructure—it’s a component that makes the research possible, not the focus of the research itself. Researchers needed reliable ZIP code data, and zipcodeR provided it without external API dependencies or data licensing complications.
Unexpected reach. When I built a tool for working with ZIP codes, I didn’t anticipate it would end up in paleontology journals. But that’s the nature of good infrastructure—it finds uses beyond what you planned.
Building Tools That Last
For those of us working in government technology, civic tech, or public sector data science, there’s a lesson here. The tools we build for our own work can have reach far beyond our immediate context—if we make them available.
I built zipcodeR because I needed it. I open-sourced it because that’s how I think software should work. Five years later, it’s being used in cancer research, public health surveillance, and economic policy analysis. That’s not a path I planned, but it’s one I’m glad exists.
Resources
- zipcodeR on CRAN
- Documentation
- GitHub Repository
- Software Impacts Publication
- All Citations on Google Scholar
Gavin Rozzi is a data scientist and government technology executive based in New Jersey. In addition to zipcodeR, he has created several other open source tools including njgeo and njtr1 for working with New Jersey data.