Data Science & Analytics

Open-source tools cited in Nature, Cancer Discovery, and 47 other peer-reviewed publications

The Problem

Researchers spend hours wrestling with geographic data that should take minutes. ZIP code lookups require expensive APIs. Distance calculations need manual geocoding. Demographic overlays mean stitching together multiple data sources. Time that should go toward research goes toward data prep instead.

Gavin Rozzi builds open-source tools that eliminate this friction. His R package zipcodeR has been cited in 49 peer-reviewed publications, including Nature and Cancer Discovery, enabling breakthrough research in public health, epidemiology, and environmental science.

Technical Skills

R Programming

Package development, statistical analysis, data visualization with ggplot2, and Shiny applications.

Geospatial Analysis

GIS mapping, spatial statistics, boundary analysis, and geocoding with sf and leaflet.

Statistical Modeling

Regression analysis, time series, spatial autocorrelation, and predictive modeling.

Data Visualization

Interactive dashboards, publication-quality graphics, and data storytelling.

Data Science Projects

NJ HOMES Choice Tool

Interactive planning tool implementing A4/S50 affordable housing calculations for all 564 municipalities, supporting NJ HOMES grantmaking and municipal compliance.

affordable housing, planning tool, compliance

Operation Right Answer: Housing Services Modernization

Led development of a data-driven service framework to modernize how New Jersey housing programs serve residents, partnering with the New Jersey Innovation Authority to implement AI-assisted contact center technology.

Amazon Connect, Government Innovation, Customer Service

NJ Eviction Guide

Interactive self-help tool connecting New Jersey's most vulnerable residents directly to housing assistance and legal resources.

React, TypeScript, Public Service

Municipal Lead Reporting Portal

Statewide compliance platform capturing residential lead-paint inspection data for all 564 New Jersey municipalities.

Data Collection, Compliance, GIS

New Jersey Population Density Map

Award-winning 3D visualization of New Jersey population density using rayshader, winning First Place in the 3D category at the NJ DEP GIS Mapmaking Contest.

data visualization, R, GIS

njgeo

An R package for geocoding addresses using New Jersey's official geocoding service, freely available as an alternative to commercial solutions.

R, open source, geospatial

njtr1

An R package that makes it easy to download and analyze New Jersey motor vehicle crash data for transportation research and safety analysis.

R, open source, transportation

TrentonTracker

A modern Progressive Web App making New Jersey legislative data accessible and searchable, with ZIP code-based legislator lookup.

civic tech, transparency, JavaScript

COVID-19 Spread Visualization

Interactive web-based visualization tracking the spread of COVID-19 across U.S. counties using advanced GIS technologies and real-time data processing.

GIS, data visualization, public health

NJ Narcan Dashboard

An interactive dashboard tracking opioid overdose interventions across New Jersey through law enforcement Narcan deployment data.

public health, data visualization, R

zipcodeR

An R package with 49 peer-reviewed citations including Nature and Cancer Discovery, enabling breakthrough research in public health, epidemiology, and environmental science.

R, open source, geospatial

OPRAmachine

New Jersey's first statewide freedom of information platform, processing over 75,000 public records requests and releasing 250GB of government data.

civic tech, government transparency, public records

Frequently Asked Questions

What is zipcodeR and what can it do?

zipcodeR is an R package for working with U.S. ZIP code data, created by Gavin Rozzi. It provides functions for looking up ZIP code information, calculating distances between ZIP codes, finding ZIP codes within a radius, and accessing demographic data. The package is downloaded over 10,000 times monthly.
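To make the distance and radius features concrete, here is a minimal Python sketch of the kind of calculation such lookups typically rest on: a great-circle (haversine) distance between two centroid coordinates, plus a filter that keeps every code within a given radius. This is an illustration under stated assumptions, not zipcodeR's actual API or implementation. The centroid table, its labels, and the helper names `distance_between` and `codes_within` are all hypothetical, and the coordinates are made up rather than taken from real ZIP code data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, in miles

# Hypothetical centroid table: placeholder labels and made-up coordinates,
# NOT real ZIP code data. Each value is (latitude, longitude) in degrees.
CENTROIDS = {
    "ZIP_A": (40.00, -74.00),
    "ZIP_B": (40.10, -74.00),
    "ZIP_C": (41.00, -75.00),
}


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def distance_between(code_a, code_b, centroids=CENTROIDS):
    """Distance in miles between the centroids of two codes (hypothetical helper)."""
    return haversine_miles(*centroids[code_a], *centroids[code_b])


def codes_within(lat, lng, radius_miles, centroids=CENTROIDS):
    """Sorted list of codes whose centroid lies within radius_miles of a point."""
    return sorted(
        code
        for code, (c_lat, c_lng) in centroids.items()
        if haversine_miles(lat, lng, c_lat, c_lng) <= radius_miles
    )


if __name__ == "__main__":
    print(round(distance_between("ZIP_A", "ZIP_B"), 2))
    print(codes_within(40.00, -74.00, radius_miles=10))
```

Centroid-to-centroid distance is only an approximation of how far apart two ZIP code areas are, since ZIP codes cover irregular regions. That trade-off is usually acceptable for research workflows that need a quick proximity measure rather than routing-grade accuracy.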

What programming languages does Gavin Rozzi use for data science?

Gavin Rozzi primarily uses R for data science, including package development, statistical analysis, and data visualization with ggplot2. He also works with SQL, Python, and various GIS tools for geospatial analysis.

What is geospatial analysis and how is it used?

Geospatial analysis involves working with geographic data to understand spatial patterns and relationships. It includes GIS mapping, spatial statistics, boundary analysis, and geocoding. Applications include demographic research, public health analysis, and government resource allocation.

How can data science improve government decision-making?

Data science improves government decision-making by providing evidence-based insights from public data. This includes analyzing demographic trends, measuring program effectiveness, optimizing resource allocation, and creating predictive models for policy planning.

What makes Gavin Rozzi's approach to data science unique?

Gavin Rozzi combines technical data science skills with deep domain expertise in government and public policy. He creates open-source tools used by researchers nationwide, publishes peer-reviewed research, and applies data science to solve real-world problems in civic technology.

About the Author

Gavin Rozzi

Gavin Rozzi is a civic technologist, data scientist, and digital transformation executive based in New Jersey. He leads technology initiatives at the NJ Department of Community Affairs and has created widely used open-source tools including OPRAmachine and zipcodeR.