About our Bay Area land use data sources

Municipal zoning defines much of the physical structure of our communities. This seemingly benign part of our local governance has become critical to our understanding of wealth and racial disparities in U.S. cities. Much of the current public debate and policy reform centers specifically on restrictive single-family zoning, where development is limited to detached "single-family" homes at a mandatory low density. We have only recently started to see the extent to which single-family zoning dominates our urban residential land.

In 2020, the Othering and Belonging Institute released its San Francisco Bay Area Residential Zoning Dataset for public use. It is an accessible and regionally comprehensive dataset of single-family residential zoning in the metro region.

This dataset includes shapefiles for all 101 municipalities and unincorporated areas of the nine Bay Area counties (Alameda, Contra Costa, Marin, Napa, San Francisco, San Mateo, Santa Clara, Solano, and Sonoma), with land designated as single-family residential, multi-family residential, or non-residential based on its zoning, or the best approximation of it, as of March 2020. It is publicly available; all shapefiles and methodology are published on the Othering and Belonging Institute’s GitHub page.

As part of the Othering and Belonging Institute’s reporting on segregation in the Bay Area, we examined how the proportion of cities’ residential land restricted to single-family zoning relates to community characteristics, including race, income, education, environmental health, and measures of segregation. To do so, we needed a regional accounting of not only the land restricted to single-family zoning, but also the fraction of residential land that it accounts for. The result is the first aggregated, regional view of cities’ current residential land designations in the Bay Area based on zoning regulation.
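To make that fraction concrete, here is a minimal sketch of the calculation in Python. The record layout, city names, and acreage figures are illustrative assumptions, not values from the dataset; only the three category labels mirror the designations described above.

```python
from collections import defaultdict

# Hypothetical zoned-area records: (city, category, acres).
# City names and acreages are made up for illustration only.
RECORDS = [
    ("City A", "single_family", 600.0),
    ("City A", "multi_family", 200.0),
    ("City A", "non_residential", 400.0),
    ("City B", "single_family", 150.0),
    ("City B", "multi_family", 350.0),
    ("City B", "non_residential", 500.0),
]

RESIDENTIAL = {"single_family", "multi_family"}


def single_family_share(records):
    """Return {city: share of residential land zoned single-family}."""
    single = defaultdict(float)
    residential = defaultdict(float)
    for city, category, acres in records:
        if category in RESIDENTIAL:
            residential[city] += acres
        if category == "single_family":
            single[city] += acres
    # Cities with no residential land are skipped to avoid dividing by zero.
    return {
        city: single[city] / total
        for city, total in residential.items()
        if total > 0
    }


shares = single_family_share(RECORDS)
for city, share in sorted(shares.items()):
    print(f"{city}: {share:.0%} of residential land is single-family")
```

Note that non-residential land is left out of the denominator, so the result describes the share of residential land only, not the share of all land in a city.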

We hope that this dataset helps other researchers overcome some of the challenges in residential zoning research in the Bay Area. Our simple categorization proved to be a complicated process of securing current data files, analyzing municipal code, and applying several decision heuristics. In pulling this dataset together, we worked through many of the same challenges that other housing and urban researchers face:

  • Zoning shapefiles are notoriously difficult to access. Zoning regulation is set at the municipal level; there is no centralized source for local zoning files at either the national or state level. While many municipal planning departments publish zoning code and maps, they rarely publish digital shapefiles that can be used for research. We collected zoning shapefiles from various sources, including the Association of Bay Area Governments, Esri ArcGIS Hub, and city planning departments by request.

  • Zoning code varies by city. Cities use their own zoning definitions and naming conventions. Understanding what type of housing is allowed or whether a zone is residential often requires a close reading of municipal code. We referenced published municipal code, general plans, and specific plans in order to determine allowable land use.

  • Zoning categories do not always clearly define allowable use. Many cities include a ‘planned development’ or ‘specific plan’ zoning category in their code. These areas often have ambiguous land use regulation: they provide flexibility for future zoning decisions or mixed use, enable other policies to drive development, or rely on special community planning processes. Categorizing these areas requires some compromise. We chose to sort these areas into zoning categories based on existing land use, referring to land use plans, satellite imagery, and building footprints.

  • Zoning is not static. Cities change their zoning code on different timeframes. We can’t totally overcome the challenge this poses to zoning research. Our zoning categories provide a snapshot of regional zoning that was public as of March 2020.
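The categorization steps described above can be sketched as a per-city lookup with a fallback for ambiguous zones. This is a simplified illustration, not the Institute's actual workflow: the city names, zone codes, the `existing_use` argument, and the `needs_review` label are all hypothetical.

```python
# Hypothetical per-city crosswalk from local zone codes to standard categories.
# Real codes differ by city, which is why the lookup is keyed by city first.
CROSSWALK = {
    "City A": {"R-1": "single_family", "R-3": "multi_family", "C-1": "non_residential"},
    "City B": {"RS": "single_family", "RM": "multi_family", "M": "non_residential"},
}

# Hypothetical codes for ambiguous zones (planned development, specific plan).
AMBIGUOUS = {"PD", "SP"}


def categorize(city, zone_code, existing_use=None):
    """Map a local zone code to a standardized category.

    Ambiguous zones fall back to the observed existing land use, if supplied.
    Anything still unresolved is flagged for manual review.
    """
    if zone_code in AMBIGUOUS:
        return existing_use or "needs_review"
    return CROSSWALK.get(city, {}).get(zone_code, "needs_review")


print(categorize("City A", "R-1"))
print(categorize("City B", "RM"))
print(categorize("City A", "PD", existing_use="multi_family"))
print(categorize("City B", "XYZ"))
```

Flagging unknown codes rather than guessing reflects the point that a close reading of municipal code is often needed before a zone can be placed in a category.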

Because of these challenges, the San Francisco Bay Area Residential Zoning Dataset can’t capture all of the nuance of urban land use. Residential zoning regulation defines what type of housing is allowable, not what is ultimately built; therefore, these zoning categories may not reflect what a street actually looks like today.

However, it does provide a useful standardization of zoning code across the many Bay Area local governments. Some other larger-scale data sources used for land use research, such as the Terner Center California Land Use Dataset and the Wharton Residential Land Use Regulation Index, rely on qualitative surveys of municipal planners rather than the zoning code itself. Commercial real estate data companies (such as CoreLogic, LandVision, and Zillow) sometimes include local zoning code labels at the parcel level, but those labels are not necessarily standardized across municipalities, and the source and timeliness of the zoning regulation are not disclosed. Analyzing an entire region with such data may therefore still require checking and interpreting local zoning code for some categories.

We think this zoning data could provide a useful lens for questions about our regional housing regulation, the spatial distribution of wealth and opportunity, and other social and demographic dynamics. Reach out to our research team if you have any questions about the dataset. If you end up using it for your own work, we’d love to hear about it.