Assembling Census Summary Data that is Geographically Consistent Over Time

Relevant indicators (city and metro area data only):

People of color
Race/ethnicity
Population growth
Racial generation gap
Diversity index

One unique feature of the Atlas is that all data presented are geographically consistent over time, meaning that the data for any geography are assembled to reflect the same geographic boundary in all years reported. While data from the decennial census summary files for 1980 through 2010 (and subsequent ACS summary files) are already geographically consistent for all states and for the U.S. as a whole, the same is not true for regions (i.e., metropolitan areas) and cities. As noted above, the definition of regions used in the Atlas reflects the OMB’s December 2003 CBSA definitions, while cities are defined by the incorporated places used in the 2010 Census (with the exception of the City and County of Honolulu, which covers all of Honolulu County).

Assembling geographically consistent census summary data for regions was relatively straightforward given that regions are defined as one or more counties grouped together. While there have been several county name and Federal Information Processing Standard (FIPS) code changes since 1980, there have been few changes to county boundaries themselves, with most of them occurring in Alaska and outside the sole Alaskan metro area included in the Atlas (Anchorage, AK). Thus, an underlying database of county-level summary files along with county-to-metro-area geographic crosswalks (based on the OMB’s December 2003 definitions) was sufficient to summarize data for regions in each year.
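The county-to-region aggregation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the Atlas: the function name, the crosswalk entries, the county and region codes, and all counts are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical crosswalk: county code -> region (CBSA) code.
# All codes and counts in this example are made-up placeholders.
COUNTY_TO_CBSA = {
    "C001": "M01",
    "C002": "M01",
    "C003": "M02",
}


def aggregate_to_regions(county_counts, crosswalk):
    """Sum county-level counts into region totals for one census year.

    county_counts maps county code -> {category: count}.
    Counties absent from the crosswalk (non-metro counties) are skipped.
    """
    regions = defaultdict(lambda: defaultdict(int))
    for county, counts in county_counts.items():
        region = crosswalk.get(county)
        if region is None:
            continue
        for category, n in counts.items():
            regions[region][category] += n
    return {region: dict(totals) for region, totals in regions.items()}


counties_1980 = {
    "C001": {"white": 100, "black": 40},
    "C002": {"white": 50, "black": 10},
    "C003": {"white": 70, "black": 30},
    "C999": {"white": 5, "black": 5},  # not part of any region
}
regions_1980 = aggregate_to_regions(counties_1980, COUNTY_TO_CBSA)
```

Because the same fixed crosswalk is applied to every year, each region total refers to the same set of counties throughout, which is what makes the series geographically consistent.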

Assembling geographically consistent census summary data for the 100 largest cities was a bit more complicated, given that there have been more changes to the geographic boundaries of the 100 largest cities since 1980 and that population counts by race/ethnicity are generally not available at a very detailed level of geography in 1980 and 1990. However, we were able to find sufficient data on people by race/ethnicity and age at the census “place” level of geography from the decennial census summary files for 1980 through 2010 (and subsequent ACS summary files). Census places consist mostly of incorporated cities and unincorporated areas referred to as Census Designated Places (CDPs). While they do not provide full geographic coverage of the U.S., as do other census geographies such as census tracts, they are generally areas of population clustering and thus tend to be geographically compact. They also represent the vast majority of the population in urban areas, making them a suitable choice of geography to use as a “building block” to assemble historical data for the 2010 boundaries of the 100 largest cities in the U.S.

The specific approach we took was to identify all of the 100 largest cities that had significant boundary changes since 1980 by examining their boundaries in each decade in a Geographic Information System (GIS), using shapefiles for the census place level of geography for 1980, 1990, 2000, and 2010 from the National Historical Geographic Information System (NHGIS). For any of the 100 largest cities that had expanded since 1980 (through annexation), we identified all places in 1980, 1990, and 2000 for which most of the land area was covered by the 2010 boundaries of the city in question. Such places were assumed to have been annexed, and this was confirmed where possible through media reporting and public documents found in web searches. In each decadal year (1980 through 2000), data for the annexed places were combined with data for the city (among the 100 largest) that annexed them to produce historical data for the 100 largest cities based on their 2010 boundaries.
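The annexation rule above can be illustrated with a short sketch. It assumes that the land area of each historical place and the portion of that area falling inside the 2010 city boundary have already been measured in a GIS overlay, and it reads "most of the land area" as more than half, which is an assumption on our part here. The function names, place names, and counts are hypothetical.

```python
MAJORITY_SHARE = 0.5  # assumption: "most" is read as more than half of land area


def annexed_places(places, threshold=MAJORITY_SHARE):
    """Return names of historical places mostly covered by the 2010 city boundary.

    Each place is a dict with 'name', 'area' (land area of the historical
    place) and 'overlap' (the part of that area inside the 2010 boundary),
    both in the same units, as would come from a GIS overlay.
    """
    return [
        p["name"]
        for p in places
        if p["area"] > 0 and p["overlap"] / p["area"] > threshold
    ]


def combine_counts(city_counts, place_counts, annexed):
    """Add counts for annexed places to the city's own counts for one year."""
    total = dict(city_counts)
    for name in annexed:
        for category, n in place_counts[name].items():
            total[category] = total.get(category, 0) + n
    return total


# Made-up example values for a single census year.
places_1980 = [
    {"name": "Place A", "area": 10.0, "overlap": 9.5},  # mostly inside
    {"name": "Place B", "area": 8.0, "overlap": 1.0},   # mostly outside
]
counts_1980 = {"Place A": {"total": 1200}, "Place B": {"total": 900}}

annexed = annexed_places(places_1980)
city_1980 = combine_counts({"total": 50000}, counts_1980, annexed)
```

The sketch covers only the automated screening step; the confirmation through media reporting and public documents described above remains a manual check.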

The one city that required special treatment was Louisville/Jefferson County metro government (balance), KY. In this case, the city of Louisville annexed all of Jefferson County except for other incorporated places (i.e., the “balance” of the county after removing incorporated places). Because much of the area that was annexed was not covered by census places that existed in 1980, 1990, and 2000, we determined that a more accurate approach to estimating the historical (pre-2010) data for the city based on its 2010 boundaries was to start with data for all of Jefferson County in each year and subtract out the data for all existing census places in 1980, 1990, and 2000 whose boundaries were found to be inside Jefferson County but mostly outside the 2010 boundaries of Louisville/Jefferson County metro government (balance). This was done using GIS software and the same NHGIS shapefiles noted above, identifying historical census places as being “outside” of the 2010 boundaries of Louisville/Jefferson County metro government (balance) if their centroids (i.e., a point representing their geographic center) fell outside of those boundaries.
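The subtraction approach can be sketched as follows. This is a simplified illustration rather than the GIS workflow actually used: boundaries are treated as simple polygons without holes in planar coordinates, the centroid and point-in-polygon routines are textbook implementations standing in for GIS software, and all function names, shapes, and counts are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(point, vertices):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def balance_counts(county_counts, places, city_boundary):
    """Start from county totals and subtract every historical place whose
    centroid falls outside the 2010 city boundary."""
    total = dict(county_counts)
    for place in places:
        centroid = polygon_centroid(place["boundary"])
        if not point_in_polygon(centroid, city_boundary):
            for category, n in place["counts"].items():
                total[category] = total.get(category, 0) - n
    return total


# Made-up geometry: a square "2010 city boundary" and two small places.
city_2010 = [(0, 0), (10, 0), (10, 10), (0, 10)]
places_1990 = [
    {"boundary": [(2, 2), (4, 2), (4, 4), (2, 4)], "counts": {"total": 100}},
    {"boundary": [(12, 2), (14, 2), (14, 4), (12, 4)], "counts": {"total": 150}},
]
estimate_1990 = balance_counts({"total": 1000}, places_1990, city_2010)
```

Only the second place is subtracted, since its centroid lies outside the boundary; the first stays in the total because its centroid falls inside.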