US Census Data API in Mathematica#

by Vishank Patel

U.S. Census API documentation: https://www.census.gov/data/developers/about.html

U.S. Census Data Discovery Tool: https://api.census.gov/data.html

These recipe examples were tested on March 16, 2022.

See also the U.S. Census API Terms of Service

Attribution: This tutorial uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.

API Key Information#

While an API key is not required to use the U.S. Census Data API, you may consider registering for an API key as the API is limited to 500 calls a day without a key. Sign up can be found here: https://api.census.gov/data/key_signup.html

If you do not use an API Key, create an empty string: myAPIKey = "". However, if using an API key, save it to a text file and import the key as follows:

myAPIKey = Import["ADD PATH HERE"];

1. Get population estimates of counties by State#

Note: Includes Washington, D.C. and Puerto Rico

Define root Census API and get state IDs:

api = "https://api.census.gov/data/";

(*Define api url for the state ids
We will use the Population Estimates from 2019 dataset
https://api.census.gov/data/2019/pep/population/examples.html*)

stateIDsURL = api <> "2019/pep/population?get=NAME&for=state:*&key=" <> myAPIKey;
rawStateIDs = Import[stateIDsURL, "JSON"];
rawStateIDs[[;; 10]]
{{NAME, state}, {Alabama, 01}, {Alaska, 02}, {Arizona, 04}, {Arkansas, 05}, 
 
>   {California, 06}, {Colorado, 08}, {Delaware, 10}, {District of Columbia, 11}, 
 
>   {Idaho, 16}}

Remove the headers and display the first 10 elements:

stateIDs = rawStateIDs[[2 ;;]];
stateIDs[[;; 10]]
{{Alabama, 01}, {Alaska, 02}, {Arizona, 04}, {Arkansas, 05}, {California, 06}, 
 
>   {Colorado, 08}, {Delaware, 10}, {District of Columbia, 11}, {Idaho, 16}, 
 
>   {Connecticut, 09}}

Now we can loop through each state and pull their individual population data:

countyData = <||>;
For[i = 1, i <= Length[stateIDs], i++,
 
 stateName = stateIDs[[i, 1]];
 stateID = stateIDs[[i, 2]];
 
 countyImport = Import[api <> "2019/pep/population?get=NAME,POP&for=county:*&in=state:" <> stateID <> "&key=" <> myAPIKey, "JSON"][[2 ;;]][[All, {1, 2}]]; 
 (* [[2;;]] removes the headers from the import, then [[All,{1,2}]] grabs the county name and population*)
 AppendTo[countyData, stateName -> countyImport];
 Pause[1]
 ]
countyData // Short
Output

To show the counties from Alabama:

countyData["Alabama"] // Shallow
Output

2. Get population estimates over a range of years#

We can use similar code as before, but now loop through different population estimate datasets by year. Here are the specific APIs used:

Vintage 2015 Population Estimates: https://api.census.gov/data/2015/pep/population/examples.html

Vintage 2016 Population Estimates: https://api.census.gov/data/2016/pep/population/examples.html

Vintage 2017 Population Estimates: https://api.census.gov/data/2017/pep/population/examples.html

yearList = {"2015", "2016", "2017"};   

(*Make sure the years in the list are strings as we will concatenate them to the URL below*)

countyDatabyYear = <||>;

For[i = 1, i <= Length[stateIDs], i++,
 For[j = 1, j <= Length[yearList], j++,
  
  stateName = stateIDs[[i, 1]];
  stateID = stateIDs[[i, 2]];
  year = yearList[[j]];
  
  countyYearedDataURL = api <> year <> "/pep/population?get=GEONAME,POP&for=county:*&in=state:" <> stateID <> "&key=" <> myAPIKey;
  countyYearedImport = Import[countyYearedDataURL, "JSON"][[2 ;;]][[All, {1, 2}]];
  AppendTo[countyDatabyYear, stateName -> <|year -> countyYearedImport|>];
  Pause[1]
  ]
 ]
countyDatabyYear // Short
Output

3. Plot Population Change#

This data is based off the 2021 Population Estimates dataset: https://api.census.gov/data/2021/pep/population/variables.html

The percentage change in population is from July 1, 2020 to July 1, 2021 for states (includes Washington, D.C. and Puerto Rico)

popChangeURL = api <> "2021/pep/population?get=NAME,POP_2021,PPOPCHG_2021&for=state:*&key=" <> myAPIKey;
rawPopChangeData = Import[popChangeURL, "JSON"];
rawPopChangeData[[;; 10]]
{{NAME, POP_2021, PPOPCHG_2021, state}, {Oklahoma, 3986639, 0.6210955947, 40}, 
 
>   {Nebraska, 1963692, 0.1140479899, 31}, {Hawaii, 1441553, -0.7134046100, 15}, 
 
>   {South Dakota, 895376, 0.9330412953, 46}, {Tennessee, 6975218, 0.7962146316, 47}, 
 
>   {Nevada, 3143991, 0.9608001873, 32}, {New Mexico, 2115877, -0.0797613860, 35}, 
 
>   {Iowa, 3193079, 0.1383022195, 19}, {Kansas, 2934582, -0.0442116160, 20}}

Preparing the data for plotting,

popChangeData = {rawPopChangeData[[2 ;;]][[All, 1]], rawPopChangeData[[2 ;;]][[All, 3]]};
popChangeData // Shallow
Output
transposedPopChangeData = Sort[Transpose[popChangeData]];
transposedPopChangeData[[;; 10]]
{{Alabama, 0.2999918604}, {Alaska, 0.0316749062}, {Arizona, 1.3698828613}, 
 
>   {Arkansas, 0.4534511286}, {California, -0.6630474360}, {Colorado, 0.4799364073}, 
 
>   {Connecticut, 0.1482392938}, {Delaware, 1.1592057958}, 
 
>   {District of Columbia, -2.9043911470}, {Florida, 0.9791222337}}
ListPlot[
 ToExpression[transposedPopChangeData[[All, 2]]] -> transposedPopChangeData[[All, 1]],
 ImageSize -> {900, 700},
 Ticks -> {False, True}]
Output