World Bank API in Python

by Avery Fernandez

These recipe examples were tested on February 13, 2022.

1. Get list of country iso2Codes and names

First, import libraries needed to pull data from the API:

from time import sleep
import requests
from pprint import pprint

Before requesting indicator data from the World Bank API, it helps to obtain a list of country codes and names.

# base URL of the World Bank API (version 2)
api = 'https://api.worldbank.org/v2/'

# URL for the country list; per_page=500 returns all entries in one page
country_url = api + 'country/?format=json&per_page=500'

# the JSON response is a two-element list: [paging metadata, records];
# index [1] keeps only the records
country_data = requests.get(country_url).json()[1]
pprint(country_data[0]) # shows first bit of data
print(len(country_data)) # shows the size of data

{'adminregion': {'id': '', 'iso2code': '', 'value': ''},
 'capitalCity': 'Oranjestad',
 'id': 'ABW',
 'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'},
 'iso2Code': 'AW',
 'latitude': '12.5167',
 'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'},
 'longitude': '-70.0167',
 'name': 'Aruba',
 'region': {'id': 'LCN',
            'iso2code': 'ZJ',
            'value': 'Latin America & Caribbean '}}
299
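
The `[1]` index above skips the first element of the response, which holds paging information. The following is a minimal sketch of unpacking both parts and checking that a single page held every record. The helper names `split_response` and `is_complete` and the sample payload are illustrative assumptions, and the metadata keys (`page`, `pages`, `per_page`, `total`) should be checked against a live response.

```python
def split_response(payload):
    """Split a parsed World Bank API response into (metadata, records)."""
    if not isinstance(payload, list) or len(payload) < 2:
        # Assumption: a response without a records element signals an error
        raise ValueError(f"unexpected response: {payload!r}")
    metadata, records = payload[0], payload[1]
    return metadata, records or []


def is_complete(metadata):
    """Return True when the whole result set fit on a single page."""
    # int() is used because numeric fields may arrive as strings
    return int(metadata["pages"]) <= 1


# Hypothetical payload shaped like the country response (records shortened)
sample = [
    {"page": 1, "pages": 1, "per_page": "500", "total": 2},
    [{"iso2Code": "AW", "name": "Aruba"},
     {"iso2Code": "AF", "name": "Afghanistan"}],
]
metadata, records = split_response(sample)
print(len(records), is_complete(metadata))
```

With real data, `split_response(requests.get(country_url).json())` would return the same `country_data` list as above, plus the metadata needed to decide whether further pages must be requested.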

# Extract the iso2Code of every entry in the country data
country_iso2Code = [entry["iso2Code"] for entry in country_data]
pprint(country_iso2Code[0:10]) # shows first 10
print(len(country_iso2Code)) # shows the size of data

['AW', 'ZH', 'AF', 'A9', 'ZI', 'AO', 'AL', 'AD', '1A', 'AE']
299

# Extract the name of every entry in the country data
country_name = [entry["name"] for entry in country_data]
pprint(country_name[0:10]) # shows first 10
print(len(country_name)) # shows the size of data

['Aruba',
 'Africa Eastern and Southern',
 'Afghanistan',
 'Africa',
 'Africa Western and Central',
 'Angola',
 'Albania',
 'Andorra',
 'Arab World',
 'United Arab Emirates']
299

# Combine the codes and names into one dictionary: {iso2Code: name}
country_iso2code_name = dict(zip(country_iso2Code, country_name))
print(len(country_iso2code_name)) # shows the size of data

We now have the country iso2Codes, which we can use to pull indicator data for specific countries.
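
The list returned by the country endpoint mixes individual countries with regional groupings such as "Arab World" and "Africa Eastern and Southern", as the name output above shows. The sketch below keeps only individual countries. It assumes that groupings carry the region value "Aggregates", which should be verified against the live data; the function name `countries_only` and the sample entries are hypothetical.

```python
def countries_only(entries):
    """Return {iso2Code: name} for entries that are not regional aggregates."""
    return {
        entry["iso2Code"]: entry["name"]
        for entry in entries
        # Assumption: aggregates are labelled with the region value "Aggregates"
        if entry["region"]["value"].strip() != "Aggregates"
    }


# Hypothetical entries shaped like the records in country_data
sample = [
    {"iso2Code": "AW", "name": "Aruba",
     "region": {"id": "LCN", "iso2code": "ZJ",
                "value": "Latin America & Caribbean "}},
    {"iso2Code": "1A", "name": "Arab World",
     "region": {"id": "NA", "iso2code": "NA", "value": "Aggregates"}},
]
print(countries_only(sample))  # {'AW': 'Aruba'}
```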

2. Compile a Custom Indicator Dataset

There are many available indicators: https://data.worldbank.org/indicator

We will select three indicators for this example:

Scientific and Technical Journal Article Data = IP.JRN.ARTC.SC
Patent Applications, residents = IP.PAT.RESD
GDP per capita (current US$) = NY.GDP.PCAP.CD

Note that these three selected indicators have a CC-BY 4.0 license.

We will compile this indicator data for the United States (US) and the United Kingdom (GB).

indicators = ['IP.JRN.ARTC.SC','IP.PAT.RESD','NY.GDP.PCAP.CD']

Generate the web API URLs we need for the U.S. and retrieve the data.

US_indicator_data = {}
for number, indicator in enumerate(indicators):
    # build the URL for this indicator and store the parsed JSON response
    US_api_URL = api + 'country/US/indicator/' + indicator + '/?format=json&per_page=500'
    US_indicator_data[number] = requests.get(US_api_URL).json()
    sleep(1) # pause between requests

Generate the web API URLs we need for the UK (GB) and retrieve the data.

UK_indicator_data = {}
for number, indicator in enumerate(indicators):
    # build the URL for this indicator and store the parsed JSON response
    UK_api_URL = api + 'country/GB/indicator/' + indicator + '/?format=json&per_page=500'
    UK_indicator_data[number] = requests.get(UK_api_URL).json()
    sleep(1) # pause between requests
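
The two loops above differ only in the country code, so the URL construction can be factored into one function. This is a minimal sketch using only the standard library; the names `API_ROOT` and `indicator_url` are illustrative and not part of the original recipe.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.worldbank.org/v2/"


def indicator_url(country, indicator, per_page=500):
    """Build the indicator URL for one country (iso2Code) and one indicator code."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{API_ROOT}country/{country}/indicator/{indicator}/?{query}"


indicators = ["IP.JRN.ARTC.SC", "IP.PAT.RESD", "NY.GDP.PCAP.CD"]
for country in ("US", "GB"):
    for indicator in indicators:
        print(indicator_url(country, indicator))
```

Each printed URL can then be passed to `requests.get(...).json()`, with a `sleep(1)` between calls, exactly as in the loops above.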

Now we need to extract the data and compile it for analysis. Each row holds one year with the following columns:

column 1: year

column 2: Scientific and Technical Journal Article Data = IP.JRN.ARTC.SC

column 3: Patent Applications, residents = IP.PAT.RESD

column 4: GDP per capita (current US$) = NY.GDP.PCAP.CD

NOTE: float(x or 'nan') is used to convert empty cells (missing values, returned as None) to NaN, so that every cell holds a float.
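
The idiom works because `None` is falsy, so `x or 'nan'` falls back to the string `'nan'`, which `float` parses as NaN. One caveat is that a genuine value of 0 is falsy as well and would also become NaN. The short sketch below shows both behaviors, along with a hypothetical helper, `to_float`, that treats only `None` as missing.

```python
def to_float(value):
    """Convert an API value to float, mapping only None to NaN."""
    return float("nan") if value is None else float(value)


print(float(None or "nan"))  # nan: None is falsy, so 'nan' is parsed
print(float(0 or "nan"))     # nan: 0 is also falsy, so a real zero is lost
print(to_float(0))           # 0.0: the explicit None check keeps the zero
```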

# US Data compilation
US_data = {}
for years in range(len(US_indicator_data[0][1])):
    US_data[int(US_indicator_data[0][1][years]["date"])] = [float(US_indicator_data[0][1][years]["value"] or 'nan'), 
                                                            float(US_indicator_data[1][1][years]["value"] or 'nan'), 
                                                            float(US_indicator_data[2][1][years]["value"] or 'nan')]
pprint(US_data)

{1960: [nan, nan, 3007.12344537862],
 1961: [nan, nan, 3066.56286916615],
 1962: [nan, nan, 3243.84307754988],
 1963: [nan, nan, 3374.51517105082],
 1964: [nan, nan, 3573.94118474743],
 1965: [nan, nan, 3827.52710972039],
 1966: [nan, nan, 4146.31664631665],
 1967: [nan, nan, 4336.42658722171],
 1968: [nan, nan, 4695.92339043178],
 1969: [nan, nan, 5032.14474262003],
 1970: [nan, nan, 5234.2966662115],
 1971: [nan, nan, 5609.38259952519],
 1972: [nan, nan, 6094.01798986165],
 1973: [nan, nan, 6726.35895596695],
 1974: [nan, nan, 7225.69135952566],
 1975: [nan, nan, 7801.45666356443],
 1976: [nan, nan, 8592.25353727612],
 1977: [nan, nan, 9452.57651914511],
 1978: [nan, nan, 10564.9482220275],
 1979: [nan, nan, 11674.1818666548],
 1980: [nan, 62098.0, 12574.7915062163],
 1981: [nan, 62404.0, 13976.10539252],
 1982: [nan, 63316.0, 14433.787727053],
 1983: [nan, 59391.0, 15543.8937174925],
 1984: [nan, 61841.0, 17121.2254849995],
 1985: [nan, 63673.0, 18236.8277265009],
 1986: [nan, 65195.0, 19071.2271949295],
 1987: [nan, 68315.0, 20038.9410992658],
 1988: [nan, 75192.0, 21417.0119305191],
 1989: [nan, 82370.0, 22857.1544330056],
 1990: [nan, 90643.0, 23888.6000088133],
 1991: [nan, 87955.0, 24342.2589048189],
 1992: [nan, 92425.0, 25418.9907763319],
 1993: [nan, 99955.0, 26387.2937338171],
 1994: [nan, 107233.0, 27694.853416234],
 1995: [nan, 123962.0, 28690.8757013347],
 1996: [nan, 106892.0, 29967.7127181749],
 1997: [nan, 119214.0, 31459.1389804773],
 1998: [nan, 134733.0, 32853.6769523009],
 1999: [nan, 149251.0, 34513.5615037271],
 2000: [304781.56, 164795.0, 36334.9087770589],
 2001: [305612.91, 177513.0, 37133.2428088526],
 2002: [319307.62, 184245.0, 38023.1611144021],
 2003: [329398.86, 188941.0, 39496.4858751381],
 2004: [353853.49, 189536.0, 41712.8010675545],
 2005: [384572.94, 207867.0, 44114.7477810544],
 2006: [385515.0, 221784.0, 46298.7314440927],
 2007: [391909.59, 241347.0, 47975.9676958038],
 2008: [393978.95, 231588.0, 48382.5584490552],
 2009: [399350.31, 224912.0, 47099.9804711343],
 2010: [408817.1, 241977.0, 48466.6576026922],
 2011: [423958.81, 247750.0, 49882.5581321495],
 2012: [427996.8, 268782.0, 51602.9310457907],
 2013: [429570.05, 287831.0, 53106.5367672165],
 2014: [433192.28, 285096.0, 55049.9883272312],
 2015: [429988.89, 288335.0, 56863.3714957652],
 2016: [427264.63, 295327.0, 58021.4004997125],
 2017: [432216.49, 293904.0, 60109.6557260477],
 2018: [422807.71, 285095.0, 63064.4184096731],
 2019: [nan, 285113.0, 65279.5290260953],
 2020: [nan, nan, 63413.5138584508]}
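
The compilation loop above pairs values by list position, which relies on all three indicators returning the same years in the same order. As a hedged alternative, the sketch below joins the records on their "date" field instead, so mismatched coverage cannot shift values into the wrong year. The function name `compile_by_year` and the sample records are hypothetical.

```python
def compile_by_year(record_lists):
    """Merge several indicator record lists into {year: [value_1, value_2, ...]}."""
    table = {}
    for column, records in enumerate(record_lists):
        for record in records or []:
            year = int(record["date"])
            # Start each new year as a row of NaN, one cell per indicator
            row = table.setdefault(year, [float("nan")] * len(record_lists))
            if record["value"] is not None:
                row[column] = float(record["value"])
    return dict(sorted(table.items()))


# Hypothetical records with mismatched year coverage
articles = [{"date": "2001", "value": 5.0}, {"date": "2000", "value": None}]
gdp = [{"date": "2001", "value": 37133.2},
       {"date": "2000", "value": 36334.9},
       {"date": "1999", "value": 34513.6}]
print(compile_by_year([articles, gdp]))
```

With the data retrieved earlier, a call such as `compile_by_year([US_indicator_data[i][1] for i in range(len(indicators))])` should produce a dictionary with the same layout as `US_data`.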

UK Data extraction