# Research Organization Registry API in Python

by Michael T. Moen

**ROR API Documentation:** https://ror.readme.io/docs/rest-api

**ROR API License:** https://ror.readme.io/docs/ror-basics#what-is-ror

The ROR API is licensed under the Creative Commons [CC0 license](https://creativecommons.org/publicdomain/zero/1.0/), which places its data in the public domain.

The Research Organization Registry (ROR) API provides persistent identifiers for research organizations.

*These recipe examples were tested on January 19, 2023.*

**_NOTE:_** The ROR API limits requests to a maximum of 2000 requests in a 5-minute period.
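
As a rough guide, the limit above can be turned into a minimum average gap between requests. The sketch below is only an illustration; `min_delay_seconds` is a hypothetical helper, not part of the ROR API or any library.

```python
def min_delay_seconds(max_requests: int, window_seconds: float) -> float:
    """Return the smallest average gap between requests that stays within the limit."""
    return window_seconds / max_requests

# 2000 requests per 5 minutes (300 seconds), as stated in the note above
delay = min_delay_seconds(2000, 5 * 60)
print(delay)  # 0.15
```

Any pause longer than this value between requests, such as the half-second pauses used later in this tutorial, stays within the stated limit.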

## Setup

### ROR Data Dump

When working with larger datasets, consider using the ROR data dump: https://ror.readme.io/docs/data-dump

### Import Libraries

This tutorial uses the following libraries:

In [1]:
import requests                     # Manages API requests
from urllib.parse import quote      # URL encodes string inputs
from time import sleep              # Allows staggering of API requests to conform to rate limits
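
To illustrate what `quote()` does before it is used in the requests below, here is a minimal sketch. It only shows how spaces and other special characters are percent-encoded; the second string is a made-up example.

```python
from urllib.parse import quote

# Spaces are replaced with %20 so the text is safe to place in a URL
print(quote('University of Alabama'))  # University%20of%20Alabama

# Characters such as & would otherwise be read as URL syntax
print(quote('Research & Development'))  # Research%20%26%20Development
```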

## 1. Searching with queries

This first example uses the `query` parameter of the ROR API to search for an institution by name. In this example, we'll search for The University of Alabama:

In [2]:
# The search query is the institution name
institution = 'University of Alabama'

# Use the quote() function to URL encode our search term
url = f'https://api.ror.org/organizations?query={quote(institution)}'
response = requests.get(url).json()

# Print total number of results and number of results in page
print(f'Total number of results: {response["number_of_results"]}')
print(f'Page length: {len(response["items"])}')

Total number of results: 25434
Page length: 20


The results indicate that the query matched thousands of records, but only the first page of 20 institutions was returned. However, the top result was exactly what we were looking for:

In [3]:
# Display data of the top search result
response['items'][0]

{'id': 'https://ror.org/03xrrjk67',
 'name': 'University of Alabama',
 'email_address': None,
 'ip_addresses': [],
 'established': 1831,
 'types': ['Education'],
 'relationships': [{'label': 'Mississippi Alabama Sea Grant Consortium',
   'type': 'Related',
   'id': 'https://ror.org/04vzsq290'},
  {'label': 'University of Alabama System',
   'type': 'Parent',
   'id': 'https://ror.org/051fvmk98'}],
 'addresses': [{'lat': 33.20984,
   'lng': -87.56917,
   'state': None,
   'state_code': None,
   'city': 'Tuscaloosa',
   'geonames_city': {'id': 4094455,
    'city': 'Tuscaloosa',
    'geonames_admin1': {'name': 'Alabama',
     'id': 4829764,
     'ascii_name': 'Alabama',
     'code': 'US.AL'},
    'geonames_admin2': {'name': 'Tuscaloosa',
     'id': 4094463,
     'ascii_name': 'Tuscaloosa',
     'code': 'US.AL.125'},
    'license': {'attribution': 'Data from geonames.org under a CC-BY 3.0 license',
     'license': 'http://creativecommons.org/licenses/by/3.0/'},
    'nuts_level1': {'name': 

The following code produces the name, ROR ID, city, and Wikipedia URL of the top result of the query:

In [4]:
response['items'][0]['name']

'University of Alabama'

In [5]:
response['items'][0]['id']

'https://ror.org/03xrrjk67'

In [6]:
response['items'][0]['addresses'][0]['city']

'Tuscaloosa'

In [7]:
response['items'][0]['wikipedia_url']

'http://en.wikipedia.org/wiki/University_of_Alabama'
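
The four lookups above can also be gathered into one small helper. This is a minimal sketch, assuming records shaped like the output shown earlier; `summarize_org` is a hypothetical name, and `.get()` is used so that a missing field returns `None` instead of raising an error.

```python
def summarize_org(org: dict) -> dict:
    """Pull a few commonly used fields out of one ROR organization record."""
    # Fall back to a single empty address if the record has none
    addresses = org.get('addresses') or [{}]
    return {
        'name': org.get('name'),
        'id': org.get('id'),
        'city': addresses[0].get('city'),
        'wikipedia_url': org.get('wikipedia_url'),
    }

# A trimmed-down record using values shown in the output above
sample = {
    'id': 'https://ror.org/03xrrjk67',
    'name': 'University of Alabama',
    'addresses': [{'city': 'Tuscaloosa'}],
    'wikipedia_url': 'http://en.wikipedia.org/wiki/University_of_Alabama',
}
summary = summarize_org(sample)
print(summary)
```

With a live response, the same helper could be called as `summarize_org(response['items'][0])`.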

### Searching by alternate names

The example below uses abbreviated forms of the full names of universities when searching:

In [8]:
# List of institutions to be searched
institutions = [
    'University of Alabama Tuscaloosa',
    'Missouri',
    'Dartmouth',
    'Oxford',
    'UCLA'
]

# Send an HTTP request for each institution
for institution in institutions:

    # Use the quote() function to URL encode our search term
    url = f'https://api.ror.org/organizations?query={quote(institution)}'
    search_data = requests.get(url).json()

    # Print the search term and the name of its top result
    print(f'{institution}: {search_data["items"][0]["name"]}')

    # Stagger requests to be nicer on the ROR servers
    sleep(0.5)

University of Alabama Tuscaloosa: University of Alabama
Missouri: Missouri Southern State University
Dartmouth: Dartmouth Psychiatric Research Center
Oxford: Stockholm Environment Institute
UCLA: Universidad Centroccidental Lisandro Alvarado


The top results of the queries above are probably not what you would have expected. The example below remedies these issues by using more specific search strings:

In [9]:
# List of institutions to be searched
institutions = [
    'University of Alabama Tuscaloosa',
    'University of Missouri',
    'Dartmouth College',
    'University of Oxford',
    'University of California Los Angeles'
]

# Send an HTTP request for each institution
for institution in institutions:

    # Use the quote() function to URL encode our search term
    url = f'https://api.ror.org/organizations?query={quote(institution)}'
    search_data = requests.get(url).json()

    # Print the search term and the name of its top result
    print(f'{institution}: {search_data["items"][0]["name"]}')

    # Stagger requests to be nicer on the ROR servers
    sleep(0.5)

University of Alabama Tuscaloosa: University of Alabama
University of Missouri: University of Missouri
Dartmouth College: Dartmouth College
University of Oxford: University of Oxford
University of California Los Angeles: University of California, Los Angeles
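
Since the top result is not always the intended institution, one possible safeguard is to prefer a result whose name matches the search term exactly and fall back to the top result otherwise. The sketch below is one way to do this, not a feature of the ROR API; `pick_result` is a hypothetical helper, and the items and IDs are placeholders made up for illustration.

```python
def pick_result(search_term, items):
    """Prefer an item whose name matches the search term exactly (ignoring case)."""
    wanted = search_term.casefold()
    for item in items:
        if (item.get('name') or '').casefold() == wanted:
            return item
    # No exact match: fall back to the top result, or None if there are no results
    return items[0] if items else None

# Made-up items for illustration; the IDs are placeholders, not real ROR IDs
items = [
    {'name': 'Dartmouth Psychiatric Research Center', 'id': 'placeholder-1'},
    {'name': 'Dartmouth College', 'id': 'placeholder-2'},
]
print(pick_result('Dartmouth College', items)['name'])  # Dartmouth College
print(pick_result('Dartmouth', items)['name'])          # Dartmouth Psychiatric Research Center
```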


## 2. Searching with filters

The ROR API also allows searches to be performed with the `filter` parameter, which can take 3 arguments: `status`, `types`, and `country`. For more information on what values these arguments can take, read the ROR documentation here: https://ror.org/tutorials/intro-ror-api/#filtering-results

In [10]:
# Filters are separated by commas
filters = ','.join([
    f'country.country_name:{quote("United States")}',
    'types:Education',
    'status:Active'
])

# URL constructed with the filters
url = f'https://api.ror.org/organizations?filter={filters}'
response = requests.get(url).json()

# Display number of results
response['number_of_results']

4296
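
The filter string used above can also be assembled from a dictionary. This is a minimal sketch; `build_filter` is a hypothetical helper, and it assumes that only the filter values need URL encoding.

```python
from urllib.parse import quote

def build_filter(filters: dict) -> str:
    """Join filter names and URL-encoded values into a comma-separated string."""
    return ','.join(f'{name}:{quote(value)}' for name, value in filters.items())

filter_string = build_filter({
    'country.country_name': 'United States',
    'types': 'Education',
    'status': 'Active',
})
print(filter_string)
# country.country_name:United%20States,types:Education,status:Active
```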

### Paging through a result

The example below pages through the results to find the names and ROR IDs of the first 100 institutions returned using the filter:

In [11]:
# Filters are separated by commas
filters = ','.join([
    f'country.country_name:{quote("United States")}',
    'types:Education',
    'status:Active'
])

# URL constructed with the filters
url = f'https://api.ror.org/organizations?filter={filters}'
response = requests.get(url).json()

# Calculate number of pages in result (ceiling division avoids an extra page on exact multiples)
page_size = len(response['items'])
total_pages = (response['number_of_results'] + page_size - 1) // page_size

# Store resulting names in a dictionary
institution_rors = {}

# Limited to first 5 pages for this tutorial
for page_number in range(min(total_pages, 5)):

    # Request the next page of results (pages are numbered from 1)
    url = f'https://api.ror.org/organizations?filter={filters}&page={page_number+1}'
    search_data = requests.get(url).json()

    # Add institution names and ROR IDs to the institution_rors dictionary
    for result in search_data['items']:
        institution_rors[result['name']] = result['id']

    # Stagger requests to be nicer on the ROR servers
    sleep(0.5)

# Display the collected results sorted by name
for name, ror_id in sorted(institution_rors.items()):
    print(f'{name}: {ror_id}')

American Society for Microbiology: https://ror.org/04xsjmh40
Austin College: https://ror.org/052k56z27
Austin Community College: https://ror.org/044tz3m61
Bacone College: https://ror.org/03rvph505
Bakersfield College: https://ror.org/00tz9e151
Baltimore City Community College: https://ror.org/03b286288
Bank Street College of Education: https://ror.org/04v44vh53
Bay Mills Community College: https://ror.org/005m1kj18
Bellingham Technical College: https://ror.org/03hjr5c64
Bishop State Community College: https://ror.org/036vmnd51
Bismarck State College: https://ror.org/042fn4q48
Bittersweet Farms: https://ror.org/02j7zcz08
Blackfeet Community College: https://ror.org/02kgm7r43
Bloomfield College: https://ror.org/04gdk0x87
Bluffton University: https://ror.org/05pacqm37
Boston Public Schools: https://ror.org/03bj7kr91
Bristol Community College: https://ror.org/05e168j76
Burlington School District: https://ror.org/00zbcp540
Calallen Independent School District: https://ror.org/00p5kvd83
Cali

The resulting dictionary can be used to find the ROR ID of an institution based on its name:

In [12]:
institution_rors['Xavier University']

'https://ror.org/00f266q65'
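
The number of pages needed for a result set is the total number of results divided by the page size, rounded up. The sketch below shows that calculation with integer arithmetic; `count_pages` is a hypothetical helper, and the example values are the result count and page length seen earlier in this tutorial.

```python
def count_pages(number_of_results: int, page_size: int) -> int:
    """Return how many pages are needed to hold all results (ceiling division)."""
    if page_size <= 0:
        return 0
    return (number_of_results + page_size - 1) // page_size

print(count_pages(4296, 20))  # 215
print(count_pages(40, 20))    # 2 (an exact multiple needs no extra page)
print(count_pages(12, 20))    # 1
```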

## 3. Searching with queries and filters

The `filter` and `query` parameters can both be used in a single request. In this example, we filter the results of the query "Birmingham" to only include institutions from the United States:

In [13]:
# Filter results to the United States
country_filter = f'country.country_name:{quote("United States")}'

# Search term
query = 'Birmingham'

# URL constructed with the query and the filter
url = f'https://api.ror.org/organizations?query={quote(query)}&filter={country_filter}'
response = requests.get(url).json()

# Display number of results
response['number_of_results']

12

The same request can then be paged through to list the name and ROR ID of each matching institution:

In [14]:
# Filter results to the United States
country_filter = f'country.country_name:{quote("United States")}'

# Search term
query = 'Birmingham'

# URL constructed with the query and the filter
url = f'https://api.ror.org/organizations?query={quote(query)}&filter={country_filter}'
response = requests.get(url).json()

# Calculate number of pages in result (ceiling division avoids an extra page on exact multiples)
page_size = len(response['items'])
total_pages = (response['number_of_results'] + page_size - 1) // page_size

# Store resulting names in a dictionary
institution_rors = {}

# Page through all of the results
for page_number in range(total_pages):

    # Request the next page of results (pages are numbered from 1)
    url = f'https://api.ror.org/organizations?query={quote(query)}&filter={country_filter}&page={page_number+1}'
    search_data = requests.get(url).json()

    # Add institution names and ROR IDs to the institution_rors dictionary
    for result in search_data['items']:
        institution_rors[result['name']] = result['id']

    # Stagger requests to be nicer on the ROR servers
    sleep(0.5)

# Display the results sorted by name
for name, ror_id in sorted(institution_rors.items()):
    print(f'{name}: {ror_id}')

Alabama Audubon: https://ror.org/02qbyex13
Birmingham Bloomfield Community Coalition: https://ror.org/004mx7t23
Birmingham Civil Rights Institute: https://ror.org/00fqce595
Birmingham Museum of Art: https://ror.org/030y6zg68
Birmingham Public Library: https://ror.org/05czff141
Birmingham VA Medical Center: https://ror.org/0242qs713
Birmingham–Southern College: https://ror.org/006g42111
St. Vincent's Birmingham: https://ror.org/000crk757
UAB Medicine: https://ror.org/036554539
University of Alabama at Birmingham: https://ror.org/008s83205
University of Alabama at Birmingham Hospital: https://ror.org/01rm42p40
Vision Specialists of Michigan: https://ror.org/02awhp844
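
The paging pattern used in this tutorial can be wrapped in a reusable generator. The sketch below is one possible arrangement rather than an official client: `iter_items` and `fake_fetch` are hypothetical names, and the page-fetching function is passed in so the example runs without network access. The fake pages only mimic the `items` and `number_of_results` keys seen in the responses above.

```python
def iter_items(fetch_page, max_pages=None):
    """Yield items page by page until all results are seen or max_pages is reached."""
    page, seen = 1, 0
    while max_pages is None or page <= max_pages:
        data = fetch_page(page)
        items = data.get('items', [])
        if not items:
            break
        yield from items
        seen += len(items)
        total = data.get('number_of_results')
        # Stop once every reported result has been yielded
        if total is not None and seen >= total:
            break
        page += 1

# Made-up pages standing in for API responses (names are placeholders)
fake_pages = {
    1: [{'name': 'Org A'}, {'name': 'Org B'}],
    2: [{'name': 'Org C'}, {'name': 'Org D'}],
    3: [{'name': 'Org E'}],
}
calls = []

def fake_fetch(page_number):
    calls.append(page_number)
    return {'number_of_results': 5, 'items': fake_pages.get(page_number, [])}

names = [item['name'] for item in iter_items(fake_fetch)]
print(names)  # ['Org A', 'Org B', 'Org C', 'Org D', 'Org E']
```

With the live API, `fetch_page` could be a small function that builds the URL with `&page=...`, calls `requests.get(url).json()`, and pauses briefly with `sleep()` between requests.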
