Scopus API in Python#
by Vincent F. Scalfani
These recipe examples use the Elsevier Scopus API and pybliometrics, a Python wrapper package for the Scopus API. The code was tested, and the sample data downloaded from the Scopus API, on February 16, 2022 via http://api.elsevier.com and http://www.scopus.com. This tutorial content is intended to help facilitate academic research. Before continuing or reusing any of this code, please be aware of Elsevier’s API policies and appropriate use cases. You will also need to register for an API key in order to use the Scopus API.
1. Initial Pybliometrics Setup#
The first time you run import pybliometrics, it will prompt you for your Elsevier Scopus API Key, which is then saved to a local config file. See the documentation:
https://pybliometrics.readthedocs.io/en/stable/configuration.html
import pybliometrics
# import other libraries needed
from pybliometrics.scopus import ScopusSearch
import time
import numpy as np
import pandas as pd
4. Get References via a Title Search#
Number of Title Match Records#
# Search Scopus for all records containing 'ChemSpider' in the record title;
# download=False retrieves only the number of matches, not the records themselves
q2 = ScopusSearch('TITLE(ChemSpider)', download=False)
q2.get_results_size()
7
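The query string passed to ScopusSearch is just the search term wrapped in the TITLE() field code. As a minimal sketch, a small helper can build it; the name title_query is hypothetical and not part of pybliometrics.

```python
def title_query(word: str) -> str:
    """Wrap a single search term in the Scopus TITLE() field code.

    Hypothetical helper for illustration; not part of pybliometrics.
    """
    return f"TITLE({word})"


queries = [title_query(w) for w in ["ChemSpider", "PubChem"]]
print(queries)  # ['TITLE(ChemSpider)', 'TITLE(PubChem)']
```

The resulting strings can be passed to ScopusSearch in the same way as the literal 'TITLE(ChemSpider)' above.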
# repeat this in a loop
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']
# get number of Scopus records for each title search
num_records_title = []
for titleWord in titleWord_list:
    # query search
    qt = ScopusSearch('TITLE' + '(' + titleWord + ')', download=False)
    numt = qt.get_results_size()
    # compile saved scopus data into a list of lists
    num_records_title.append([titleWord, numt])
    # delay one second between api calls to be nice to Elsevier servers
    time.sleep(1)
num_records_title
[['ChemSpider', 7],
 ['PubChem', 79],
 ['ChEMBL', 53],
 ['Reaxys', 8],
 ['SciFinder', 30]]
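The counting loop above can also be wrapped in a reusable function. The sketch below is one possible arrangement, not the tutorial's own method: count_title_matches and its count_fn and delay parameters are hypothetical names, and the stand-in counter returns the two values reported above so that the example runs without an API key or network access.

```python
import time
from typing import Callable, Iterable, List


def count_title_matches(
    words: Iterable[str],
    count_fn: Callable[[str], int],
    delay: float = 1.0,
) -> List[list]:
    """Return [word, count] pairs, pausing `delay` seconds after each call.

    `count_fn` receives the query string and returns the number of matches.
    With pybliometrics it could be, for example:
        lambda q: ScopusSearch(q, download=False).get_results_size()
    """
    counts = []
    for word in words:
        counts.append([word, count_fn(f"TITLE({word})")])
        time.sleep(delay)  # be nice to the API servers
    return counts


# Stand-in for the real API call, using counts shown in the output above.
stub_counts = {"TITLE(ChemSpider)": 7, "TITLE(PubChem)": 79}

demo = count_title_matches(
    ["ChemSpider", "PubChem"],
    lambda q: stub_counts[q],
    delay=0,  # no pause needed for the offline stand-in
)
print(demo)  # [['ChemSpider', 7], ['PubChem', 79]]
```

Passing the counting step in as a function keeps the delay logic in one place and lets the loop be checked offline before it is run against the live API.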
Download Title Match Record Data#
# download records and create a list of selected metadata
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']
scopus_title_data = []
for titleWord in titleWord_list:
    # query search
    qt = ScopusSearch('TITLE' + '(' + titleWord + ')')
    # create the dataframe
    qt_df = pd.DataFrame(qt.results)
    # save DOIs to a list
    doi = qt_df.doi.tolist()
    # save title to a list
    title = qt_df.title.tolist()
    # save coverDate to a list
    coverDate = qt_df.coverDate.tolist()
    # compile saved scopus_title_data into a list of lists
    scopus_title_data.append([titleWord, doi, title, coverDate])
    # delay one second between api calls to be nice to Elsevier servers
    time.sleep(1)
# create a flat list of scopus_title_data
scopus_title_data_flat = []
for titleWord in range(len(scopus_title_data)):
    for doi in range(len(scopus_title_data[titleWord][1])):
        scopus_title_data_flat.append([scopus_title_data[titleWord][0],        # titleWord
                                       scopus_title_data[titleWord][1][doi],   # doi
                                       scopus_title_data[titleWord][2][doi],   # title
                                       scopus_title_data[titleWord][3][doi]])  # coverDate
# add to dataFrame
scopus_title_data_df = pd.DataFrame(scopus_title_data_flat)
scopus_title_data_df.rename(columns={0: "titleWord", 1: "doi", 2: "title", 3: "coverDate"},
                            inplace=True)
scopus_title_data_df
| | titleWord | doi | title | coverDate |
---|---|---|---|---|
0 | ChemSpider | 10.1039/c5np90022k | Editorial: ChemSpider-a tool for Natural Produ... | 2015-08-01 |
1 | ChemSpider | 10.1021/bk-2013-1128.ch020 | ChemSpider: How a free community resource of d... | 2013-01-01 |
2 | ChemSpider | 10.1007/s13361-011-0265-y | Identification of "known unknowns" utilizing a... | 2012-01-01 |
3 | ChemSpider | 10.1002/9781118026038.ch22 | Chemspider: A Platform for Crowdsourced Collab... | 2011-05-03 |
4 | ChemSpider | 10.1021/ed100697w | Chemspider: An online chemical information res... | 2010-11-01 |
... | ... | ... | ... | ... |
172 | SciFinder | 10.1021/ci0003808 | Strategies for chemical reaction searching in ... | 2000-01-01 |
173 | SciFinder | 10.1002/nadc.19990471212 | SciFinder scholar - Ein erster erfahrungsbericht | 1999-01-01 |
174 | SciFinder | 10.1021/cen-v074n025.p043 | Chemical abstracts service launches release 2.... | 1996-01-01 |
175 | SciFinder | None | Scientists online at their desktops SciFinder | 1996-01-01 |
176 | SciFinder | None | SciFinder from CAS: Information at the desktop... | 1995-07-01 |
177 rows × 4 columns
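The index-based flattening step used to build this table can also be written with zip. The following is a sketch of an alternative, not a change to the code above: the names nested, flat, and counts are illustrative. The sample rows are copied from the output shown above, including one title exactly as truncated in the display. The last lines check that the per-term counts reported earlier add up to the 177 rows in the table.

```python
# Nested structure as built in the download loop:
# [titleWord, [dois], [titles], [coverDates]]
nested = [
    ["ChemSpider",
     ["10.1021/ed100697w"],
     ["Chemspider: An online chemical information res..."],  # truncated as displayed
     ["2010-11-01"]],
    ["SciFinder",
     ["10.1002/nadc.19990471212", None],  # a record may have no DOI
     ["SciFinder scholar - Ein erster erfahrungsbericht",
      "Scientists online at their desktops SciFinder"],
     ["1999-01-01", "1996-01-01"]],
]

# One flat row per record: zip walks the three parallel lists together.
flat = [
    [word, doi, title, date]
    for word, dois, titles, dates in nested
    for doi, title, date in zip(dois, titles, dates)
]

for row in flat:
    print(row)

# Cross-check: the title-match counts reported earlier sum to the row count.
counts = {"ChemSpider": 7, "PubChem": 79, "ChEMBL": 53, "Reaxys": 8, "SciFinder": 30}
total = sum(counts.values())
print(total)  # 177
```

Note that zip stops at the shortest list, so this form assumes the doi, title, and coverDate lists for each search term have equal length, which holds when they come from the same DataFrame.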