Scopus API in MATLAB#
by Anastasia Ramig
These recipe examples use the Elsevier Scopus API. The code was tested with MATLAB R2021b, and the sample data was downloaded from the Scopus API on April 26, 2022 via http://api.elsevier.com and http://www.scopus.com. This tutorial content is intended to help facilitate academic research. Before continuing or reusing any of this code, please be aware of Elsevier’s API policies and appropriate use cases. You will also need to register for an API key in order to use the Scopus API.
Setup#
We will start by setting up the API key. Save your key in a text file in the current MATLAB folder and import it as follows:
%% import API key from file
myAPIKey = importdata("apiKey.txt");
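Depending on the file contents, the imported key may carry a trailing newline or arrive wrapped in a cell array. As a minimal sketch, assuming the file holds only the key on a single line, the key can also be read as plain text and trimmed. The file name demoKey.txt and the key value below are placeholders invented for this illustration.

```matlab
%% sketch: read an API key from a text file and strip stray whitespace
% 'demoKey.txt' and its contents are placeholders for illustration only
keyFile = fullfile(tempdir, 'demoKey.txt');
fid = fopen(keyFile, 'w');
fprintf(fid, 'PLACEHOLDER-KEY\n');
fclose(fid);

% fileread returns the file contents as a char vector;
% strtrim removes the trailing newline and any surrounding spaces
demoKey = string(strtrim(fileread(keyFile)));
delete(keyFile);
```

Storing the key as a string scalar keeps the later URL concatenation with + straightforward.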
3. Get References via a Title Search#
Number of Title Match Records#
Search Scopus for all references containing “ChemSpider” in the record title.
%% set up the API information
api_url = "https://api.elsevier.com/content/search/scopus?query=";
title_query = "TITLE(ChemSpider)&apiKey=";
%% find the information for ChemSpider and get the total number of results
q3 = webread(api_url + title_query + myAPIKey);
q3.search_results.opensearch_totalResults
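The field names used above differ from the keys in the raw JSON because MATLAB replaces characters that are not valid in identifiers. The sketch below illustrates this with jsondecode on a hand-written sample that mimics only the shape of a response. The value "5" is made up, and treating the count as text is an assumption about the response format.

```matlab
%% sketch: how JSON keys become MATLAB field names
% hand-written sample, not real API output
sampleJson = '{"search-results": {"opensearch:totalResults": "5"}}';
sample = jsondecode(sampleJson);

% "search-results" and "opensearch:totalResults" are not valid MATLAB
% identifiers, so "-" and ":" are replaced with underscores
totalText = sample.search_results.opensearch_totalResults;

% assuming the count arrives as text, convert it before doing arithmetic
totalCount = str2double(totalText);
```

Converting with str2double is only needed if the count is later compared or summed. For display, the text value is enough.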
Repeat this in a loop to get the number of Scopus records for each title search.
%% create a list of titles
titleList = ["ChemSpider", "PubChem", "ChEMBL", "Reaxys", "SciFinder"];
length(titleList)
%% pre-allocate a cell array with one cell per title
clear numRecordsTitle
numRecordsTitle = cell(1, length(titleList));
%% obtain the number of records for each title in the list and create an array
for i = 1:length(titleList)
    qt = webread(api_url + "TITLE(" + titleList(i) + ")&apiKey=" + myAPIKey);
    numt = qt.search_results.opensearch_totalResults;
    numRecordsTitle{1, i}{1, 1} = titleList(i);
    numRecordsTitle{1, i}{1, 2} = numt;
    pause(1)
end
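The loop above leaves the results in a nested cell array, which is awkward to read at a glance. One possible way to view them is to flatten the pairs into a table, sketched below. The names demoTitles, demoCounts, and demoRecords are hypothetical, and the counts are made-up placeholders, not real API results.

```matlab
%% sketch: turn a nested cell array of {title, count} pairs into a table
demoTitles = ["ChemSpider", "PubChem", "ChEMBL"];
demoCounts = {'1', '2', '3'};   % made-up values, stored as text
demoRecords = cell(1, numel(demoTitles));
for i = 1:numel(demoTitles)
    demoRecords{1, i} = {demoTitles(i), demoCounts{i}};
end

% pull the title and the count out of each inner cell
names = string(cellfun(@(c) char(c{1, 1}), demoRecords, 'UniformOutput', false));
counts = cellfun(@(c) str2double(c{1, 2}), demoRecords);

% build a two-column table, one row per search term
countsTable = table(names(:), counts(:), 'VariableNames', {'Title', 'NumRecords'});
disp(countsTable)
```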
Download Title Match Record Data#
Download records and create a list of selected metadata.
%% create a list of titles and preallocate an array
titleList = ["ChemSpider", "PubChem", "ChEMBL", "Reaxys", "SciFinder"];
infoTitles = cell(3, 1);
%% find the DOIs, titles, and dates for each title in the list and put them into an array
for t = 1:length(titleList)
    qt = webread(api_url + "TITLE(" + titleList(t) + ")&apiKey=" + myAPIKey);
    n = length(qt.search_results.entry);
    doiTitles = cell(1, length(titleList));
    titles = cell(1, length(titleList));
    dates = cell(1, length(titleList));
    for k = 1:n
        try
            doiTitles{1, t}{k, 1} = qt.search_results.entry{k, 1}.prism_doi;
            titles{1, t}{k, 1} = qt.search_results.entry{k, 1}.dc_title;
            dates{1, t}{k, 1} = qt.search_results.entry{k, 1}.prism_coverDate;
        catch
            % a field is missing from this entry; skip its remaining fields
        end
    end
    pause(1)
    infoTitles{1, 1}{1, t} = doiTitles{1, t};
    infoTitles{2, 1}{1, t} = titles{1, t};
    infoTitles{3, 1}{1, t} = dates{1, t};
end
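The try/catch in the loop above silently skips whatever follows a missing field. A hedged alternative, sketched here on two hand-written entries rather than real API output, checks each field with isfield and records an empty value for gaps so that every entry keeps a row. The names demoEntries, wantedFields, and demoInfo are hypothetical.

```matlab
%% sketch: read optional fields without try/catch
% two hand-written entries; the second one lacks a DOI on purpose
demoEntries = {
    struct('prism_doi', '10.0000/placeholder', 'dc_title', 'First placeholder title', 'prism_coverDate', '2000-01-01')
    struct('dc_title', 'Second placeholder title', 'prism_coverDate', '2000-01-02')
    };
wantedFields = {'prism_doi', 'dc_title', 'prism_coverDate'};

% one row per entry, one column per field
demoInfo = cell(numel(demoEntries), numel(wantedFields));
for k = 1:numel(demoEntries)
    for f = 1:numel(wantedFields)
        if isfield(demoEntries{k}, wantedFields{f})
            demoInfo{k, f} = demoEntries{k}.(wantedFields{f});
        else
            demoInfo{k, f} = '';   % keep the row and mark the gap
        end
    end
end
```

Because every row has the same number of columns, the DOI, title, and date columns stay aligned even when a record is incomplete.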
%% create an overall array of the information found above
titleDois = {};
titlesFinal = {};
datesFinal = {};
for t = 1:width(infoTitles{1, 1})
    titleDois = vertcat(titleDois, infoTitles{1, 1}{1, t});
    titlesFinal = vertcat(titlesFinal, infoTitles{2, 1}{1, t});
    datesFinal = vertcat(datesFinal, infoTitles{3, 1}{1, t});
end
titleArray = horzcat(titleDois, titlesFinal);
titleArray = horzcat(titleArray, datesFinal);
%% create an array of names and add it to the overall array
titlesNameArray = {};
for t = 1:length(titleList)
    nameLength = length(infoTitles{1, 1}{1, t});
    titlesAuthorName = cellstr(repmat(titleList(t), nameLength, 1));
    titlesNameArray = vertcat(titlesNameArray, titlesAuthorName);
end
titleArray = horzcat(titleArray, titlesNameArray)
Output:
titleArray = 88x4 cell
1 2 3 4
1 '10.1039/c5np90022k' 'Editorial: ChemSpider-a tool for Natural Products research' '2015-08-01' 'ChemSpider'
2 '10.1021/bk-2013-1128.ch020' 'ChemSpider: How a free community resource of data can support the teaching of nmr spectroscopy' '2013-01-01' 'ChemSpider'
3 '10.1007/s13361-011-0265-y' 'Identification of "known unknowns" utilizing accurate mass data and chemspider' '2012-01-01' 'ChemSpider'
4 '10.1002/9781118026038.ch22' 'Chemspider: A Platform for Crowdsourced Collaboration to Curate Data Derived From Public Compound Databases' '2011-05-03' 'ChemSpider'
5 '10.1021/ed100697w' 'Chemspider: An online chemical information resource' '2010-11-01' 'ChemSpider'
6 '10.1016/j.bioorg.2022.105648' 'Structure-based discovery of a specific SHP2 inhibitor with enhanced blood–brain barrier penetration from PubChem database' '2022-04-01' 'PubChem'
7 '10.1016/j.jmb.2022.167514' 'PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data' '2022-01-01' 'PubChem'
8 '10.1007/s40011-021-01335-x' 'Identification a Novel Inhibitor for Aldo–Keto Reductase 1 C3 by Virtual Screening of PubChem Database' '2022-01-01' 'PubChem'
9 '10.1007/978-1-0716-2067-0_27' 'Plant Reactome and PubChem: The Plant Pathway and (Bio)Chemical Entity Knowledgebases' '2022-01-01' 'PubChem'
10 '10.1016/j.molstruc.2021.130968' '3CL<sup>pro</sup> and PL<sup>pro</sup> affinity, a docking study to fight COVID19 based on 900 compounds from PubChem and literature. Are there new drugs to be found?' '2021-12-05' 'PubChem'
11 '10.1093/glycob/cwab078' 'Enhancing the interoperability of glycan data flow between ChEBI, PubChem and GlyGen' '2021-11-01' 'PubChem'
...
...
...
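A cell array like the one in the output above has no column labels. As a rough sketch, it could be converted to a table with named columns, sorted by date, and saved to disk. Two rows copied from the output stand in for the full titleArray. The column names and the file name titleRecords.csv are assumptions made for this example.

```matlab
%% sketch: label the columns, sort by date, and save the results
% two rows copied from the output above stand in for the full titleArray
demoArray = {
    '10.1039/c5np90022k', 'Editorial: ChemSpider-a tool for Natural Products research', '2015-08-01', 'ChemSpider'
    '10.1021/ed100697w', 'Chemspider: An online chemical information resource', '2010-11-01', 'ChemSpider'
    };
titleTable = cell2table(demoArray, 'VariableNames', {'DOI', 'Title', 'CoverDate', 'SearchTerm'});

% convert the date text so rows can be sorted chronologically
titleTable.CoverDate = datetime(titleTable.CoverDate, 'InputFormat', 'yyyy-MM-dd');
titleTable = sortrows(titleTable, 'CoverDate');

% 'titleRecords.csv' is an arbitrary file name chosen for this example
outFile = fullfile(tempdir, 'titleRecords.csv');
writetable(titleTable, outFile);
```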