Springer Nature API in C

by Cyrus Gomes

These recipe examples use the Springer Nature Open Access API to retrieve metadata and full-text content. About 650,000 full-text articles are available from BioMed Central and SpringerOpen journals: https://dev.springernature.com/docs

An API key is required to access the Springer Nature API; you can sign up at https://dev.springernature.com/

Code was tested in January 2023. This tutorial content is intended to help facilitate academic research. Before continuing or reusing any of this code, please be aware of the Springer Nature Text and Data Mining Policies, Terms and Conditions, and Terms for API Users.

Setup

First, install the curl and jq packages by typing the following command in the terminal:

!sudo apt install curl jq

Then, we create a Springer directory for our project:

!mkdir Springer

Finally, we change into the directory we created:

%cd Springer

Import API Key

We store our API key in a separate file for easy access and security. (Paste your API key into this file.)

# Create the key file
!touch "apiKey.txt"

Create a variable for API Key

Save your API key to a separate text file (copy and paste / write the key), then create a variable for your key. Avoid displaying your API key in your terminal (to prevent accidental sharing).

We use the following command to read the key from the file. Note that Jupyter runs each ! command in its own subshell, so the variable is not shared between cells; the %%bash cells below therefore re-read the key from the file.

# Read the key from the file
!apiKey=$(cat "apiKey.txt")
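
If you would rather never expose the key to the shell at all, an alternative is to read apiKey.txt directly in C. The following is a minimal sketch of that approach; the read_key helper is hypothetical and not part of the program built below:

#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the API key from apiKey.txt into buf.
   Returns 0 on success, -1 on failure. */
static int read_key(char *buf, size_t len) {
    FILE *fp = fopen("apiKey.txt", "r");
    if (!fp)
        return -1;
    if (!fgets(buf, (int)len, fp)) {
        fclose(fp);
        return -1;
    }
    buf[strcspn(buf, "\r\n")] = '\0';   /* strip the trailing newline */
    fclose(fp);
    return 0;
}

int main(void) {
    char key[256];
    if (read_key(key, sizeof key) != 0) {
        fprintf(stderr, "could not read apiKey.txt\n");
        return 1;
    }
    /* Print only the length, to avoid echoing the key itself */
    printf("loaded a %zu-character key\n", strlen(key));
    return 0;
}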

Create an executable for API calls

We use the %%file magic command to create the following makefile, which will compile our program and create an executable.

%%file makefile

# Set the variable CC to gcc, which is used to build the program
CC=gcc

# Enable debugging information and enable all compiler warnings
CFLAGS=-g -Wall

# Set the bin variable as the name of the binary file we are creating
BIN=api_call

# Default target: build the binary named above
all: $(BIN)

# Map any file ending in .c to a binary executable. 
# "$<" represents the .c file and "$@" represents the target binary executable
%: %.c

	# Compile the .c file using gcc with CFLAGS and link the
	# resulting binary with the curl library
	$(CC) $(CFLAGS) $< -o $@ -lcurl

# Clean target which removes specific files
clean:

	# Remove the binary file and any ".dSYM" (debug symbols) directories;
	# -r removes directories and -f forces deletion
	$(RM) -rf $(BIN) *.dSYM
Writing makefile

The %%file command is used again to create our .c file, which contains the code for the program:

%%file api_call.c

#include <curl/curl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* CURL program that retrieves Springer Nature Open Access data from
  https://api.springernature.com/openaccess/jats */

int main (int argc, char* argv[]){
    
    // If arguments are invalid then return
    if (argc < 2) {                                                                                      
        printf("Error. Please try again correctly. (./api_call -doi [doi] -key [key])\n");
        return -1;
    }
    
    // Initialize the CURL HTTP connection
    CURL *curl = curl_easy_init();

    // Bits of the url that are joined together later                                                                      
    char api[] = "https://api.springernature.com/openaccess/jats?q=doi:";                            
    char url[1000];
    char label1[] = "&openaccess=true&api_key=";
    char doi[] = "10.1007/s40708-014-0001-z";

    // Check if CURL initialization is a success
    if (!curl) {                                                                                         
        fprintf(stderr, "init failed\n");
        return EXIT_FAILURE;
    }
    
    /* Different ways of calling the program on the
    command line, combining the DOI and key fields. */
    
    // Has the -key flag and the key field: ./api_call -key [key]
    if ((argc==3) && (strcmp(argv[1],"-key")==0)) {

        // Combine the API, default DOI, and key to produce a functioning URL
        sprintf(url, "%s%s%s%s", api, doi, label1, argv[2]);
    
    }
    
    // Has the -doi and -key flags and the key field: ./api_call -doi -key [key]
    else if ((argc==4) && (strcmp(argv[2],"-key")==0) && (strcmp(argv[1],"-doi")==0)) {
        
        // Combine the API, default DOI, and key to produce a functioning URL
        sprintf(url, "%s%s%s%s", api, doi, label1, argv[3]);                                              
    
    }
    
    // Has the -key and -doi flags and the key and doi field: ./api_call -key [key] -doi [doi] 
    else if ((argc==5) && (strcmp(argv[1],"-key")==0) && (strcmp(argv[3],"-doi")==0)) {
        
        // Combine the API, custom DOI, and key to produce the url
        sprintf(url, "%s%s%s%s", api, argv[4], label1,  argv[2]);                                            
    
    }
    
    // Has the -doi and -key flags and the doi and key field: ./api_call -doi [doi] -key [key] 
    else if ((argc==5) && (strcmp(argv[3],"-key")==0)) {
        
        // Combine the API, custom DOI, and key to produce the URL
        sprintf(url, "%s%s%s%s", api, argv[2], label1, argv[4]);                                              
    
    }
    
    // If the arguments are invalid then return
    else {        
        printf("Please use ./api_call -doi [doi] -key [key]\n");                                                                                      
        curl_easy_cleanup(curl);
        return 0;
    }                                            

    // Set the URL to which the HTTP request will be sent
    // The first parameter is the curl handle, the second the option to set, and the third the option's value
    curl_easy_setopt(curl, CURLOPT_URL, url);

    // Perform the HTTP request and store the outcome
    CURLcode result = curl_easy_perform(curl);

    // If result is not retrieved then output error
    if (result != CURLE_OK) {                                                                            
        fprintf(stderr, "download problem: %s\n", curl_easy_strerror(result));
    }

    // Deallocate memory for the CURL connection
    curl_easy_cleanup(curl);                                                                            
    return EXIT_SUCCESS;
}
Overwriting api_call.c
!make
gcc -g -Wall api_call.c -o api_call -lcurl
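
By default libcurl writes the response body to stdout, which is why the %%bash cells below can capture it with $(...). If you would rather save the body straight to a file from C, here is a minimal sketch; the output filename, the example URL, and YOUR_API_KEY are placeholders:

#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    FILE *fp = fopen("response.jats", "w");
    if (!fp) {
        curl_easy_cleanup(curl);
        return 1;
    }

    curl_easy_setopt(curl, CURLOPT_URL,
        "https://api.springernature.com/openaccess/jats"
        "?q=doi:10.1007/s40708-014-0001-z&openaccess=true&api_key=YOUR_API_KEY");

    /* With no custom write callback set, libcurl fwrite()s the
       response body to the FILE* passed as CURLOPT_WRITEDATA */
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);

    CURLcode result = curl_easy_perform(curl);
    if (result != CURLE_OK)
        fprintf(stderr, "download problem: %s\n", curl_easy_strerror(result));

    fclose(fp);
    curl_easy_cleanup(curl);
    return 0;
}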

1. Retrieve full-text JATS XML of an article

Before we can query, we must establish a few things:

  • base_url: The base URL for the Springer Open Access API, https://api.springernature.com/openaccess/jats, which returns full text in JATS format (documented at https://jats.nlm.nih.gov/archiving/tag-library/1.1/index.html)

  • ?q=doi:: The query parameter; in this case we are searching for a DOI

  • doi: The DOI of the article

  • &openaccess=true: Restricts results to open access content

  • &api_key=: The parameter that carries your API key

You can read more about the API parameters at https://dev.springernature.com/restfuloperations
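
Putting those pieces together, the sketch below prints the complete request URL for the example DOI; YOUR_API_KEY is a placeholder, and snprintf is used here as a bounds-checked alternative to the sprintf calls in api_call.c:

#include <stdio.h>

int main(void) {
    char url[1000];
    /* YOUR_API_KEY is a placeholder; substitute the key stored in apiKey.txt */
    snprintf(url, sizeof url,
             "https://api.springernature.com/openaccess/jats"
             "?q=doi:%s&openaccess=true&api_key=%s",
             "10.1007/s40708-014-0001-z", "YOUR_API_KEY");
    printf("%s\n", url);
    return 0;
}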

%%bash

# Download a JATS file by calling the API
# Uses example DOI from SpringerOpen Brain Informatics

# Store the key in the key variable
key=$(cat apiKey.txt)

# Call the program using a doi and key and assign it to a variable
fulltext1=$(./api_call -key "$key" -doi "10.1007/s40708-014-0001-z")

# Save output to fulltext.jats
echo "$fulltext1" > fulltext.jats

2. Retrieve full-text in a loop

In this example, we retrieve article full text for each DOI in a loop and save each article to a separate file.

%%bash

# Example DOIs from SpringerOpen Brain Informatics
dois=("10.1007/s40708-014-0001-z"
    "10.1007/s40708-014-0002-y"
    "10.1007/s40708-014-0003-x"
    "10.1007/s40708-014-0004-9"
    "10.1007/s40708-014-0005-8")

# Store the key in the key variable
key=$(cat apiKey.txt)

# Loop over each DOI and retrieve the full text
for doi in "${dois[@]}"; do
    
    # Can't save files with a '/' character on Linux
    filename=$(echo "$doi" | tr '/' '_')
    
    # Concatenate "_plain_text.txt" to the filename
    filename="${filename}_plain_text.txt"
    
    # -key [key] can also be used to input the key to program
    # ./api_call -doi "$doi" -key "$key"
    
    # Call the program using a doi and assign it to a variable
    article=$(./api_call -doi "$doi" -key "$key")
    
    # Save the output to the file (the .txt extension is already in $filename)
    echo "$article" > "$filename"

done
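
The DOIs above happen to be URL-safe, but DOIs may in general contain characters that need percent-encoding in a query string. A sketch of how the DOI could be encoded with libcurl's curl_easy_escape before being spliced into the URL:

#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    /* A length of 0 tells libcurl to call strlen() on the input */
    char *escaped = curl_easy_escape(curl, "10.1007/s40708-014-0001-z", 0);
    if (escaped) {
        printf("encoded DOI: %s\n", escaped);  /* 10.1007%2Fs40708-014-0001-z */
        curl_free(escaped);                    /* must be freed with curl_free */
    }

    curl_easy_cleanup(curl);
    return 0;
}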

3. Acquire and Parse Metadata

We can also acquire only the metadata as JSON text.

We change the API URL in the program to the /openaccess/json endpoint and retrieve the JSON data.

%%file api_call.c

#include <curl/curl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* CURL program that retrieves Springer Nature Open Access data from
  https://api.springernature.com/openaccess/json */

int main (int argc, char* argv[]) {
    
    // If arguments are invalid then return
    if (argc < 2) {                                                                                      
        printf("Error. Please try again correctly. (./api_call -doi [doi] -key [key])\n");
        return -1;
    }
    
    // Initialize the CURL HTTP connection
    CURL *curl = curl_easy_init();

    // Bits of the url that are joined together later                                                                      
    char api[] = "https://api.springernature.com/openaccess/json?q=doi:";                            
    char url[1000];
    char label1[] = "&openaccess=true&api_key=";
    char doi[] = "10.1007/s40708-014-0001-z";

    // Check if CURL initialization is a success
    if (!curl) {                                                                                         
        fprintf(stderr, "init failed\n");
        return EXIT_FAILURE;
    }
    
    /* Different ways of calling the program on the
    command line, combining the DOI and key fields. */
    
    // Has the -key flag and the key field: ./api_call -key [key]
    if ((argc==3) && (strcmp(argv[1],"-key")==0)) {
        
        // Combine the API, default DOI, and key to produce a functioning URL
        sprintf(url, "%s%s%s%s", api, doi, label1, argv[2]);            
    
    }
    
    // Has the -doi and -key flags and the key field: ./api_call -doi -key [key]
    else if ((argc==4) && (strcmp(argv[2],"-key")==0) && (strcmp(argv[1],"-doi")==0)) {
        
        // Combine the API, default DOI, and key to produce a functioning URL
        sprintf(url, "%s%s%s%s", api, doi, label1, argv[3]);                                              
    
    }
    
    // Has the -key and -doi flags and the key and doi field: ./api_call -key [key] -doi [doi] 
    else if ((argc==5) && (strcmp(argv[1],"-key")==0) && (strcmp(argv[3],"-doi")==0)) {
        
        // Combine the API, custom DOI, and key to produce the URL
        sprintf(url, "%s%s%s%s", api, argv[4], label1,  argv[2]);                                            
    
    }
    
    // Has the -doi and -key flags and the doi and key field: ./api_call -doi [doi] -key [key] 
    else if ((argc==5) && (strcmp(argv[3],"-key")==0)) {
        
        // Combine the API, custom DOI, and key to produce the URL
        sprintf(url, "%s%s%s%s", api, argv[2], label1, argv[4]);                                              
    
    }
    
    // If the arguments are invalid then return
    else {        
        printf("Please use ./api_call -doi [doi] -key [key]\n");                                                                                      
        curl_easy_cleanup(curl);
        return 0;
    }                                            

    // Set the URL to which the HTTP request will be sent
    // The first parameter is the curl handle, the second the option to set, and the third the option's value
    curl_easy_setopt(curl, CURLOPT_URL, url);

    // Perform the HTTP request and store the outcome
    CURLcode result = curl_easy_perform(curl);

    // If result is not retrieved then output error
    if (result != CURLE_OK) {                                                                            
        fprintf(stderr, "download problem: %s\n", curl_easy_strerror(result));
    }

    // Deallocate memory for the CURL connection
    curl_easy_cleanup(curl);                                                                            
    return EXIT_SUCCESS;
}
Overwriting api_call.c
!make
gcc -g -Wall api_call.c -o api_call -lcurl
%%bash

# Download a JSON file by calling the API
# Uses example DOI from SpringerOpen Brain Informatics

# Store the key in the key variable
key=$(cat apiKey.txt)

# Call the program using a DOI and key and assign it to a variable
fulltext2=$(./api_call -key "$key" -doi "10.1007/s40708-014-0001-z")

# Save the output to fulltext.json
echo "$fulltext2" > fulltext.json

We can now extract data from ["records"][0], where all of the article's data is stored:

%%bash

data=$(cat fulltext.json)

# Some examples
echo "$data" | jq '.["apiMessage"]'
echo "$data" | jq '.["query"]'
echo "$data" | jq '.["records"][0]["abstract"]'
echo "$data" | jq '.["records"][0]["doi"]'
echo "$data" | jq '.["records"][0]["onlineDate"]'
echo "$data" | jq '.["records"][0]["printDate"]'
echo "$data" | jq '.["records"][0]["publicationName"]'
echo "$data" | jq '.["records"][0]["title"]'
"This JSON was provided by Springer Nature"
"doi:10.1007/s40708-014-0001-z"
{
  "h1": "Abstract",
  "p": "Big data is the term for a collection of datasets so huge and complex that it becomes difficult to be processed using on-hand theoretical models and technique tools. Brain big data is one of the most typical, important big data collected using powerful equipments of functional magnetic resonance imaging, multichannel electroencephalography, magnetoencephalography, Positron emission tomography, near infrared spectroscopic imaging, as well as other various devices. Granular computing with multiple granular layers, referred to as multi-granular computing (MGrC) for short hereafter, is an emerging computing paradigm of information processing, which simulates the multi-granular intelligent thinking model of human brain. It concerns the processing of complex information entities called information granules, which arise in the process of data abstraction and derivation of information and even knowledge from data. This paper analyzes three basic mechanisms of MGrC, namely granularity optimization, granularity conversion, and multi-granularity joint computation, and discusses the potential of introducing MGrC into intelligent processing of brain big data."
}
"10.1007/s40708-014-0001-z"
"2014-09-06"
"2015-01-30"
"Brain Informatics"
"Granular computing with multiple granular layers for brain big data processing"