Services

API

CORE harvests, maintains, enriches and makes available metadata and full-text content (typically PDFs) from many Open Access journals and repositories. This makes it a useful access point for anyone who wants to develop applications that use this content. To support these activities, CORE provides a free API.

Need to cite CORE? Visit our research page.

Documentation

The documentation, along with live examples, can be found here.

Expected use

We expect the API can be used, for example, to:

  • Perform text mining to enrich the metadata of Open Access publications, or even to carry out different kinds of semantic analysis of publications.
  • Semantically annotate (by means of crowdsourcing, collaborative sharing or natural language processing) the publications to drive the emergence of nano-publications in certain research fields.
  • Link publications to research data.
  • Carry out impact and citation analysis in the Open Access domain.
  • Build many other services that need quick and easy access to the content of research publications.

API key registration

Please register here.

Quota

We apply a quota to the API to ensure fair access and fast response times for all users. Please get in touch if you need to access our API at a higher rate.
The quotas for each method are listed in the following tables:

Global methods

Method             Request type   Limit
/search            batch          1 request per 10 seconds
/search/{query}    single         5 requests per 10 seconds

Article methods

Method                                 Request type   Limit
/articles/get                          batch          1 request per 10 seconds
/articles/get/{coreId}                 single         10 requests per 10 seconds
/articles/get/{coreId}/download/pdf    single         10 requests per 10 seconds
/articles/get/{coreId}/history         single         10 requests per 10 seconds
/articles/search                       batch          1 request per 10 seconds
/articles/search/{query}               single         10 requests per 10 seconds
/articles/similar                      single         10 requests per 10 seconds

Journal methods

Method                      Request type   Limit
/journals/get               batch          1 request per 10 seconds
/journals/get/{issn}        single         10 requests per 10 seconds
/journals/search            batch          2 requests per 10 seconds
/journals/search/{query}    single         5 requests per 10 seconds

Repository methods

Method                                Request type   Limit
/repositories/get                     batch          1 request per 10 seconds
/repositories/get/{repositoryId}      single         10 requests per 10 seconds
/repositories/search                  batch          2 requests per 10 seconds
/repositories/search/{query}          single         5 requests per 10 seconds

If you require different limits, please contact us.
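
A client can stay within these quotas by throttling its own requests. The following is a minimal sketch of a sliding-window limiter, not part of any official CORE client: the class name SlidingWindowLimiter and its parameters are illustrative assumptions, and the limits passed in should be taken from the tables above. It makes no network calls itself.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` interval.

    Hypothetical helper for illustration; not provided by CORE.
    """

    def __init__(self, max_requests, window_seconds=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request (0 if none)."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def acquire(self):
        """Block until a request may be sent, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._sent.append(self._clock())


# One limiter per method, using limits from the quota tables.
search_limiter = SlidingWindowLimiter(max_requests=5)     # /search/{query}
article_limiter = SlidingWindowLimiter(max_requests=10)   # /articles/get/{coreId}
```

Calling search_limiter.acquire() before each /search/{query} request would then pause the client whenever a sixth request falls inside the same 10-second window.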

CORE data as Linked Open Data (LOD)

Apart from the CORE API, CORE also provides data as LOD for enthusiasts. The documentation is available at the datahub. Please note the data is not synced regularly. We encourage all developers to use the CORE API v2.


CORE Dataset

The data aggregated from repositories by the CORE system can be accessed in two ways: through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE, while the latter is recommended for those who would like to analyse the CORE dataset and possibly apply computationally intensive batch processes.

Available datasets:

  • Dump 2016-10
    Metadata file (9.0 GB) (23,900,220 items)
    Content file (102 GB) (4,062,159 items)
  • Dump 2015-09
    Metadata file (4.5 GB)
    Content file (30.5 GB)
  • Dump 2014-06-13 (used for dataset track of DL2014)
    Metadata file (3.7 GB)
    Content file (24 GB)

Structure of the dumps

The CORE dataset provides access to both the enriched metadata and the full texts. The data dump consists of two files, the metadata file and the content file. Both files are compressed using tar and gzip.

The structure of the metadata file is depicted in the diagram. An example of a metadata item in the dataset is as follows:

        
{
    "identifier": 13291,
    "ep:Repository": 1,
    "dc:type": [
        "Report"
    ],
    "bibo:shortTitle": "Evaluating stillbirths : improving stillbirth data could help make stillbirths a visible public health priority",
    "bibo:AuthorList": [
        "IMMPACT",
        "Population Reference Bureau"
    ],
    "dc:date": "2007-02",
    "bibo:cites": [
        {
            "rawReferenceText": "Cynthia Stanton. Stillbirth Rates: Delivering Estimates",
            "authors": [

            ],
            "bibo:shortTitle": "Stillbirth Rates: Delivering Estimates",
            "doi": "10.1016/S0140-6736(06)68586-3"
        }
    ],
    "bibo:citedBy": [

    ],
    "similarities": [
        {
            "identifier": 29886,
            "sim:weight": 0.333121,
            "sim:AssociationMethod": "similarity_cosine"
        },
        {
            "identifier": 33044,
            "sim:weight": 0.325861,
            "sim:AssociationMethod": "similarity_cosine"
        },
        ...,
        {
            "identifier": 43755,
            "sim:weight": 0.173635,
            "sim:AssociationMethod": "similarity_cosine"
        }
    ]
}
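
As an illustration of working with such an item, the sketch below reads two of the fields shown above: the "similarities" list and the DOIs in "bibo:cites". The helper names top_similar and cited_dois are hypothetical, and the sample is a trimmed copy of the example item with the elided entries left out.

```python
import json


def top_similar(item, n=3):
    """Return (identifier, weight) pairs for the n most similar items."""
    sims = item.get("similarities", [])
    ranked = sorted(sims, key=lambda s: s["sim:weight"], reverse=True)
    return [(s["identifier"], s["sim:weight"]) for s in ranked[:n]]


def cited_dois(item):
    """Return the DOIs of cited references that carry one."""
    return [ref["doi"] for ref in item.get("bibo:cites", []) if "doi" in ref]


# Trimmed copy of the example metadata item.
sample = json.loads("""
{
    "identifier": 13291,
    "dc:date": "2007-02",
    "bibo:cites": [
        {
            "rawReferenceText": "Cynthia Stanton. Stillbirth Rates: Delivering Estimates",
            "authors": [],
            "bibo:shortTitle": "Stillbirth Rates: Delivering Estimates",
            "doi": "10.1016/S0140-6736(06)68586-3"
        }
    ],
    "similarities": [
        {"identifier": 43755, "sim:weight": 0.173635, "sim:AssociationMethod": "similarity_cosine"},
        {"identifier": 29886, "sim:weight": 0.333121, "sim:AssociationMethod": "similarity_cosine"},
        {"identifier": 33044, "sim:weight": 0.325861, "sim:AssociationMethod": "similarity_cosine"}
    ]
}
""")

print(top_similar(sample, n=2))  # [(29886, 0.333121), (33044, 0.325861)]
print(cited_dois(sample))        # ['10.1016/S0140-6736(06)68586-3']
```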
        

The content file has the following structure:

{"identifier":612,"fullTextSource":"Here goes the fulltext ..."}
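
Because the content file is large, it may be more practical to stream it than to unpack it first. The sketch below assumes that the archive members hold one JSON object per line, which the description above does not confirm; the function name iter_records and the file name in the usage comment are likewise illustrative.

```python
import io
import json
import tarfile


def iter_records(fileobj):
    """Yield one dict per non-empty line of every regular file in a .tar.gz.

    Assumes one JSON object per line inside each archive member.
    """
    # "r|gz" reads the archive as a forward-only stream, so the whole
    # dump never has to be held in memory or extracted to disk.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            stream = archive.extractfile(member)
            for raw_line in stream:
                line = raw_line.strip()
                if line:
                    yield json.loads(line)


# Usage (hypothetical file name):
# with open("content.tar.gz", "rb") as fh:
#     for record in iter_records(fh):
#         print(record["identifier"], len(record["fullTextSource"]))
```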

Disclaimer

This dataset has been created from information that was publicly available on the Internet. Every effort has been made to ensure this dataset contains open access content only. We have included content only from repositories and journals that are listed in registries where the condition for inclusion is the provision of content under an open access compatible license. However, because metadata are often inconsistent, license information is often not machine readable and, from time to time, repositories leak information that is not open access, we cannot take any responsibility for the license of the content in the dataset. It is therefore up to the user of this dataset to ensure that the way in which they use the dataset does not breach copyright. The dataset is in no way intended for the purposes of reading the original publications, but for machine processing only.


Dashboard

To improve the quality and transparency of the process of harvesting open access content, and to create a two-way collaboration between the CORE project and the providers of this content, CORE is introducing the Repositories Dashboard.

Access the dashboard now.

To register, please send us an email.

The aim of the Dashboard is to provide an online interface that offers repository and content providers valuable information about:

  • The content harvested from the repository, enabling its management, for example by requesting metadata updates or managing take-down requests.
  • The times and frequency of content harvesting, including all detected technical issues and suggestions for improving the efficiency of harvesting and the quality of metadata, including compliance with existing metadata guidelines.
  • Statistics regarding the repository content, such as the distribution of content according to subject fields and types of research outputs, and the comparison of these with the national average.

Existing users can invite other users to use the CORE dashboard; see our simple guide.


CORE Recommender

The new version of the CORE recommender has now been released.

The recommender is a plugin that can be installed in repositories and journal systems to suggest similar articles. Its purpose is to support users in finding articles relevant to what they read.

The current version of the plugin recommends full-text items in Open Access repositories that are related to:

  • a metadata record
  • a full-text item in PDF
  • any piece of text
  • any combination of the above

The CORE Recommender is deployed in various locations, such as on the CORE Portal and in institutional repositories and journals. From these places, the recommender algorithm receives input information (identifier, title, authors, abstract, year, source URL, etc.) and enriches these attributes with additional available data, such as citation counts, number of downloads, and whether the full text is available in CORE. Together, these form the set of features used to find similar documents in the CORE corpus.
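
To give a feel for text-based similarity ranking in general, the sketch below scores documents by cosine similarity over simple word counts. This is a generic textbook technique chosen for illustration (the metadata example in the dataset section labels its weights "similarity_cosine"); it is not the CORE Recommender's actual algorithm, it ignores the additional features described above, and all function names are assumptions.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count lowercase alphanumeric tokens in a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query_text, corpus, n=3):
    """Return the n (doc_id, score) pairs most similar to query_text."""
    query = term_counts(query_text)
    scored = [(doc_id, cosine_similarity(query, term_counts(text)))
              for doc_id, text in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy corpus with made-up texts, for illustration only.
corpus = {
    1: "open access repositories",
    2: "cooking recipes",
    3: "open access research",
}
for doc_id, score in rank_similar("open access research papers", corpus, n=2):
    print(doc_id, round(score, 3))
```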

Uniqueness of the CORE Recommender:

  • Our methods rely on the availability of full-texts.
  • We don’t base our recommendations solely on abstracts or metadata.
  • We ensure that the recommended articles are available open access.
  • We provide our recommendation service for free.
  • We provide it using a machine accessible interface (API).

Find out more about the CORE Recommender here. If you would like to test the closed beta of the new recommender, let us know.