OCC dataset of all the bibliographic entries, as of April 26, 2017

By OpenCitations (3068259)


This archive contains the dump of the OpenCitations Corpus (OCC) dataset about bibliographic entries, which is created every month.

After unzipping the archive, Disk ARchive (DAR), a multi-platform archive tool for managing huge amounts of data, is needed to recreate the whole structure. To extract the DAR archive, run the command

dar -x [archive-name]

where "[archive-name]" is the name of the DAR file without the final package number and extension. E.g.:

dar -x 2016-09-23-corpus_re

For further questions, comments, and suggestions, please don't hesitate to contact Silvio Peroni.
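The extraction step described above can be sketched as a small shell helper. This is a minimal sketch, not part of the dataset's own tooling: the function name `dar_basename` is hypothetical, and it assumes the DAR files follow DAR's usual slice naming of `<basename>.<number>.dar`. The zip file name in the usage comments is likewise a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the dataset): derive the name to pass
# to "dar -x" by stripping the slice number and the ".dar" extension.
# Assumes slice files are named <basename>.<number>.dar.
dar_basename() {
  printf '%s\n' "$1" | sed 's/\.[0-9][0-9]*\.dar$//'
}

# Usage sketch (commented out because it needs unzip, dar, and the real files;
# "corpus.zip" is a placeholder name, not the actual download name):
#   unzip corpus.zip
#   dar -x "$(dar_basename 2016-09-23-corpus_re.1.dar)"
```

Passing the base name rather than a single slice file lets `dar` locate each numbered slice itself.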

Topics: Library and Information Studies, Semantic Publishing, OCC, OpenCitations
Year: 2017
DOI identifier: 10.6084/m9.figshare.4956305.v1
Provided by: FigShare