14 research outputs found

Distributions of number of and average number of accession numbers cited per article over time.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

The graphs show (a) the number and (b) the average number of ENA, PDB and UniProt accession numbers cited per article, by publication year, in the OA-ePMC set. Data from 2012 are excluded because the year is incomplete. In Figure 5(b) (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0063184#pone-0063184-g005), the average for a given year and database is calculated using only articles that contain accession number citations. Text-mined results are combined with the publisher-annotated data to generate the graphs.
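The averaging rule in this description (only articles that cite at least one accession number count toward the denominator) can be illustrated with a minimal sketch. The record layout and the function name below are assumptions made for illustration; they are not taken from the original study or its pipeline.

```python
from collections import defaultdict


def average_citations_per_citing_article(records):
    """Average accession numbers per article, per (year, database).

    `records` is a hypothetical iterable of (year, database, count) tuples,
    one tuple per article and database. Articles with a count of zero are
    skipped, so they affect neither the total nor the denominator.
    """
    totals = defaultdict(int)           # sum of accession numbers cited
    citing_articles = defaultdict(int)  # number of articles with >= 1 citation
    for year, database, count in records:
        if count > 0:
            key = (year, database)
            totals[key] += count
            citing_articles[key] += 1
    return {key: totals[key] / citing_articles[key] for key in totals}


# Illustrative values only, not data from the study.
example = [
    (2010, "ENA", 4),
    (2010, "ENA", 0),  # ignored: no ENA accession numbers cited
    (2010, "ENA", 2),
    (2011, "PDB", 3),
]
averages = average_citations_per_citing_article(example)
```

Under this rule the 2010 ENA average is taken over two articles rather than three, which is why it can sit well above an average computed over all articles.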

FigShare

Venn diagrams showing the spread of accession numbers supplied in the article XML and annotated by the Whatizit ANA pipeline.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

(a) ENA (total: 160,112) (b) PDB (total: 39,972) (c) UniProt (total: 9,430) (d) all (total: 209,519). The results show that text mining substantially increases the number of accession numbers identified.

FigShare

Extraction patterns and contextual cues for databases.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

Patterns are separated by the “;” sign.

FigShare

Database citations in articles relative to database size.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

* Total number of annotations = publisher-annotated + text-mined. ** This is the number of records in the curated component of UniProt.

FigShare

Comparison between article-to-database and database-to-citations.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

Venn diagrams show the overlap between article-to-database and database-to-article citations. (a) ENA (b) PDB (c) UniProt (d) all databases. Notably, for ENA and PDB, database citations from the literature significantly enrich the database-literature crosslinks supplied by the databases. For UniProt, citations from the database to the literature dwarf the converse citations, mainly because, for certain proteomes, many thousands of UniProt records can link to a single article. Text-mined results are combined with the publisher-annotated data to generate the Venn diagrams.
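The three regions of such a Venn diagram correspond to plain set operations over crosslinks. The following is a minimal sketch, assuming each crosslink is represented as an (article ID, accession number) pair; the function name, identifiers and values are hypothetical and do not come from the study.

```python
def crosslink_overlap(article_to_db, db_to_article):
    """Partition crosslinks into the three regions of a two-set Venn diagram.

    Both arguments are iterables of (article_id, accession) pairs. The first
    holds citations found in articles, the second holds literature references
    supplied by the database records.
    """
    from_articles = set(article_to_db)
    from_database = set(db_to_article)
    return {
        "article_only": from_articles - from_database,
        "both": from_articles & from_database,
        "database_only": from_database - from_articles,
    }


# Made-up identifiers, for illustration only.
cited_in_articles = [("ART1", "ACC_A"), ("ART2", "ACC_B")]
cited_by_database = [("ART2", "ACC_B"), ("ART3", "ACC_C"), ("ART3", "ACC_D")]
regions = crosslink_overlap(cited_in_articles, cited_by_database)
```

A large "database_only" region driven by many records pointing at one article would match the UniProt pattern described above, while a large "article_only" region would match the ENA and PDB pattern.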

FigShare

The reciprocal citation relationships between articles and database records.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

FigShare

Distribution of number of articles according to years in the OA-ePMC set.

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Şenay Kafkas (416311)

This figure shows the distribution of articles by publication year in the OA-ePMC set. Note that the apparent decrease in OA articles available in 2012 reflects an incomplete year (the dataset was frozen for this study in June 2012).

FigShare

Accession numbers in supplementary files

Author: Jee-Hyub Kim (416312)
Johanna R. McEntyre (416313)
Xingjun Pi (5267536)
Şenay Kafkas (416311)

This dataset contains database accession numbers that were automatically extracted from the supplementary files linked to open access full-text biomedical articles.

FigShare

Contributions and roles related to content as they correspond to identifier creation versus identifier reuse.

Author: Alan R. Williams (4182421)
Alejandra Gonzalez-Beltran (1386885)
Camille Laibe (222845)
Carole Goble (19490)
Chris Morris (4182439)
Christopher J. Mungall (256879)
Donal K. Fellows (4182427)
Helen Parkinson (681796)
Henning Hermjakob (14241)
Jacky L. Snoep (14073)
James Malone (2634373)
Janna Hastings (342774)
Jean-Karim Hériché (14170)
Jeffrey Grethe (824786)
Johanna R. McEntyre (416313)
John Deck (529666)
John Kunze (4182451)
Jon C. Ison (4182430)
Juha Muilu (4182454)
Julie A. McMurry (4182448)
Katherine Wolstencroft (1386969)
Lilly M. Winfree (4182436)
Maria Jesus Martin (4182445)
Melissa A. Haendel (256876)
Michel Dumontier (27895)
Murat Sariyar (4182457)
Mélanie Courtot (14066)
Natalie J. Stanford (485343)
Nathalie Conte (3175653)
Neil Swainston (46779)
Nick Juty (4182460)
Nicolas Le Novère (14063)
Nicole Washington (4182442)
Niklas Blomberg (1386942)
Philipp Gormanns (184381)
Philippe Rocca-Serra (18677)
Rafael C. Jimenez (106993)
Sarala M. Wimalaratne (4182433)
Simon Jupp (3212565)
Stian Soiland-Reyes (80636)
Susanna-Assunta Sansone (15155)
Tom Conlin (14042)
Tony Burdett (4182424)
Wolfgang Müller (420131)

The decision about whether to create a new identifier or reuse an existing one depends on the role you play in the creation, editing, and republishing of content; for certain roles (and when several roles apply) that decision is a judgement call. Asterisks mark cases in which the best course of action is often to correct or improve the original record in collaboration with the original source; the guidance about identifier creation versus reuse is meant to apply only when such collaboration is not practicable (and an alternate record is created). A given actor commonly has multiple roles along this spectrum; for instance, a given record in monarchinitiative.org may reflect a combination of (a) corrections Monarch staff made in collaboration with the original data source, (b) post-ingest curation by Monarch staff, and (c) expanded content integrated from multiple sources.

FigShare

Record-level versioning and release-level versioning.

Author: Alan R. Williams (4182421)
Alejandra Gonzalez-Beltran (1386885)
Camille Laibe (222845)
Carole Goble (19490)
Chris Morris (4182439)
Christopher J. Mungall (256879)
Donal K. Fellows (4182427)
Helen Parkinson (681796)
Henning Hermjakob (14241)
Jacky L. Snoep (14073)
James Malone (2634373)
Janna Hastings (342774)
Jean-Karim Hériché (14170)
Jeffrey Grethe (824786)
Johanna R. McEntyre (416313)
John Deck (529666)
John Kunze (4182451)
Jon C. Ison (4182430)
Juha Muilu (4182454)
Julie A. McMurry (4182448)
Katherine Wolstencroft (1386969)
Lilly M. Winfree (4182436)
Maria Jesus Martin (4182445)
Melissa A. Haendel (256876)
Michel Dumontier (27895)
Murat Sariyar (4182457)
Mélanie Courtot (14066)
Natalie J. Stanford (485343)
Nathalie Conte (3175653)
Neil Swainston (46779)
Nick Juty (4182460)
Nicolas Le Novère (14063)
Nicole Washington (4182442)
Niklas Blomberg (1386942)
Philipp Gormanns (184381)
Philippe Rocca-Serra (18677)
Rafael C. Jimenez (106993)
Sarala M. Wimalaratne (4182433)
Simon Jupp (3212565)
Stian Soiland-Reyes (80636)
Susanna-Assunta Sansone (15155)
Tom Conlin (14042)
Tony Burdett (4182424)
Wolfgang Müller (420131)

FigShare