27 research outputs found

    All associations as CSV

    No full text
    All ~204 million associations (explicit and implicit), listed as a CSV text file. Note that each concept in a pair is specified by its concept ID and its first label in our thesaurus.

    Pre-compiled software dependencies

    No full text
    Collection of JAR files containing the dependencies for generating concept profiles and match scores. Source code is also available in our GitHub repository.

    Overview of software packages

    No full text
    Software packages used for the various analyses and figures. Links are clickable and point to the repository for that code or data.

    Medline concept index

    No full text
    Concept index of Medline articles as specified in http://dx.doi.org/10.5061/dryad.gn219/

    All associations as nanopublications

    No full text
    The complete set of all ~204 million associations (explicit and implicit) as nanopublications. Each nanopublication asserts an association between a gene and a disease concept, together with the percentile rank of the match score.

    PMID list

    No full text
    List of PMIDs used to create concept profiles; gzipped file, one PMID per line.
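    Reading a file in the format described above (gzipped, one PMID per line) can be sketched as follows. This is a minimal illustration, not code from the deposit: the function name `read_pmids` and the file name in the usage note are assumptions, and the sketch assumes each non-blank line holds a single integer PMID.

```python
import gzip
from typing import Iterator


def read_pmids(path: str) -> Iterator[int]:
    """Yield PMIDs from a gzipped text file with one PMID per line."""
    # "rt" opens the gzip stream in text mode so lines arrive as str.
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                yield int(line)
```

    Under these assumptions, `set(read_pmids("pmids.txt.gz"))` (hypothetical file name) would collect the PMIDs for membership checks without decompressing the file to disk first.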

    New suite of Concept Profile Analysis Web Services

    No full text
    This zip file contains the source code for the Concept Profile Mining Web services. The directory 'erasmusmc_maven_dependencies' contains copies of the erasmus-mc JAR files that are hosted at the DTL Nexus repository.

    Genome Annotation using Nanopublications: An Approach to Interoperability of Genetic Data

    No full text
    With the widespread use of Next Generation Sequencing (NGS) technologies, the primary bottleneck of genetic research has shifted from data production to data analysis. However, annotated datasets produced by different research groups are often in different formats, making genetic comparisons and integration with other datasets challenging and time-consuming tasks. Here, we propose a new data interoperability approach that provides an unambiguous (machine-readable) description of genomic annotations based on a novel method of data publishing called nanopublication. A nanopublication is a schema built on top of existing semantic web technologies that consists of three components: an individual assertion (i.e., the genomic annotation); provenance (containing links to the experimental information and data processing steps); and publication info (information about data ownership and rights, allowing each genomic annotation to be citable and its scientific impact tracked). We use nanopublications to demonstrate automatic interoperability between individual genomic annotations from the FANTOM5 consortium (transcription start sites) and the Leiden Open Variation Database (genetic variants). The nanopublications can also be integrated with data from other semantic web frameworks such as COEUS. Exposing legacy information and new NGS data as nanopublications promises tremendous scaling advantages when integrating very large and heterogeneous genetic datasets.
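    The three-part structure described in this abstract (assertion, provenance, publication info) can be pictured with a small data structure. This is only an illustrative sketch in plain Python: actual nanopublications are expressed with semantic web technologies, and every name and identifier below (the `Nanopublication` class, its field names, and the `ex:` placeholders) is a hypothetical stand-in rather than part of the authors' schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# A subject-predicate-object statement; plain strings stand in for real identifiers.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    """Plain-Python stand-in for the three components named in the abstract."""

    assertion: Triple  # the individual claim, e.g. one genomic annotation
    provenance: Dict[str, str] = field(default_factory=dict)  # how the claim was obtained
    publication_info: Dict[str, str] = field(default_factory=dict)  # ownership and rights


# All values below are made-up placeholders, not data from the deposit.
example = Nanopublication(
    assertion=("ex:variant_1", "ex:overlapsWith", "ex:tss_1"),
    provenance={"derived_from": "ex:experiment_1", "method": "ex:pipeline_step_1"},
    publication_info={"creator": "ex:research_group_1", "rights": "ex:license_1"},
)
```

    Keeping the assertion separate from its provenance and publication info mirrors the point made in the abstract: each annotation carries its own origin and ownership, so it can be cited and integrated independently of the dataset it came from.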