184 research outputs found

    Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data

    To make digital resources on the web verifiable, immutable, and permanent, we propose a technique to include cryptographic hash values in URIs. We call them trusty URIs and we show how they can be used for approaches like nanopublications to make not only specific resources but their entire reference trees verifiable. Digital artifacts can be identified not only on the byte level but on more abstract levels such as RDF graphs, which means that resources keep their hash values even when presented in a different format. Our approach sticks to the core principles of the web, namely openness and decentralized architecture, is fully compatible with existing standards and protocols, and can therefore be used right away. Evaluation of our reference implementations shows that these desired properties are indeed accomplished by our approach, and that it remains practical even for very large files.
    Comment: Small error corrected in the text (table data was correct) on page 13: "All average values are below 0.8s (0.03s for batch mode). Using Java in batch mode even requires only 1ms per file."
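The core idea of the abstract above, embedding a cryptographic hash of the content in the URI so that anyone can check the retrieved bytes against it, can be illustrated with a minimal byte-level sketch. This is a simplified illustration, not the encoding defined by the trusty URI paper: the function names `make_hash_uri` and `verify_hash_uri`, the `.` separator, and the choice of unpadded URL-safe Base64 over SHA-256 are assumptions made for the example, and hashing at the RDF graph level is not covered.

```python
import base64
import hashlib


def make_hash_uri(base_uri: str, content: bytes) -> str:
    """Append a URL-safe SHA-256 digest of `content` to `base_uri`.

    Hypothetical helper: the separator and encoding are illustrative only.
    """
    digest = hashlib.sha256(content).digest()
    # URL-safe Base64 without padding keeps the suffix usable inside a URI.
    suffix = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{base_uri}.{suffix}"


def verify_hash_uri(uri: str, content: bytes) -> bool:
    """Recompute the hash suffix from `content` and compare it with `uri`."""
    # The Base64 URL-safe alphabet contains no ".", so the last "." splits
    # the base URI from the hash suffix.
    base_uri, _, _ = uri.rpartition(".")
    return make_hash_uri(base_uri, content) == uri


if __name__ == "__main__":
    uri = make_hash_uri("http://example.org/resource", b"some content")
    print(uri)
    print(verify_hash_uri(uri, b"some content"))
    print(verify_hash_uri(uri, b"tampered content"))
```

Because the hash is part of the identifier, any change to the content produces a different URI, which is what makes the referenced artifact effectively immutable.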

    Germline copy number variation in the YTHDC2 gene: does it have a role in finding a novel potential molecular target involved in pancreatic adenocarcinoma susceptibility?

    Objective: The vast majority of pancreatic cancers occur sporadically. Frequent variations in germline gene copy number can significantly influence the expression levels of genes that predispose to pancreatic adenocarcinoma. We prospectively investigated whether patients with sporadic pancreatic adenocarcinoma share specific gene copy number variations (CNVs) in their germline DNA. Patients and methods: DNA samples from peripheral leukocytes of 72 patients with a diagnosis of sporadic pancreatic adenocarcinoma and of 60 controls were analyzed using the Affymetrix 500K array set. A multiplex ligation-dependent probe amplification (MLPA) assay was performed using a set of self-designed MLPA probes specific for seven target sequences. Results: We identified a CNV-containing DNA region associated with pancreatic cancer risk. This region shows a deletion of one allele in 36 of the 72 analyzed patients but in none of the controls. The region is of particular interest because it contains the YTHDC2 gene, which encodes a putative DNA/RNA helicase, a class of protein frequently involved in cancer susceptibility. Interestingly, 82.6% of Sicilian patients showed germline loss of one allele. Conclusions: Our results suggest that the YTHDC2 gene could be a potential candidate for pancreatic cancer susceptibility and a useful marker for early detection as well as for the development of possible new therapeutic strategies.

    Reproducibility, bioinformatic analysis and power of the SAGE method to evaluate changes in transcriptome

    The serial analysis of gene expression (SAGE) method is used to study global gene expression in cells or tissues under various experimental conditions. However, its reproducibility has not yet been definitively assessed. In this study, we evaluated the reproducibility of the SAGE method and identified the factors that affect it. The determination coefficient (R²) for the reproducibility of SAGE is 0.96. However, several factors can affect the reproducibility of SAGE, such as the replication of concatemers and ditags, the number of sequenced tags, and double PCR amplification of ditags. Corrections for these factors must therefore be made to ensure the reproducibility and accuracy of SAGE results. A bioinformatic analysis of SAGE data is also presented in order to eliminate these artifacts. Finally, the study shows that increasing the number of sequenced tags improves the power of the method to detect transcripts and their regulation by experimental conditions.
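The abstract above reports a determination coefficient (R²) as its reproducibility measure but does not say how it was computed. A minimal sketch, assuming R² is taken as the squared Pearson correlation between the tag counts of two replicate SAGE libraries (which equals the R² of a simple linear regression), could look as follows. The function name and the example counts are hypothetical and are not data from the study.

```python
def determination_coefficient(x, y):
    """Return R^2 as the squared Pearson correlation of two count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("R^2 is undefined when one vector is constant")
    return (cov * cov) / (var_x * var_y)


if __name__ == "__main__":
    # Hypothetical tag counts for the same five tags in two replicates.
    replicate_a = [120, 45, 8, 300, 15]
    replicate_b = [110, 50, 10, 290, 12]
    print(determination_coefficient(replicate_a, replicate_b))
```

A value close to 1 indicates that tag counts in one replicate are almost linearly predictable from the other.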

    AI-KG: an Automatically Generated Knowledge Graph of Artificial Intelligence

    Scientific knowledge has traditionally been disseminated and preserved through research articles published in journals, conference proceedings, and online archives. However, this article-centric paradigm has often been criticized for not allowing this knowledge to be automatically processed, categorized, and reasoned over. An alternative vision is to generate a semantically rich and interlinked description of the content of research publications. In this paper, we present the Artificial Intelligence Knowledge Graph (AI-KG), a large-scale automatically generated knowledge graph that describes 820K research entities. AI-KG includes about 14M RDF triples and 1.2M reified statements extracted from 333K research publications in the field of AI, and describes 5 types of entities (tasks, methods, metrics, materials, others) linked by 27 relations. AI-KG has been designed to support a variety of intelligent services for analyzing and making sense of research dynamics, supporting researchers in their daily job, and helping to inform decision-making in funding bodies and research policymakers. AI-KG has been generated by applying an automatic pipeline that extracts entities and relationships using three tools: DyGIE++, Stanford CoreNLP, and the CSO Classifier. It then integrates and filters the resulting triples using a combination of deep learning and semantic technologies in order to produce a high-quality knowledge graph. This pipeline was evaluated on a manually crafted gold standard, yielding competitive results. AI-KG is available under CC BY 4.0 and can be downloaded as a dump or queried via a SPARQL endpoint.

    A type 2 diabetes disease module with a high collective influence for Cdk2 and PTPLAD1 is localized in endosomes

    Despite the identification of many susceptibility genes, our knowledge of the underlying mechanisms responsible for complex disease remains limited. Here, we identified a type 2 diabetes disease module in endosomes and validated it for functional relevance on selected nodes. Using hepatic Golgi/endosomes fractions, we established a proteome of insulin receptor-containing endosomes that allowed the study of physical protein interaction networks on a type 2 diabetes background. The resulting collated network is formed by 313 nodes and 1147 edges with a topology organized around a few major hubs, with Cdk2 displaying the highest collective influence. Overall, 88% of the nodes are associated with the type 2 diabetes genetic risk, including 101 new candidates. The type 2 diabetes module is enriched with cytoskeleton and luminal acidification-dependent processes that are shared with secretion-related mechanisms. We identified new signaling pathways driven by Cdk2 and PTPLAD1 whose expression affects the association of the insulin receptor with TUBA, TUBB, the actin component ACTB and the endosomal sorting markers Rab5c and Rab11a. Therefore, the interactome of internalized insulin receptors reveals the presence of a type 2 diabetes disease module enriched in new layers of feedback loops required for insulin signaling, clearance and islet biology.

    Semantic Web integration of Cheminformatics resources with the SADI framework

    Background: The diversity and the largely independent nature of chemical research efforts over the past half century are, most likely, the major contributors to the current poor state of chemical computational resource and database interoperability. While open software for chemical format interconversion and database entry cross-linking has partially addressed database interoperability, computational resource integration is hindered by the great diversity of software interfaces, languages, access methods, and platforms, among others. This has, in turn, translated into limited reproducibility of computational experiments and the need for application-specific computational workflow construction and semi-automated enactment by human experts, especially where emerging interdisciplinary fields, such as systems chemistry, are pursued. Fortunately, the advent of the Semantic Web and the very recent introduction of RESTful Semantic Web Services (SWS) may present an opportunity to integrate all of the existing computational and database resources in chemistry into a machine-understandable, unified system that draws on the entirety of the Semantic Web. Results: We have created a prototype framework of Semantic Automated Discovery and Integration (SADI) SWS that exposes the QSAR descriptor functionality of the Chemistry Development Kit. Since each of these services has formal ontology-defined input and output classes, and each service consumes and produces RDF graphs, clients can automatically reason about the services and available reference information necessary to complete a given overall computational task specified through a simple SPARQL query. We demonstrate this capability by carrying out QSAR analysis backed by a simple formal ontology to determine whether a given molecule is drug-like. Further, we discuss parameter-based control over the execution of SADI SWS. Finally, we demonstrate the value of computational resource envelopment as SADI services through service reuse and ease of integration of computational functionality into formal ontologies. Conclusions: The work we present here may trigger a major paradigm shift in the distribution of computational resources in chemistry. We conclude that envelopment of chemical computational resources as SADI SWS facilitates interdisciplinary research by enabling the definition of computational problems in terms of ontologies and formal logical statements instead of cumbersome and application-specific tasks and workflows.