52 research outputs found

    Using ontologies to map between research data and policymakers’ presumptions: the experience of the KNOWMAK project

    Understanding knowledge co-creation in key emerging areas of European research is critical for policymakers wishing to analyse impact and make strategic decisions. However, purely data-driven methods for characterising policy topics have limitations relating to the broad nature of such topics and to the differences in language and topic structure between political discourse and scientific and technological outputs. In this paper, we discuss the use of ontologies and semantic technologies as a means to bridge the linguistic and conceptual gap between policy questions and data sources for characterising European knowledge production. Our experience suggests that integrating advanced language-processing techniques with expert assessment at critical junctures in the process is key to the success of this endeavour.
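
    As a rough illustration of the ontology-mapping idea described above, the sketch below matches free text against curated keyword variants attached to policy-topic concepts. The topics and vocabularies here are invented for illustration; the actual project combines far richer language processing with expert review.

```python
# A minimal sketch of ontology-assisted topic mapping, assuming each policy
# topic carries a curated set of keyword variants. The topics and keywords
# below are invented for illustration only.

ONTOLOGY = {
    "synthetic_biology": {"synthetic biology", "genome engineering", "biobrick"},
    "quantum_technologies": {"quantum computing", "quantum sensor", "qubit"},
}

def map_to_topics(text: str) -> set[str]:
    """Return the ontology topics whose keyword variants occur in the text."""
    lowered = text.lower()
    return {
        topic
        for topic, keywords in ONTOLOGY.items()
        if any(kw in lowered for kw in keywords)
    }

print(map_to_topics("New qubit designs accelerate quantum computing research."))
# {'quantum_technologies'}
```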

    'Open the Pod Bay Doors, Please, HAL': Here Comes the Semantic Web

    Different kinds of knowledge management systems have been adopted by institutions in an attempt to tame the tsunami of unrelated facts, hard data, research and stray morsels of knowledge that abound in any moderately sized organisation. Yet each of these islands of coherence, however effective, is just that - an island - and, as such, cannot provide a generalised way forward to making data usable. The Semantic Web is an attempt to solve this problem. In the context of the project, 'semantic' simply stands for 'machine-processable'. If information can be made comprehensible to machines such as computers, these machines can then do all the hard work of sorting and sifting and weighing up that is currently done (very imperfectly) by humans, and, because they are computers, they can do it more quickly and on an unimaginably huge scale. In addition, they can learn on the job, so that they can make an even better fist of it the next time around, and better again the time after that.
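
    To make 'machine-processable' concrete, here is a minimal RDF sketch using the rdflib Python library. The resources and properties are invented examples, not drawn from any cited project; the point is only that, once statements are expressed as triples, a machine can answer structured questions that no keyword search over raw text could.

```python
# A minimal sketch of machine-processable statements as RDF triples,
# using rdflib (assumed installed: pip install rdflib). All resources
# below are invented for illustration.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.report42, RDF.type, EX.Report))
g.add((EX.report42, EX.author, EX.alice))
g.add((EX.report42, EX.title, Literal("Quarterly knowledge audit")))

# The machine can now sort and sift by structure, not by string matching:
for subject in g.subjects(RDF.type, EX.Report):
    print(subject, "is a report by", g.value(subject, EX.author))
```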

    Open Data Sectors and Communities: Environment

    Chapter 7 in the book The State of Open Data: Histories and Horizons

    Information Outlook, July/August 2018

    Volume 22, Issue 4

    The health care and life sciences community profile for dataset descriptions

    Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high-quality description of biomedical datasets, the W3C Semantic Web Health Care and Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine-readable descriptions of versioned datasets.
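
    As a hedged illustration, the sketch below builds a tiny dataset description with rdflib, reusing the DCTERMS and DCAT vocabularies that this kind of profile draws on. The dataset URI and values are invented, and the actual guideline specifies many more elements (identification, versioning, provenance, and so on).

```python
# A minimal sketch of an RDF dataset description in the spirit of the HCLS
# community profile, reusing DCTERMS and DCAT. The dataset URI and all
# values are invented for illustration; the real profile is far richer.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

dataset = URIRef("http://example.org/dataset/biomed-example")

g = Graph()
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Example biomedical dataset")))
g.add((dataset, DCTERMS.description, Literal("Linked compound-target data.")))
g.add((dataset, DCTERMS.identifier, Literal("ex:biomed-example")))
g.add((dataset, DCTERMS.creator, URIRef("http://example.org/people/curator1")))

# Serialize to Turtle, ready for indexing or exchange.
print(g.serialize(format="turtle"))
```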

    Transcript expression-aware annotation improves rare variant interpretation

    The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype-Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have effect sizes similar to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
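
    The 'proportion expressed across transcripts' idea lends itself to a small worked example: weight each transcript by its expression level and compute the fraction of the gene's total expression contributed by transcripts whose exons cover a given position. The function and data below are invented for illustration; the published metric is computed from GTEx isoform quantifications across tissues, not from this toy input.

```python
# A hedged sketch of an expression-weighted coverage metric in the spirit
# of 'proportion expressed across transcripts'. All names, coordinates,
# and expression values are invented for illustration.

def proportion_expressed(position: int,
                         transcripts: dict[str, list[tuple[int, int]]],
                         expression: dict[str, float]) -> float:
    """Fraction of a gene's total expression from transcripts covering position."""
    total = sum(expression.values())
    if total == 0:
        return 0.0
    covering = sum(
        expression[tx]
        for tx, exons in transcripts.items()
        if any(start <= position <= end for start, end in exons)
    )
    return covering / total

# tx1 contains the exon at 300-400; tx2 skips it via alternative splicing.
transcripts = {"tx1": [(100, 200), (300, 400)], "tx2": [(100, 200)]}
expression = {"tx1": 8.0, "tx2": 2.0}  # e.g. mean TPM across tissues
print(proportion_expressed(350, transcripts, expression))  # 0.8
```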

    Methodology and System for Ontology-Enabled Traceability: Pilot Application to Design and Management of the Washington D.C. Metro System

    This report describes a new methodology and system for satisfying requirements, and an architectural framework for linking discipline-specific dependencies through interaction relationships at the meta-model (or ontology) level. In state-of-the-art traceability mechanisms, requirements are connected directly to design objects. Here, in contrast, we ask: what design concept (or family of design concepts) should be applied to satisfy this requirement? Solutions to this question establish links between requirements and design concepts; it is then the implementation of these concepts that leads to the design itself. These ideas are prototyped through a requirements-to-design model mockup of the Washington D.C. Metro System. The proposed methodology offers several benefits not possible with state-of-the-art procedures. First, procedures for design rule checking may be embedded into design concept nodes, creating a pathway for system validation and verification processes that can be executed early in the system lifecycle, where errors are cheapest and easiest to fix. Second, the proposed model provides a much better big-picture view of relevant design concepts and how they fit together than is possible when domains are linked at the model level. Finally, the proposed procedures are automatically reusable across families of projects where the ontologies are applicable.
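
    A minimal sketch of the requirements-to-concepts linkage, assuming each design concept node embeds an executable rule check that can be run against candidate design objects, might look as follows. The classes, the clearance rule, and its threshold are invented for illustration and are not taken from the report.

```python
# A hedged sketch of ontology-enabled traceability: requirements link to
# design concepts, and each concept node embeds a rule check that can be
# executed early in the lifecycle. All names and thresholds are invented.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class DesignConcept:
    name: str
    # Rule embedded in the concept node: validates a design object's attributes.
    rule: Callable[[dict], bool]

@dataclass
class Requirement:
    text: str
    concepts: list[DesignConcept] = field(default_factory=list)

def verify(requirement: Requirement, design_object: dict) -> bool:
    """Check a design object against every concept linked to the requirement."""
    return all(c.rule(design_object) for c in requirement.concepts)

platform_clearance = DesignConcept(
    name="PlatformClearance",
    rule=lambda obj: obj.get("edge_gap_mm", 0) <= 75,  # invented threshold
)
req = Requirement("Trains shall be safely boardable", [platform_clearance])
print(verify(req, {"edge_gap_mm": 60}))  # True: rule check passes early
```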