
    Ontology-based knowledge representation of experiment metadata in biological data mining

    According to the PubMed resource from the U.S. National Library of Medicine, over 750,000 scientific articles were published in the ~5,000 biomedical journals worldwide in 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential to assist investigators in the extraction, management and analysis of these data, the information contained in the traditional journal publication still consists largely of unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-usable framework for data mining purposes.

    A semi-automatic semantic method for mapping SNOMED CT concepts to VCM Icons

    VCM (Visualization of Concept in Medicine) is an iconic language for representing key medical concepts by icons. However, using this language with reference terminologies, such as SNOMED CT, requires mapping its icons to the terms of these terminologies. Here, we present and evaluate a semi-automatic semantic method for the mapping of SNOMED CT concepts to VCM icons. Both SNOMED CT and VCM are compositional in nature; SNOMED CT is expressed in description logic and VCM semantics are formalized in an OWL ontology. The proposed method involves the manual mapping of a limited number of underlying concepts from the VCM ontology, followed by automatic generation of the rest of the mapping. We applied this method to the clinical findings of the SNOMED CT CORE subset, and 100 randomly selected mappings were evaluated by three experts. The results obtained were promising, with 82 of the SNOMED CT concepts correctly linked to VCM icons according to the experts. Most of the errors were easy to fix.

    Infectious Disease Ontology

    Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain.

    A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

    In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is, however, critical for both basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap, through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; a centralized, open-access data repository; compatibility with existing resources; and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget. Comment: 41 pages

    The Infectious Disease Ontology in the Age of COVID-19

    The Infectious Disease Ontology (IDO) is a suite of interoperable ontology modules that aims to provide coverage of all aspects of the infectious disease domain, including biomedical research, clinical care, and public health. IDO Core is designed to be a disease- and pathogen-neutral ontology, covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is then extended by a collection of ontology modules focusing on specific diseases and pathogens. In this paper we present applications of IDO Core within various areas of infectious disease research, together with an overview of all IDO extension ontologies and the methodology on the basis of which they are built. We also survey recent developments involving IDO, including the creation of IDO Virus; the Coronaviruses Infectious Disease Ontology (CIDO); and an extension of CIDO focused on COVID-19 (IDO-CovID-19). We also discuss how these ontologies might assist in information-driven efforts to deal with the ongoing COVID-19 pandemic, to accelerate data discovery in the early stages of future pandemics, and to promote reproducibility of infectious disease research.

    Natural Language Query in the Biochemistry and Molecular Biology Domains Based on Cognition Search™

    Motivation: With the tremendous growth in scientific literature, it is necessary to improve upon the standard pattern matching style of the available search engines. Semantic NLP may be the solution to this problem. Cognition Search (CSIR) is a natural language technology. It is best used by asking a simple question that might be answered in textual data being queried, such as MEDLINE. CSIR has a large English dictionary and semantic database. Cognition’s semantic map enables the search process to be based on meaning rather than statistical word pattern matching and, therefore, returns more complete and relevant results. The Cognition Search engine uses downward reasoning and synonymy which also improves recall. It improves precision through phrase parsing and word sense disambiguation.
    Result: Here we have carried out several projects to "teach" the CSIR lexicon medical, biochemical and molecular biological language and acronyms from curated, free web-based sources. Vocabulary from the Alliance for Cell Signaling (AfCS), the HUGO Gene Nomenclature Committee (HGNC), the Unified Medical Language System (UMLS) Metathesaurus, and the International Union of Pure and Applied Chemistry (IUPAC) was introduced into the CSIR dictionary and curated. The resulting system was used to interpret MEDLINE abstracts. Where synonym information has been encoded, meaning-based search of MEDLINE abstracts yields high precision (estimated at >90%) and high recall (estimated at >90%). The present implementation can be found at http://MEDLINE.cognition.com.

    Revising the UMLS Semantic Network

    The integration of standardized biomedical terminologies into a single, unified knowledge representation system has formed a key area of applied informatics research in recent years. The Unified Medical Language System (UMLS) is the most advanced and most prominent effort in this direction, bringing together within its Metathesaurus a large number of distinct source terminologies. The UMLS Semantic Network, which is designed to support the integration of these source terminologies, has proved to be a highly successful combination of formal coherence and broad scope. We argue here, however, that its organization manifests certain structural problems, and we describe revisions which we believe are needed if the network is to be maximally successful in realizing its goal of supporting terminology integration.