
    Incorporation of Personal Single Nucleotide Polymorphism (SNP) Data into a National Level Electronic Health Record for Disease Risk Assessment, Part 1: An Overview of Requirements

    Background: Personalized medicine approaches provide opportunities for predictive and preventive medicine. Using genomic, clinical, environmental, and behavioral data, tracking and management of individual wellness is possible. A practical way to carry this personalized approach into routine practice is to integrate clinical interpretations of genomic variations into electronic medical records (EMRs)/electronic health records (EHRs). Today, central EHR infrastructures have been established in many countries, including Turkey. Objective: The objective of this study was to concentrate on incorporating personal single nucleotide polymorphism (SNP) data into the National Health Information System of Turkey (NHIS-T) for disease risk assessment, and to evaluate the performance of various predictive models for prostate cancer cases. We present our work as a miniseries containing three parts: (1) an overview of requirements, (2) the incorporation of SNP data into the NHIS-T, and (3) an evaluation of the SNP-incorporated NHIS-T for prostate cancer. Methods: For the first article of this miniseries, the scientific literature is reviewed and the requirements for integrating SNP data into EMRs/EHRs are extracted and presented. Results: In the literature, the basic requirements of genomic-enabled EMRs/EHRs are listed as incorporating genotype data and its clinical interpretation into EMRs/EHRs, developing accurate and accessible clinicogenomic interpretation resources (knowledge bases), interpreting and reinterpreting variant data, and embedding clinicogenomic information into medical decision processes. In this section, we analyze these requirements under the subtitles of terminology standards, interoperability standards, clinicogenomic knowledge bases, defining clinical significance, and clinicogenomic decision support. Conclusions: In order to integrate structured genotype and phenotype data into any system, there is a need to determine the data components, terminology standards, and identifiers of clinicogenomic information. We also need to determine interoperability standards for sharing information between the information systems of different stakeholders, and to develop decision support capability to interpret genomic variations against the knowledge bases via different assessment approaches.
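    As an illustration of the structured genotype data and terminology identifiers discussed in this abstract, below is a minimal sketch of how a single SNP observation might be represented before being attached to an EHR record. The `SnpObservation` class, its field names, and the example values are hypothetical and are not taken from the NHIS-T specification; only the terminology systems mentioned (dbSNP, HGNC) are real.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SnpObservation:
    """One SNP finding keyed to standard terminologies (illustrative only)."""
    patient_id: str                               # identifier in the national EHR system
    rsid: str                                     # dbSNP identifier, e.g. "rs1447295"
    gene_symbol: str                              # HGNC-approved gene symbol
    genotype: str                                 # observed alleles, e.g. "A/A"
    clinical_significance: Optional[str] = None   # interpretation from a curated knowledge base
    knowledge_base: Optional[str] = None          # source used for the interpretation
    interpretation_date: Optional[str] = None     # ISO 8601; supports later reinterpretation

# Hypothetical example: a risk-associated variant attached to a patient record.
obs = SnpObservation(
    patient_id="TR-0000001",
    rsid="rs1447295",
    gene_symbol="CASC8",
    genotype="A/A",
    clinical_significance="increased risk (prostate cancer)",
    knowledge_base="example curated clinicogenomic KB",
    interpretation_date="2015-06-01",
)
print(obs.rsid, obs.clinical_significance)
```

    Keeping the interpretation and its source alongside the raw genotype is what allows the reinterpretation of variant data that the abstract lists as a core requirement.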

    BELB: a Biomedical Entity Linking Benchmark

    Biomedical entity linking (BEL) is the task of grounding entity mentions to a knowledge base. It plays a vital role in information extraction pipelines for the life sciences literature. We review recent work in the field and find that, because the task is absent from existing benchmarks for biomedical text mining, different studies adopt different experimental setups, making comparisons based on published numbers problematic. Furthermore, neural systems are tested primarily on instances linked to the broad-coverage knowledge base UMLS, leaving their performance on more specialized ones, e.g. genes or variants, understudied. We therefore developed BELB, a Biomedical Entity Linking Benchmark, providing access in a unified format to 11 corpora linked to 7 knowledge bases and spanning six entity types: gene, disease, chemical, species, cell line, and variant. BELB greatly reduces the preprocessing overhead of testing BEL systems on multiple corpora, offering a standardized testbed for reproducible experiments. Using BELB, we perform an extensive evaluation of six rule-based entity-specific systems and three recent neural approaches leveraging pre-trained language models. Our results reveal a mixed picture, showing that neural approaches fail to perform consistently across entity types and highlighting the need for further studies towards entity-agnostic models.
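    To make the evaluation setup concrete, here is a minimal sketch of the kind of unified interface a benchmark like BELB enables: every corpus yields mentions paired with gold knowledge-base identifiers, and any linker exposing a `predict` method can be scored with the same loop. The class and function names below are illustrative and do not reproduce BELB's actual API.

```python
from typing import Iterable, Protocol, Tuple

# A corpus example is the mention text plus its gold KB identifier (e.g. an NCBI Gene ID).
Example = Tuple[str, str]

class EntityLinker(Protocol):
    def predict(self, mention: str) -> str:
        """Return the predicted knowledge-base identifier for a mention."""
        ...

def accuracy(linker: EntityLinker, corpus: Iterable[Example]) -> float:
    """Fraction of mentions linked to the correct KB identifier (accuracy@1)."""
    examples = list(corpus)
    correct = sum(1 for mention, gold_id in examples if linker.predict(mention) == gold_id)
    return correct / len(examples) if examples else 0.0

# Hypothetical usage with a trivial dictionary-based linker on a toy gene corpus.
class DictLinker:
    def __init__(self, lexicon: dict):
        self.lexicon = lexicon
    def predict(self, mention: str) -> str:
        return self.lexicon.get(mention.lower(), "NIL")

corpus = [("BRCA1", "672"), ("p53", "7157")]           # gene mentions with NCBI Gene IDs
linker = DictLinker({"brca1": "672", "p53": "7157"})
print(accuracy(linker, corpus))                         # 1.0 on this toy corpus
```

    Because rule-based and neural systems can both be wrapped behind the same `predict` interface, they can be compared on identical corpora, which is exactly the kind of standardization the benchmark aims to provide.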

    Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading

    Today's genomic domain revolves around uncertainty: too many imprecise concepts, too much information to be properly managed. Considering that conceptualization is the most exclusive human characteristic, it makes full sense to try to conceptualize the principles that guide the essence of why humans are as we are. This question can of course be generalized to any species, but in this work we are especially interested in showing how conceptual modeling is strictly required to understand the 'execution model' that human beings 'implement'. The main issue is to defend the idea that only by having an in-depth knowledge of the Conceptual Model that is associated with the Human Genome can the Human Genome be properly understood. This kind of Model-Driven perspective of the Human Genome opens challenging possibilities, by looking at individuals as implementations of that Conceptual Model, where different values associated with different modeling primitives explain the diversity among individuals and the potential, unexpected variations together with their unwanted effects in terms of illnesses. This work focuses on the challenges faced in loading data from conventional resources into Information Systems created according to the above-mentioned conceptual modeling approach. The work reports on various loading efforts, the problems encountered, and the solutions to these problems. A strong argument is also made about why conventional methods for solving the so-called 'data chaos' problems associated with the genomics domain so often fail to meet the demands. Van Der Kroon, M. (2011). Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading. http://hdl.handle.net/10251/16993
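    The loading problem described above can be made concrete with a small sketch: raw records from conventional resources rarely match the conceptual model exactly, so a loader has to normalize values and reject entries it cannot interpret rather than silently propagating 'data chaos'. The classes and field names below are illustrative and are not the conceptual schema used in the cited work.

```python
from dataclasses import dataclass

@dataclass
class Variation:
    """A variation as a conceptual-model entity (illustrative schema)."""
    gene_symbol: str
    position: int          # 1-based position within the reference sequence
    reference: str
    alternative: str

def load_variation(raw: dict) -> Variation:
    """Map one raw record onto the conceptual model, validating as we go."""
    try:
        position = int(raw["pos"])          # sources disagree on types: "123" vs 123
    except (KeyError, ValueError) as exc:
        raise ValueError(f"unusable position in record {raw!r}") from exc
    ref, alt = raw.get("ref", "").upper(), raw.get("alt", "").upper()
    if not ref or not alt or set(ref + alt) - set("ACGT"):
        raise ValueError(f"invalid alleles in record {raw!r}")
    return Variation(raw["gene"].strip().upper(), position, ref, alt)

# Hypothetical raw records from a conventional resource: one clean, one inconsistent.
records = [{"gene": "brca2", "pos": "32316461", "ref": "g", "alt": "t"},
           {"gene": "TP53", "pos": "?", "ref": "C", "alt": "A"}]
for rec in records:
    try:
        print(load_variation(rec))
    except ValueError as err:
        print("rejected:", err)
```

    Making the model explicit in code is what turns "too much information" into a defined set of accepted, normalized entities plus an auditable list of rejected records.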

    The ELIXIR Human Copy Number Variations Community: building bioinformatics infrastructure for research

    Copy number variations (CNVs) are major causative contributors to both the genesis of genetic diseases and human neoplasias. While high-throughput sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks to address a wide range of CNV-related challenges, ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.

    Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach

    Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. The growing realisation that amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of an integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data, with much of the emphasis put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic-web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. A state-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature. To enable meta-analysis of biologically related transcriptomic data, a highly curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns, embedded with novel candidates, across large-scale AD transcriptomic data, a new approach to generating gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's, Parkinson's, and epilepsy diseases.
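    As a rough illustration of the final step, the sketch below derives a simple co-expression network from a gene-by-sample expression matrix by connecting gene pairs whose expression profiles are strongly correlated. This is a generic construction for illustration only, with made-up data and gene names; it is not the specific gene-regulatory-network approach developed in the thesis.

```python
import numpy as np
import pandas as pd
import networkx as nx

def coexpression_network(expr: pd.DataFrame, threshold: float = 0.8) -> nx.Graph:
    """Build a co-expression graph: nodes are genes (rows), edges join
    gene pairs with |Pearson correlation| at or above the threshold."""
    corr = expr.T.corr()                      # gene-by-gene correlation matrix
    graph = nx.Graph()
    graph.add_nodes_from(expr.index)
    genes = list(expr.index)
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = corr.loc[g1, g2]
            if abs(r) >= threshold:
                graph.add_edge(g1, g2, weight=float(r))
    return graph

# Toy expression matrix (genes x samples); values are synthetic, for illustration only.
rng = np.random.default_rng(0)
expr = pd.DataFrame(rng.normal(size=(4, 10)),
                    index=["APP", "MAPT", "PSEN1", "APOE"])
expr.loc["MAPT"] = expr.loc["APP"] * 0.9 + rng.normal(scale=0.1, size=10)  # force one correlated pair
net = coexpression_network(expr, threshold=0.8)
print(sorted(net.edges(data=True)))
```

    In a real meta-analysis such networks would be built per data set and compared across studies, which is how recurrent mechanistic patterns, and the lesser-known genes embedded in them, can be surfaced.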

    Exploring genomic medicine using integrative biology

    Thesis (Ph.D.), Harvard-MIT Division of Health Sciences and Technology, 2004. Includes bibliographical references (p. 215-227). Instead of focusing on the cell, or the genotype, or on any single measurement modality, using integrative biology allows us to think holistically and horizontally. A disease like diabetes can lead to myocardial infarction, nephropathy, and neuropathy; to study diabetes in genomic medicine would require reasoning from a disease to all its various complications to the genome and back. I am studying the process of intersecting nearly comprehensive data sets in molecular biology across three representative modalities (microarrays, RNAi, and quantitative trait loci) out of the more than 30 available today. This is difficult because the semantics and context of each experiment performed become more important, necessitating detailed knowledge of the biological domain. I addressed this problem by using all public microarray data from the NIH, unifying 50 million expression measurements with standard gene identifiers and representing the experimental context of each using the Unified Medical Language System, a vocabulary of over 1 million concepts. I created an automated system to join data sets related by experimental context. I evaluated this system by finding genes significantly involved in multiple experiments directly and indirectly related to diabetes and adipogenesis, and found genes known to be involved in these diseases and processes. As a model first step into integrative biology, I then took known quantitative trait loci in the rat involved in glucose metabolism and built an expert system to explain possible biological mechanisms for these genetic data using the modeled genomic data. The system I have created can link diseases from the ICD-9 billing-code level down to the genetic, genomic, and molecular level. In a sense, this is the first automated system built to study the new field of genomic medicine. By Atul Janardhan Butte, Ph.D.
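    The joining step described above can be sketched in a few lines: each experiment's results are annotated with standardized context concepts (here mimicking UMLS concept identifiers), data sets sharing a concept are grouped, and genes significant across several related experiments are surfaced. The concept identifiers, thresholds, gene names, and p-values are illustrative toy data, not the system built in the thesis.

```python
from collections import defaultdict

# Each experiment: a set of UMLS-style context concepts plus per-gene p-values (all toy values).
experiments = {
    "GSE_A": {"concepts": {"C0011849"}, "pvalues": {"PPARG": 0.001, "LEP": 0.200, "INS": 0.004}},
    "GSE_B": {"concepts": {"C0011849"}, "pvalues": {"PPARG": 0.010, "LEP": 0.030, "INS": 0.300}},
    "GSE_C": {"concepts": {"C0001546"}, "pvalues": {"PPARG": 0.500, "LEP": 0.010, "INS": 0.900}},
}

def genes_recurrent_in_context(concept: str, alpha: float = 0.05, min_experiments: int = 2):
    """Genes significant (p < alpha) in at least `min_experiments` data sets whose
    experimental context is annotated with the given concept."""
    hits = defaultdict(int)
    for exp in experiments.values():
        if concept not in exp["concepts"]:
            continue
        for gene, p in exp["pvalues"].items():
            if p < alpha:
                hits[gene] += 1
    return sorted(g for g, n in hits.items() if n >= min_experiments)

# "C0011849" stands in for a diabetes-related context concept; with the toy numbers
# above, only PPARG is significant in both matching experiments.
print(genes_recurrent_in_context("C0011849"))   # ['PPARG']
```

    Annotating each data set with controlled vocabulary concepts is what lets the join be done automatically rather than by hand-picking related experiments.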