72 research outputs found

    PQL: A Declarative Query Language over Dynamic Biological Schemata

    We introduce PQL, the query language used in the GeneSeek genetic data integration project. PQL incorporates many features of query languages for semi-structured data. To these we add the ability to express metadata constraints such as intended semantics and database curation approach. These constraints guide the dynamic generation of potential query plans, which allows a single query to remain relevant even when source and mediated schemas are continually evolving, as is often the case in data integration.

    Concept Mapping to Develop a Framework for Characterizing Electronic Data Capture (EDC) Systems

    Clinical and Translational Science Awards (CTSAs) have brought about a push to find better EDC systems that facilitate translational research. Based on the data management needs of a specific clinical/translational research lab, concept mapping was used to create a framework for evaluating EDCs. After refinement based on a spiral model, including consultations with the UW CTSA and a survey of other CTSAs, the tool was used to characterize EDCs used at CTSA sites across the country.

    Integrating and Ranking Uncertain Scientific Data

    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve prediction of well-known functions). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.

    The potential for automated question answering in the context of genomic medicine: an assessment of existing resources and properties of answers

    Knowledge gained in studies of genetic disorders is reported in a growing body of biomedical literature containing reports of genetic variation in individuals that map to medical conditions and/or response to therapy. These scientific discoveries need to be translated into practical applications to optimize patient care. Translating research into practice can be facilitated by supplying clinicians with research evidence. We assessed the role of existing tools in extracting answers to translational research questions in the area of genomic medicine. We evaluated the coverage of translational research terms in the Unified Medical Language System (UMLS) Metathesaurus, determined where answers are most often found in full-text articles, and determined common answer patterns. Findings suggest that we will be able to leverage the UMLS in the development of natural language processing algorithms for automated extraction of answers to translational research questions from biomedical text in the area of genomic medicine.

    Translational bioinformatics: linking knowledge across biological and clinical realms

    Nearly a decade since the completion of the first draft of the human genome, the biomedical community is positioned to usher in a new era of scientific inquiry that links fundamental biological insights with clinical knowledge. Accordingly, holistic approaches are needed to develop and assess hypotheses that incorporate genotypic, phenotypic, and environmental knowledge. This perspective presents translational bioinformatics as a discipline that builds on the successes of bioinformatics and health informatics for the study of complex diseases. The early successes of translational bioinformatics are indicative of the potential to achieve the promise of the Human Genome Project for gaining deeper insights into the genetic underpinnings of disease and progress toward the development of a new generation of therapies.

    On the persistence of supplementary resources in biomedical publications

    BACKGROUND: Providing for long-term and consistent public access to scientific data is a growing concern in biomedical research. One aspect of this problem can be demonstrated by evaluating the persistence of supplementary data associated with published biomedical papers. METHODS: We manually evaluated 655 supplementary data links extracted from PubMed abstracts published 1998–2005 (Method 1) as well as a further focused subset of 162 full-text manuscripts published within three representative high-impact biomedical journals between September and December 2004 (Method 2). RESULTS: For Method 1 we found that since 2001, only 71–92% of supplementary data were still accessible via the links provided, with 93% of the inaccessible links occurring where supplementary data was not stored with the publishing journal. Of the manuscripts evaluated in Method 2, we found that only 83% of the links were available approximately a year after publication, with 55% of the inaccessible links at locations outside the journal of publication. CONCLUSION: We conclude that if supplemental data is required to support the publication, journal policies must take on the responsibility to accept and store such data or require that it be maintained with a credible independent institution or under the terms of a strategic data storage plan specified by the authors. We further recommend that publishers provide automated systems to ensure that supplementary links remain persistent, and that granting bodies such as the NIH develop policies and funding mechanisms to maintain long-term persistent access to these data.

    Developing a Prototype System for Integrating Pharmacogenomics Findings into Clinical Practice

    Findings from pharmacogenomics (PGx) studies have the potential to be applied to individualize drug therapy to improve efficacy and reduce adverse drug events. Researchers have identified factors influencing uptake of genomics in medicine, but little is known about the specific technical barriers to incorporating PGx into existing clinical frameworks. We present the design and development of a prototype PGx clinical decision support (CDS) system that builds on existing clinical infrastructure and incorporates semi-active and active CDS. Informing this work, we updated previous evaluations of PGx knowledge characteristics, and of how the CDS capabilities of three local clinical systems align with data and functional requirements for PGx CDS. We summarize characteristics of PGx knowledge and technical needs for implementing PGx CDS within existing clinical frameworks. PGx decision support rules derived from FDA drug labels primarily involve drug metabolizing genes, vary in maturity, and mostly support the post-analytic phase of genetic testing. Computerized provider order entry capabilities are key functional requirements for PGx CDS and were best supported by one of the three systems we evaluated. We identified two technical needs when building on this system: (1) new or existing standards for data exchange to connect clinical data to PGx knowledge, and (2) a method for implementing semi-active CDS. Our analyses enhance our understanding of principles for designing and implementing CDS for drug therapy individualization and our current understanding of PGx characteristics in a clinical context. Characteristics of PGx knowledge and capabilities of current clinical systems can help govern decisions about CDS implementation, and can help guide decisions made by groups that develop and maintain knowledge resources such that delivery of content for clinical care is supported.

    Feasibility of incorporating genomic knowledge into electronic medical records for pharmacogenomic clinical decision support

    In pursuing personalized medicine, pharmacogenomic (PGx) knowledge may help guide prescribing drugs based on a person’s genotype. Here we evaluate the feasibility of incorporating PGx knowledge, combined with clinical data, to support clinical decision-making by: 1) analyzing clinically relevant knowledge contained in PGx knowledge resources; 2) evaluating the feasibility of a rule-based framework to support formal representation of clinically relevant knowledge contained in PGx knowledge resources; and 3) evaluating the ability of an electronic medical record/electronic health record (EMR/EHR) to provide computable forms of clinical data needed for PGx clinical decision support. Findings suggest that the PharmGKB is a good source for PGx knowledge to supplement information contained in FDA-approved drug labels. Furthermore, we found that with supporting knowledge (e.g., IF age < 18 THEN patient is a child), sufficient clinical data exists in University of Washington’s EMR systems to support 50% of PGx knowledge contained in drug labels that could be expressed as rules.