
    The potential for automated question answering in the context of genomic medicine: an assessment of existing resources and properties of answers

    Knowledge gained in studies of genetic disorders is reported in a growing body of biomedical literature containing reports of genetic variation in individuals that map to medical conditions and/or response to therapy. These scientific discoveries need to be translated into practical applications to optimize patient care. Translating research into practice can be facilitated by supplying clinicians with research evidence. We assessed the role of existing tools in extracting answers to translational research questions in the area of genomic medicine. We evaluated the coverage of translational research terms in the Unified Medical Language System (UMLS) Metathesaurus, determined where answers are most often found in full-text articles, and identified common answer patterns. Findings suggest that we will be able to leverage the UMLS in developing natural language processing algorithms for the automated extraction of answers to translational research questions from biomedical text in the area of genomic medicine.
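The coverage evaluation described above can be sketched as a simple term-lookup check. This is a minimal illustration, not the study's method: the vocabulary here is a mock in-memory set standing in for a real UMLS Metathesaurus lookup, and the `coverage` helper and sample terms are assumptions.

```python
# Hypothetical sketch: what fraction of translational research terms
# map to a controlled vocabulary? MOCK_METATHESAURUS stands in for a
# real UMLS Metathesaurus query interface.
MOCK_METATHESAURUS = {"genetic variation", "pharmacogenomics", "genotype"}

def coverage(terms, vocabulary):
    """Return the covered fraction and the list of matched terms."""
    matched = [t for t in terms if t.lower() in vocabulary]
    return len(matched) / len(terms), matched

terms = ["Genetic variation", "Genotype", "variant of unknown significance"]
ratio, matched = coverage(terms, MOCK_METATHESAURUS)
print(f"{ratio:.0%} covered: {matched}")
```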

    Doctor of Philosophy

    Public health surveillance systems are crucial for the timely detection of and response to public health threats. Since the terrorist attacks of September 11, 2001, and the release of anthrax in the following month, there has been a heightened interest in public health surveillance. The years immediately following these attacks saw increased awareness and funding from the federal government, which significantly strengthened the United States' surveillance capabilities; despite these improvements, however, today's public health surveillance systems face substantial challenges. Problems with current surveillance systems include: a) failure to leverage unstructured public health data for surveillance purposes; and b) lack of information integration and of the ability to leverage resources, applications, or other surveillance efforts, because systems are built on a centralized model. This research addresses these problems by focusing on the development and evaluation of new informatics methods to improve public health surveillance. To address the problems above, we first identified a current public health surveillance workflow that is affected by the problems described and offers an opportunity for enhancement through current informatics techniques. The 122 Mortality Surveillance for Pneumonia and Influenza was chosen as the primary use case for this dissertation work. The second step involved demonstrating the feasibility of using unstructured public health data, in this case death certificates. For this, we created and evaluated a pipeline, composed of a detection rule and a natural language processor, for the coding of death certificates and the identification of pneumonia and influenza cases. The second problem was addressed by presenting the rationale for creating a federated model that leverages grid technology concepts and tools for the sharing and epidemiological analysis of public health data.
As a case study of this approach, a secured virtual organization was created in which users are able to access two grid data services, based on death certificates from the Utah Department of Health, and two analytical grid services, MetaMap and R. A scientific workflow was created using the published services to replicate the mortality surveillance workflow. To validate these approaches and provide proofs of concept, a series of real-world scenarios were conducted.
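The pneumonia-and-influenza (P&I) case-identification step described above can be sketched as a minimal rule-based detector over free-text cause-of-death fields. This is an illustrative assumption, not the dissertation's actual pipeline (which coupled a detection rule with MetaMap concept coding): the patterns and the `is_pi_case` helper are hypothetical.

```python
import re

# Hypothetical minimal sketch of a rule-based P&I detector for death
# certificate cause-of-death text. Patterns are illustrative only.
PI_PATTERNS = [
    re.compile(r"\bpneumonia\b", re.IGNORECASE),
    re.compile(r"\binfluenza\b", re.IGNORECASE),
    re.compile(r"\bflu\b", re.IGNORECASE),
]

def is_pi_case(cause_of_death_lines):
    """Flag a certificate if any cause-of-death line matches a P&I pattern."""
    return any(
        pattern.search(line)
        for line in cause_of_death_lines
        for pattern in PI_PATTERNS
    )

record = ["Acute respiratory failure", "Bacterial pneumonia", "COPD"]
print(is_pi_case(record))  # the pneumonia mention flags this record
```

A real detector would operate on coded concepts (e.g. MetaMap CUIs) rather than raw keywords, which is what motivates the NLP step in the pipeline above.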

    Development and Evaluation of an Ontology-Based Quality Metrics Extraction System

    The Institute of Medicine reports a growing demand in recent years for quality improvement within the healthcare industry. In response, numerous organizations have been involved in the development and reporting of quality measurement metrics. However, disparate data models across such organizations shift the burden of accurate and reliable metrics extraction and reporting onto healthcare providers. Furthermore, manual abstraction of quality metrics and diverse implementations of Electronic Health Record (EHR) systems deepen the complexity of consistent, valid, explicit, and comparable quality measurement reporting within healthcare provider organizations. The main objective of this research is to evaluate an ontology-based information extraction framework that utilizes unstructured clinical text for defining and reporting quality-of-care metrics that are interpretable and comparable across different healthcare institutions. All clinical transcribed notes (48,835) from 2,085 patients who had undergone surgery in 2011 at MD Anderson Cancer Center were extracted from the EHR system and pre-processed for identification of section headers. Subsequently, all notes were analyzed by MetaMap v2012, and one XML file was generated per note. The XML outputs were converted into Resource Description Framework (RDF) format. We also developed three ontologies: a section header ontology built from the extracted section headers using the RDF standard; a concept ontology comprising entities representing five quality metrics drawn from SNOMED (Diabetes, Hypertension, Cardiac Surgery, Transient Ischemic Attack, CNS Tumor); and a clinical note ontology representing clinical note elements and their relationships. All ontologies (in Web Ontology Language format) and patient notes (as RDF) were imported into a triple store (AllegroGraph) as classes and instances, respectively.
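The XML-to-RDF conversion step can be sketched as follows. This is a hedged illustration only: the element names, attributes, and predicate URIs below are invented for the example and do not reflect the study's actual MetaMap output schema or ontology vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical MetaMap-style XML output for one note. The schema
# (Note/Section/Concept, the "negated" attribute) is an assumption.
SAMPLE_XML = """
<Note id="note42">
  <Section header="PAST MEDICAL HISTORY">
    <Concept cui="C0011849" name="Diabetes Mellitus" negated="false"/>
    <Concept cui="C0020538" name="Hypertensive disease" negated="true"/>
  </Section>
</Note>
"""

BASE = "http://example.org/notes#"  # assumed namespace, not the study's

def xml_to_ntriples(xml_text):
    """Convert one note's XML into RDF triples serialized as N-Triples."""
    root = ET.fromstring(xml_text)
    note = f"<{BASE}{root.get('id')}>"
    triples = []
    for section in root.findall("Section"):
        header = section.get("header")
        for concept in section.findall("Concept"):
            cui = f"<{BASE}{concept.get('cui')}>"
            triples.append(f"{note} <{BASE}mentionsConcept> {cui} .")
            triples.append(f'{cui} <{BASE}inSection> "{header}" .')
            triples.append(f'{cui} <{BASE}negated> "{concept.get("negated")}" .')
    return "\n".join(triples)

print(xml_to_ntriples(SAMPLE_XML))
```

Keeping section membership and negation status as explicit triples is what later allows the SPARQL settings (section-header filtering, negation exclusion) to be expressed as query constraints rather than re-processing the text.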
The SPARQL query language was used for reporting extracted concepts under four settings: base natural language processing (NLP) output, inclusion of the concept ontology, exclusion of negated concepts, and inclusion of the section header ontology. Existing manual abstraction data from surgical clinical reviewers, covering the same set of patients and documents, was considered the gold standard. Micro-averaged results of statistical agreement tests on the base NLP output showed an increase from 59%, 81%, and 68% to 74%, 91%, and 82% (precision, recall, and F-measure, respectively) after the incremental addition of the ontology layers. Our study introduced a framework that may contribute to advances in “complementary” components for existing information extraction systems. The application of an ontology-based approach to natural language processing in our study has provided mechanisms for increasing the performance of such tools. The pivot point for extracting more meaningful quality metrics from clinical narratives is the abstraction of the contextual semantics hidden in the notes. We have defined some of these semantics and quantified them in multiple complementary layers in order to demonstrate the importance and applicability of an ontology-based approach to quality metric extraction. The application of such ontology layers introduces powerful new ways of querying context-dependent entities from clinical texts. Rigorous evaluation is still necessary to ensure the quality of these “complementary” NLP systems. Moreover, research is needed to create and update evaluation guidelines and criteria for assessing the performance and efficiency of ontology-based information extraction in healthcare, and to provide a consistent baseline for comparing alternative approaches.
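The micro-averaged precision, recall, and F-measure computation used in evaluations like the one above can be sketched as pooling true-positive, false-positive, and false-negative counts across metrics before computing the ratios. The counts in the example are made up for illustration, not the study's data.

```python
# Minimal sketch of micro-averaged precision/recall/F-measure over
# several quality metrics. Counts are pooled first, then ratios taken
# once, which is what distinguishes micro- from macro-averaging.
def micro_average(counts):
    """counts: list of (tp, fp, fn) tuples, one per quality metric."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Illustrative pooled counts for three hypothetical metrics
p, r, f = micro_average([(80, 20, 10), (60, 15, 12), (40, 10, 8)])
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")  # prints "P=0.80 R=0.86 F=0.83"
```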