
    Cohort Identification Using Semantic Web Technologies: Ontologies and Triplestores as Engines for Complex Computable Phenotyping

    Electronic health record (EHR)-based computable phenotypes are algorithms used to identify individuals or populations with clinical conditions or events of interest within a clinical data repository. Due to a lack of EHR data standardization, computable phenotypes can be semantically ambiguous and difficult to share across institutions. In this research, I propose a new computable phenotyping methodological framework based on semantic web technologies, specifically ontologies, the Resource Description Framework (RDF) data format, triplestores, and Web Ontology Language (OWL) reasoning. My hypothesis is that storing and analyzing clinical data using these technologies can begin to address the critical issues of semantic ambiguity and lack of interoperability in the context of computable phenotyping. To test this hypothesis, I compared the performance of two variants of two computable phenotypes (for depression and rheumatoid arthritis, respectively). The first variant of each phenotype used a list of ICD-10-CM codes to define the condition; the second variant used ontology concepts from SNOMED and the Human Phenotype Ontology (HPO). After executing each variant of each phenotype against a clinical data repository, I compared the patients matched in each case to see where the different variants overlapped and diverged. Both the ontologies and the clinical data were stored in an RDF triplestore to allow me to assess the interoperability advantages of the RDF format for clinical data. All tested methods successfully identified cohorts in the data store, with differing rates of overlap and divergence between variants. Depending on the phenotyping use case, SNOMED and HPO’s ability to more broadly define many conditions due to complex relationships between their concepts may be seen as an advantage or a disadvantage. 
I also found that RDF triplestores do provide interoperability advantages, despite being far less commonly used in clinical data applications than relational databases. Although these methods and technologies are not “one-size-fits-all,” the experimental results are encouraging enough for them to (1) be put into practice in combination with existing phenotyping methods or (2) be used on their own for particularly well-suited use cases.
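The core contrast the abstract describes, a flat ICD-style code list versus an ontology concept whose subtypes are found by subsumption, can be sketched in a few lines. This is a minimal illustration, not the author's implementation; the concept IDs and the tiny hierarchy are hypothetical stand-ins for SNOMED/HPO content.

```python
# Hypothetical mini-ontology: child concept -> set of parent concepts.
IS_A = {
    "recurrent_depression": {"depressive_disorder"},
    "postpartum_depression": {"depressive_disorder"},
}

def descendants(concept, is_a):
    """All concepts that are (transitively) subtypes of `concept`, plus itself."""
    out, stack = set(), [concept]
    while stack:
        c = stack.pop()
        for child, parents in is_a.items():
            if c in parents and child not in out:
                out.add(child)
                stack.append(child)
    return out | {concept}

# Synthetic patient records: patient id -> set of coded concepts.
patients = {
    "p1": {"recurrent_depression"},
    "p2": {"postpartum_depression"},
    "p3": {"hypertension"},
}

# Variant 1: flat code list -- only the exact codes enumerated match.
code_list = {"recurrent_depression"}
flat_cohort = {p for p, cs in patients.items() if cs & code_list}

# Variant 2: ontology concept + subsumption -- all subtypes match.
onto_cohort = {p for p, cs in patients.items()
               if cs & descendants("depressive_disorder", IS_A)}
```

The ontology variant matches the broader cohort because subsumption pulls in every subtype of the target concept, which is exactly why, per the abstract, the broader definitions can be an advantage or a disadvantage depending on the use case.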

    Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding

    Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter is now available. However, this data is often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements missing-not-at-random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well-suited for learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record. The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. 
Understanding the trends in practice, potential areas of disparities, and the value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important for implementing a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities by race/ethnicity in access to upper endoscopy for patients with upper gastrointestinal bleeding across urban and rural hospitals. Projected accumulated savings from consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion five years after implementation. Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record-based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform the leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow Blatchford Score and the Oakland score. While the best performing gradient boosted decision tree model has overall performance equivalent to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients. Using another deep learning model that can model longitudinal risk, a long short-term memory recurrent neural network, the need for red blood cell transfusion can be predicted at every 4-hour interval in the first 24 hours of intensive care unit stay for high risk patients with acute gastrointestinal bleeding. 
Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and characterize their risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has performance equivalent, as measured by positive predictive value, to deep learning and natural language processing-based models, and after live implementation appears to have increased use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding can be differentiated from those with other groups of disease concepts by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover's distances on the graph. For electronic health record data with data missing not at random, MURAL, an unsupervised random forest-based method, handles missing values and generates visualizations that characterize patients with gastrointestinal bleeding. This thesis forms a basis for understanding the potential of machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low risk patients out of the hospital and to guide resuscitation and timely endoscopic procedures for patients at higher risk of clinical decompensation.
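The "very low risk threshold of 99% sensitivity" mentioned above corresponds to picking the largest score cutoff that still catches at least 99% of true positives, then flagging everyone below it. A minimal sketch of that calculation, on synthetic scores and labels (not the thesis's data or models):

```python
def very_low_risk_cutoff(scores, labels, target_sens=0.99):
    """Largest threshold t such that sensitivity (recall on positives)
    of the rule `score >= t` is still >= target_sens; patients with
    score < t are flagged very low risk."""
    positives = sorted(s for s, y in zip(scores, labels) if y == 1)
    # We may miss at most floor((1 - target_sens) * n_pos) positives.
    allowed_misses = int((1 - target_sens) * len(positives))
    return positives[allowed_misses]

# Synthetic predicted probabilities and outcome labels.
scores = [0.02, 0.05, 0.10, 0.40, 0.55, 0.70, 0.90, 0.95]
labels = [0,    0,    0,    1,    1,    1,    1,    1]

t = very_low_risk_cutoff(scores, labels, target_sens=0.99)
low_risk = [s for s in scores if s < t]
```

At such a strict sensitivity target the cutoff is pinned near the lowest-scoring true positive, so a model that separates positives from negatives more cleanly at the bottom of the score range (here, the deep learning model) clears more patients as very low risk.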

    The integration of WHO classifications and reference terminologies to improve information exchange and quality of electronic health records: the SNOMED CT–ICF harmonization within the ICD-11 revision process

    Introduction: The Family of International Classifications (WHO-FIC) is a suite of integrated classification products of the World Health Organization (WHO) that provide information on different aspects of health and the health-care system. Together with the related classifications of health interventions, these tools and their national modifications allow full representation of the volumes of health services provided in the countries that adopt case-mix systems. The use of standardized terminologies in classifications to define the descriptive characteristics of disease is a necessary step toward full integration between different information systems, making information about diagnosed diseases, performed health procedures, and a person's level of functioning available for uses as diverse as public health, safety of care, and quality control.
Materials and methods: Within the WHO and International Health Terminology Standards Development Organisation (IHTSDO) collaboration agreement, an independent review was carried out on all Activities and Participation (A&P) categories of the WHO International Classification of Functioning, Disability and Health (ICF), in order to identify equivalences and gaps with respect to Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) concepts in terms of lexical, semantic (content), and hierarchical matching, so as to harmonize the WHO classifications and SNOMED CT.
Results and conclusions: The mapping suggests that the ICF A&P categories are semantically and hierarchically different from SNOMED CT terms, confirming the value of the WHO-IHTSDO synergy. Recommendations were formulated to WHO and IHTSDO to frame their respective unique contributions together, in a joint effort, ensuring that SNOMED CT and ICF can interoperate in electronic health records.
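The lexical-matching step in a review like this can be illustrated with simple token overlap: compare an ICF category label against candidate SNOMED CT descriptions and keep the best-scoring one. A toy sketch with illustrative labels (real terminology mapping uses far richer normalization and semantics):

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

# Illustrative labels, not actual ICF or SNOMED CT content.
icf_label = "walking short distances"
snomed_candidates = ["walking long distances", "difficulty walking", "running"]

best = max(snomed_candidates, key=lambda c: jaccard(icf_label, c))
```

Purely lexical matches like this are exactly where equivalence can be deceptive, which is why the review also assessed semantic (content) and hierarchical agreement before declaring a mapping.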

    Vocabulary services for eHealth applications in Portugal

    The safe use of eHealth requires that information tools share the same interpretation of the data, but in the current state of implementations, systems are often heterogeneous and adopt local information models. The lack of interfacing solutions between different systems at the technical and, especially, the semantic level hinders the ability to use information for the same patient seamlessly across multiple sources. A partial contribution toward integrating different information sources is the use of medical terminologies, which clarify the intended use of certain data fields and their possible value sets. In this work, we propose the use of a vocabulary server as a central component to enable two motivating use cases: (1) a reference semantic service for the Portuguese reality and (2) the transformation of clinical data structures into other clinical models (for interoperability scenarios). The proposed tool, besides serving terminologies relevant to the Portuguese health system, is also capable of modelling semantic associations between different terminology systems to enable translation and transcoding. The specific requirements of the epSOS interoperability network were used to drive the specification. The system is able to link terminologies, offers a visual representation of that information (e.g. viewing a graph of concepts related to a specific disease), and allows that information to be exported in RDF and JSON formats. 
An application programming interface (API) was developed to enable developers to issue high-level semantic queries, for example, mapping between terminology systems used in the Portuguese health system. The results of this work can help eHealth solution developers obtain basic terminology services to extend their applications towards enhanced interoperability.
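The translation/transcoding service described above reduces, at its simplest, to a concept map: a lookup from a (source system, source code) pair to its target-system equivalent. A minimal sketch; the systems and codes below are illustrative examples, not the server's actual content or API.

```python
# Illustrative concept map: (source_system, source_code) -> (target_system, target_code).
CONCEPT_MAP = {
    ("ICPC-2", "T90"): ("ICD-10", "E11"),  # type 2 diabetes (illustrative pairing)
    ("ICPC-2", "K86"): ("ICD-10", "I10"),  # hypertension (illustrative pairing)
}

def transcode(system, code, concept_map=CONCEPT_MAP):
    """Return the mapped (system, code) pair, or None if no mapping exists."""
    return concept_map.get((system, code))
```

A real vocabulary server layers versioning, map direction, and equivalence qualifiers (exact, broader, narrower) on top of this lookup, since few cross-terminology mappings are truly one-to-one.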

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved, and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data were collected in nine hospitals using stratified random sampling. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that the systems do not improve access to information and are not tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process.

    Issues of the adoption of HIT related standards at the decision-making stage of six tertiary healthcare organisations in Saudi Arabia

    Due to interoperability barriers between clinical information systems, healthcare organisations face potential limitations in acquiring the benefits such systems offer, in particular reducing the cost of medical services. To achieve the level of interoperability required to reduce these problems, a high degree of consensus is required regarding health data standards. Although such standards essentially constitute a solution to the interoperability barriers mentioned above, their level of adoption remains frustratingly low. One reason is that health data standards are an authoritative field in which marketplace mechanisms do not work, because health data standards developed for a particular market cannot, in general, be applied in other markets without modification. Many countries have launched national initiatives to develop and promote national health data standards but, although certain authors have mapped the landscape of the health data standardisation process in some countries, these studies have failed to explain why healthcare organisations seem unwilling to adopt those standards. In addressing this gap in the literature, this research proposes a conceptual model of the adoption process of HIT-related standards at the decision-making stage in healthcare organisations. The model is based on two predominant theories regarding IT-related standards in the IS field: Rogers' paradigm (1995) and the economics of standards theory. In addition, the twenty-one constructs of this model were drawn from a comprehensive set of factors derived from the related literature, grouped in accordance with the Technology-Organisation-Environment (TOE) framework, a well-known taxonomy within innovation adoption studies in the IS field. 
Moving from a conceptual to an empirical position, an interpretive, exploratory, multiple-case study was conducted in Saudi Arabia to examine the proposed model. The empirical qualitative evidence necessitated some revisions to the model: one factor was abandoned, four were modified, and eight new factors were added. The resulting empirical model makes a novel contribution at two levels. First, with regard to the body of knowledge in the IS area, it offers an in-depth understanding of the adoption process of HIT-related standards that the literature still lacks, and it examines the applicability of IS theories in a new area, allowing others to relate their experiences to those reported. Secondly, the model can be used by decision makers in the healthcare sector, particularly in developing countries, as a guideline when planning for the adoption of health data standards.

    Automatic Population of Structured Reports from Narrative Pathology Reports

    Structured pathology reports offer several advantages: they can ensure the accuracy and completeness of pathology reporting, and they make it easier for referring doctors to glean pertinent information. The goal of this thesis is to extract pertinent information from free-text pathology reports, automatically populate structured reports for cancer diseases, and identify the commonalities and differences in processing principles needed to obtain maximum accuracy. Three pathology corpora were annotated with entities and the relationships between them: the melanoma corpus, the colorectal cancer corpus, and the lymphoma corpus. A supervised machine-learning approach, using conditional random fields learners, was developed to recognise medical entities in the corpora. Through feature engineering, the best feature configurations were attained, which boosted F-scores by 4.2% to 6.8% on the training sets. Without proper negation and uncertainty detection, the quality of the structured reports would be diminished; negation and uncertainty detection modules were built to handle this problem, obtaining overall F-scores ranging from 76.6% to 91.0% on the test sets. A relation extraction system was presented to extract four relations from the lymphoma corpus. The system achieved very good performance on the training set, with a 100% F-score obtained by the rule-based module and a 97.2% F-score attained by the support vector machines classifier. Rule-based approaches were used to generate the structured outputs and populate predefined templates; the rule-based system attained over 97% F-scores on the training sets. A pipeline system was implemented by assembling all of the components described above. It achieved promising results in the end-to-end evaluations, with F-scores of 86.5%, 84.2%, and 78.9% on the melanoma, colorectal cancer, and lymphoma test sets, respectively.
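Rule-based negation detection of the kind the negation module addresses can be illustrated with a cue-and-scope heuristic: flag an entity mention if a negation cue appears within a fixed token window before it. This is a toy version (in the spirit of NegEx-style rules), not the thesis's module; the cue list and window size are simplifying assumptions.

```python
# Hypothetical single-token negation cues and scope window.
NEG_CUES = {"no", "denies", "without", "not"}
WINDOW = 5  # number of preceding tokens considered within a cue's scope

def is_negated(tokens, entity_index):
    """True if any negation cue occurs in the WINDOW tokens before the entity."""
    start = max(0, entity_index - WINDOW)
    return any(t in NEG_CUES for t in tokens[start:entity_index])

tokens = "patient denies evidence of melanoma recurrence".split()
# "melanoma" is tokens[4]; "denies" at index 1 falls inside its 5-token window.
```

Real modules also need scope terminators ("but", punctuation) and uncertainty cues ("suspicious for"), which is why the abstract treats negation and uncertainty detection as modules in their own right.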