39 research outputs found

    Semantic processing of EHR data for clinical research

    Get PDF
    There is a growing need to semantically process and integrate clinical data from different sources for clinical research. This paper presents an approach to integrate EHRs from heterogeneous resources and generate integrated data in different data formats or semantics to support various clinical research applications. The proposed approach builds semantic data virtualization layers on top of data sources, which generate data in the requested semantics or formats on demand. This approach avoids upfront dumping to and synchronizing of the data with various representations. Data from different EHR systems are first mapped to RDF data with source semantics, and then converted to representations with harmonized domain semantics where domain ontologies and terminologies are used to improve reusability. It is also possible to further convert data to application semantics and store the converted results in clinical research databases, e.g. i2b2, OMOP, to support different clinical research settings. Semantic conversions between different representations are explicitly expressed using N3 rules and executed by an N3 Reasoner (EYE), which can also generate proofs of the conversion processes. The solution presented in this paper has been applied to real-world applications that process large scale EHR data.Comment: Accepted for publication in Journal of Biomedical Informatics, 2015, preprint versio

    Evaluating openEHR for storing computable representations of electronic health record phenotyping algorithms

    Get PDF
    Electronic Health Records (EHR) are data generated during routine clinical care. EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the pace of precision medicine at scale. A main EHR use-case is creating phenotyping algorithms to define disease status, onset and severity. Currently, no common machine-readable standard exists for defining phenotyping algorithms which often are stored in human-readable formats. As a result, the translation of algorithms to implementation code is challenging and sharing across the scientific community is problematic. In this paper, we evaluate openEHR, a formal EHR data specification, for computable representations of EHR phenotyping algorithms.Comment: 30th IEEE International Symposium on Computer-Based Medical Systems - IEEE CBMS 201

    Clinical Bioinformatics: challenges and opportunities

    Get PDF
    Background: Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and to their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011 was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods: In this editorial, the viewpoints and opinions of three world CBI leaders, who have been invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools to implement efficient search computing solutions. Results: Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, integrated search and retrieval of –omics information. Conclusions: Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput “-omics” technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research

    Ontology-based data integration between clinical and research systems

    Get PDF
    Data from the electronic medical record comprise numerous structured but uncoded elements, which are not linked to standard terminologies. Reuse of such data for secondary research purposes has gained in importance recently. However, the identification of relevant data elements and the creation of database jobs for extraction, transformation and loading (ETL) are challenging: With current methods such as data warehousing, it is not feasible to efficiently maintain and reuse semantically complex data extraction and trans-formation routines. We present an ontology-supported approach to overcome this challenge by making use of abstraction: Instead of defining ETL procedures at the database level, we use ontologies to organize and describe the medical concepts of both the source system and the target system. Instead of using unique, specifically developed SQL statements or ETL jobs, we define declarative transformation rules within ontologies and illustrate how these constructs can then be used to automatically generate SQL code to perform the desired ETL procedures. This demonstrates how a suitable level of abstraction may not only aid the interpretation of clinical data, but can also foster the reutilization of methods for un-locking it

    Intégration de ressources en recherche translationnelle : une approche unificatrice en support des systèmes de santé "apprenants"

    Get PDF
    Learning health systems (LHS) are gradually emerging and propose a complimentary approach to translational research challenges by implementing close coupling of health care delivery, research and knowledge translation. To support coherent knowledge sharing, the system needs to rely on an integrated and efficient data integration platform. The framework and its theoretical foundations presented here aim at addressing this challenge. Data integration approaches are analysed in light of the requirements derived from LHS activities and data mediation emerges as the one most adapted for a LHS. The semantics of clinical data found in biomedical sources can only be fully derived by taking into account, not only information from the structural models (field X of table Y), but also terminological information (e.g. International Classification of Disease 10th revision) used to encode facts. The unified framework proposed here takes this into account. The platform has been implemented and tested in context of the TRANSFoRm endeavour, a European project funded by the European commission. It aims at developing a LHS including clinical activities in primary care. The mediation model developed for the TRANSFoRm project, the Clinical Data Integration Model, is presented and discussed. Results from TRANSFoRm use-cases are presented. They illustrate how a unified data sharing platform can support and enhance prospective research activities in context of a LHS. In the end, the unified mediation framework presented here allows sufficient expressiveness for the TRANSFoRm needs. It is flexible, modular and the CDIM mediation model supports the requirements of a primary care LHS.Les systèmes de santé "apprenants" (SSA) présentent une approche complémentaire et émergente aux problèmes de la recherche translationnelle en couplant de près les soins de santé, la recherche et le transfert de connaissances. Afin de permettre un flot d’informations cohérent et optimisé, le système doit se doter d’une plateforme intégrée de partage de données. Le travail présenté ici vise à proposer une approche de partage de données unifiée pour les SSA. Les grandes approches d’intégration de données sont analysées en fonction du SSA. La sémantique des informations cliniques disponibles dans les sources biomédicales est la résultante des connaissances des modèles structurelles des sources mais aussi des connaissances des modèles terminologiques utilisés pour coder l’information. Les mécanismes de la plateforme unifiée qui prennent en compte cette interdépendance sont décrits. La plateforme a été implémentée et testée dans le cadre du projet TRANSFoRm, un projet européen qui vise à développer un SSA. L’instanciation du modèle de médiation pour le projet TRANSFoRm, le Clinical Data Integration Model est analysée. Sont aussi présentés ici les résultats d’un des cas d’utilisation de TRANSFoRm pour supporter la recherche afin de donner un aperçu concret de l’impact de la plateforme sur le fonctionnement du SSA. Au final, la plateforme unifiée d’intégration proposée ici permet un niveau d’expressivité suffisant pour les besoins de TRANSFoRm. Le système est flexible et modulaire et le modèle de médiation CDIM couvre les besoins exprimés pour le support des activités d’un SSA comme TRANSFoRm

    Cohort Identification Using Semantic Web Technologies: Ontologies and Triplestores as Engines for Complex Computable Phenotyping

    Get PDF
    Electronic health record (EHR)-based computable phenotypes are algorithms used to identify individuals or populations with clinical conditions or events of interest within a clinical data repository. Due to a lack of EHR data standardization, computable phenotypes can be semantically ambiguous and difficult to share across institutions. In this research, I propose a new computable phenotyping methodological framework based on semantic web technologies, specifically ontologies, the Resource Description Framework (RDF) data format, triplestores, and Web Ontology Language (OWL) reasoning. My hypothesis is that storing and analyzing clinical data using these technologies can begin to address the critical issues of semantic ambiguity and lack of interoperability in the context of computable phenotyping. To test this hypothesis, I compared the performance of two variants of two computable phenotypes (for depression and rheumatoid arthritis, respectively). The first variant of each phenotype used a list of ICD-10-CM codes to define the condition; the second variant used ontology concepts from SNOMED and the Human Phenotype Ontology (HPO). After executing each variant of each phenotype against a clinical data repository, I compared the patients matched in each case to see where the different variants overlapped and diverged. Both the ontologies and the clinical data were stored in an RDF triplestore to allow me to assess the interoperability advantages of the RDF format for clinical data. All tested methods successfully identified cohorts in the data store, with differing rates of overlap and divergence between variants. Depending on the phenotyping use case, SNOMED and HPO’s ability to more broadly define many conditions due to complex relationships between their concepts may be seen as an advantage or a disadvantage. I also found that RDF triplestores do indeed provide interoperability advantages, despite being far less commonly used in clinical data applications than relational databases. Despite the fact that these methods and technologies are not “one-size-fits-all,” the experimental results are encouraging enough for them to (1) be put into practice in combination with existing phenotyping methods or (2) be used on their own for particularly well-suited use cases.Doctor of Philosoph

    Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress

    Get PDF
    Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realize the potentials for high quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research

    Nine Principles of Semantic Harmonization

    Get PDF
    Medical data is routinely collected, stored and recorded across different institutions and in a range of different formats. Semantic harmonization is the process of collating this data into a singular consistent logical view, with many approaches to harmonizing both possible and valid. The broad scope of possibilities for undertaking semantic harmonization do lead however to the development of bespoke and ad-hoc systems; this is particularly the case when it comes to cohort data, the format of which is often specific to a cohort's area of focus. Guided by work we have undertaken in developing the 'EMIF Knowledge Object Library', a semantic harmonization framework underpinning the collation of pan-European Alzheimer's cohort data, we have developed a set of nine generic guiding principles for developing semantic harmonization frameworks, the application of which will establish a solid base for constructing similar frameworks
    corecore