
5 research outputs found

An information model for computable cancer phenotypes

Publication venue: BioMed Central
Publication date: 15/09/2016

Springer - Publisher Connector

Review and evaluation of electronic health records-driven phenotype algorithm authoring tools for clinical and translational research

Author: Albers, Bazarian, Berthold, Burton, Cimino, Cimino, Cimino, Conway, D'Avolio, Danciu, Davis, De Clercq, Delaney, Deng, Denny, Doods, Embi, Enid Montague, Fan, Fernández-Breis, Feyisetan, Guoqian Jiang, Harris, Hellenman, Hetland, Hey, Horvath, Horvath, Hripcsak, Hruby, Huan Mo, Hurdle, Huser, Huser, Huser, Jennifer A Pacheco, Jie Xu, Joshua C Denny, Jyotishman Pathak, Köpcke, Li, Li, Lowe, Ludvigsson, Luke V Rasmussen, Manolio, McCarty, Miyoshi, Mo, Mullins, Murphy, Murphy, Murphy, Murphy, Nadkarni, Narus, Ouagne, Pamela L Shaw, Peissig, Pennington, Peter Speltz, Peterson, Pierce, Post, Pressler, Qian Zhu, Rasmussen, Rasmussen, Richard C Kiefer, Richesson, Roden, Safran, Scully, Shivade, Sladek, Spivey, Tate, Weber, William K Thompson, Zhang, Zhang
Publication venue: Oxford University Press (OUP)

Crossref

Cohort Identification Using Semantic Web Technologies: Ontologies and Triplestores as Engines for Complex Computable Phenotyping

Author: Pfaff Emily
Publication venue: University of North Carolina at Chapel Hill Graduate School
Publication date: 01/01/2020

Electronic health record (EHR)-based computable phenotypes are algorithms used to identify individuals or populations with clinical conditions or events of interest within a clinical data repository. Due to a lack of EHR data standardization, computable phenotypes can be semantically ambiguous and difficult to share across institutions. In this research, I propose a new computable phenotyping methodological framework based on semantic web technologies, specifically ontologies, the Resource Description Framework (RDF) data format, triplestores, and Web Ontology Language (OWL) reasoning. My hypothesis is that storing and analyzing clinical data using these technologies can begin to address the critical issues of semantic ambiguity and lack of interoperability in the context of computable phenotyping. To test this hypothesis, I compared the performance of two variants of two computable phenotypes (for depression and rheumatoid arthritis, respectively). The first variant of each phenotype used a list of ICD-10-CM codes to define the condition; the second variant used ontology concepts from SNOMED and the Human Phenotype Ontology (HPO). After executing each variant of each phenotype against a clinical data repository, I compared the patients matched in each case to see where the different variants overlapped and diverged. Both the ontologies and the clinical data were stored in an RDF triplestore to allow me to assess the interoperability advantages of the RDF format for clinical data. All tested methods successfully identified cohorts in the data store, with differing rates of overlap and divergence between variants. Depending on the phenotyping use case, SNOMED and HPO’s ability to more broadly define many conditions due to complex relationships between their concepts may be seen as an advantage or a disadvantage. 
I also found that RDF triplestores do indeed provide interoperability advantages, even though they are far less commonly used in clinical data applications than relational databases. Although these methods and technologies are not “one-size-fits-all,” the experimental results are encouraging enough for them to (1) be put into practice in combination with existing phenotyping methods or (2) be used on their own for particularly well-suited use cases.

Doctor of Philosophy
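The variant comparison described in this abstract (matching patients with each phenotype variant, then checking where the cohorts overlap and diverge) can be illustrated with plain set operations. This is a minimal sketch, not the thesis's method: the function name `compare_cohorts`, the patient identifiers, and the use of a Jaccard index as an overlap measure are all assumptions made for illustration.

```python
def compare_cohorts(code_based, ontology_based):
    """Compare two patient cohorts and report overlap and divergence.

    Both arguments are iterables of patient identifiers (hypothetical data).
    """
    a, b = set(code_based), set(ontology_based)
    union = a | b
    return {
        "both": a & b,            # patients matched by both variants
        "code_only": a - b,       # matched only by the code-list variant
        "ontology_only": b - a,   # matched only by the ontology variant
        # Jaccard index: shared patients divided by all matched patients
        "jaccard": len(a & b) / len(union) if union else 1.0,
    }


# Toy identifiers, invented for this example only
icd_cohort = {"p1", "p2", "p3"}
ontology_cohort = {"p2", "p3", "p4", "p5"}
result = compare_cohorts(icd_cohort, ontology_cohort)
print(sorted(result["both"]), sorted(result["ontology_only"]))
```

A larger `ontology_only` set in such a comparison would correspond to the broader condition definitions the abstract attributes to SNOMED and HPO.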

Carolina Digital Repository

Biomedical Literature Mining and Knowledge Discovery of Phenotyping Definitions

Author: Binkheder Samar Hussein
Publication date: 01/07/2019

Indiana University-Purdue University Indianapolis (IUPUI)

Phenotyping definitions are essential for cohort identification in clinical research, but they become an obstacle when they are not readily available. Developing new definitions manually requires expert involvement that is labor-intensive, time-consuming, and unscalable. Moreover, automated approaches rely mostly on electronic health record data, which suffer from bias, confounding, and incompleteness. Few efforts have used text-mining and data-driven approaches to automate the extraction and literature-based knowledge discovery of phenotyping definitions and to support their scalability. In this dissertation, we proposed a text-mining pipeline combining rule-based and machine-learning methods to automate retrieval, classification, and extraction of phenotyping definitions' information from the literature. To achieve this, we first developed an annotation guideline with ten dimensions to annotate sentences with evidence of phenotyping definitions' modalities, such as phenotypes and laboratories. Two annotators manually annotated a corpus of sentences (n=3,971) extracted from the methods sections of full-text observational studies (n=86). Percent agreement and Kappa statistics showed high inter-annotator agreement on sentence-level annotations. Second, we constructed two validated text classifiers using our annotated corpora: abstract-level and full-text sentence-level. We applied the abstract-level classifier to a large-scale biomedical literature corpus of over 20 million abstracts published between 1975 and 2018 to classify positive abstracts (n=459,406). After retrieving their full texts (n=120,868), we extracted sentences from their methods sections and used the full-text sentence-level classifier to extract positive sentences (n=2,745,416). Third, we performed literature-based discovery using the positively classified sentences.
Lexicon-based methods were used to recognize medical concepts in these sentences (n=19,423). Co-occurrence and association methods were used to identify and rank phenotype candidates that are associated with a phenotype of interest. We derived 12,616,465 associations from our large-scale corpus. Our literature-based associations and large-scale corpus contribute to building new data-driven phenotyping definitions and expanding existing definitions with minimal expert involvement.
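The co-occurrence step mentioned in this abstract (ranking candidate concepts by how often they appear alongside a phenotype of interest) can be sketched as a simple sentence-level count. This is an illustrative sketch under stated assumptions, not the dissertation's pipeline: the function name `rank_cooccurring`, the toy sentences, and the choice of raw co-occurrence counts as the ranking score are all hypothetical.

```python
from collections import Counter


def rank_cooccurring(sentences, target):
    """Rank concepts by the number of sentences they share with `target`.

    `sentences` is a list of concept lists, one list per sentence, as a
    concept recognizer might produce (hypothetical input format).
    """
    counts = Counter()
    for concepts in sentences:
        unique = set(concepts)  # count each concept once per sentence
        if target in unique:
            counts.update(unique - {target})
    # most_common() sorts candidates from highest to lowest count
    return counts.most_common()


# Toy concept annotations, invented for this example only
sentences = [
    ["diabetes", "hba1c", "metformin"],
    ["diabetes", "hba1c"],
    ["asthma", "albuterol"],
    ["diabetes", "metformin", "hba1c"],
]
ranking = rank_cooccurring(sentences, "diabetes")
print(ranking)
```

Raw counts favor concepts that are frequent everywhere, so an association measure that normalizes by each concept's overall frequency would likely be preferable on a real corpus.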

IUPUIScholarWorks

German medical data sciences: visions and bridges: proceedings of the 62nd Annual Meeting of the German Association of Medical Informatics, Biometry and Epidemiology (gmds e.V.) 2017 in Oldenburg (Oldenburg) - GMDS 2017

Publication venue: IOS Press
Publication date: 01/01/2017

Digitale Bibliothek Thüringen