418 research outputs found
A Learning Health Sciences Approach to Understanding Clinical Documentation in Pediatric Rehabilitation Settings
The work presented in this dissertation provides an analysis of clinical documentation that challenges the concepts and thinking surrounding missingness of data from clinical settings and the factors that influence why data are missing. It also foregrounds the critical role of clinical documentation as infrastructure for creating learning health systems (LHS) for pediatric rehabilitation settings. Although completeness of discrete data is limited, the results presented do not reflect the quality of care or the extent of unstructured data that providers document in other locations of the electronic health record (EHR) interface. While some may view imputation and natural language processing as means to address missingness of clinical data, these practices carry biases in their interpretations and issues of validity in results. The factors that influence missingness of discrete clinical data are rooted not just in technical structures, but larger professional, system level and unobservable phenomena that shape provider practices of clinical documentation. This work has implications for how we view clinical documentation as critical infrastructure for LHS, future studies of data quality and health outcomes research, and EHR design and implementation.
The overall research questions for this dissertation are: 1) To what extent can data networks be leveraged to build classifiers of patient functional performance and physical disability? 2) How can discrete clinical data on gross motor function be used to draw conclusions about clinical documentation practices in the EHR for cerebral palsy? 3) Why does missingness of discrete data in the EHR occur? To address these questions, a three-pronged approach is used to examine data completeness and the factors that influence missingness of discrete clinical data in an exemplar pediatric data learning network. As a use-case, evaluation of EHR data completeness of gross motor function related data, populated by providers from 2015-2019 for children with cerebral palsy (CP), will be completed. Mixed methods research strategies will be used to achieve the dissertation objectives, including developing an expert-informed and standards-based phenotype model of gross motor function data as a task-based mechanism, conducting quantitative descriptive analyses of completeness of discrete data in the EHR, and performing qualitative thematic analyses to elicit and interpret the latent concepts that contribute to missingness of discrete data in the EHR. The clinical data for this dissertation are sourced from the Shriners Hospitals for Children (SHC) Health Outcomes Network (SHOnet), while qualitative data were collected through interviews and field observations of clinical providers across three care sites in the SHC system.
PHD
Hlth Infrastr & Lrng Systs PhD
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/162994/1/njkoscie_1.pd
Extraction of clinical phenotypes for Alzheimer's disease dementia from clinical notes using natural language processing
OBJECTIVES: There is much interest in utilizing clinical data for developing prediction models for Alzheimer's disease (AD) risk, progression, and outcomes. Existing studies have mostly utilized curated research registries, image analysis, and structured electronic health record (EHR) data. However, much critical information resides in relatively inaccessible unstructured clinical notes within the EHR.
MATERIALS AND METHODS: We developed a natural language processing (NLP)-based pipeline to extract AD-related clinical phenotypes, documenting strategies for success and assessing the utility of mining unstructured clinical notes. We evaluated the pipeline against gold-standard manual annotations performed by 2 clinical dementia experts for AD-related clinical phenotypes including medical comorbidities, biomarkers, neurobehavioral test scores, behavioral indicators of cognitive decline, family history, and neuroimaging findings.
RESULTS: Documentation rates for each phenotype varied in the structured versus unstructured EHR. Interannotator agreement was high (Cohen's kappa = 0.72-1) and positively correlated with the NLP-based phenotype extraction pipeline's performance (average F1-score = 0.65-0.99) for each phenotype.
DISCUSSION: We developed an automated NLP-based pipeline to extract informative phenotypes that may improve the performance of eventual machine learning predictive models for AD. In the process, we examined documentation practices for each phenotype relevant to the care of AD patients and identified factors for success.
CONCLUSION: Success of our NLP-based phenotype extraction pipeline depended on domain-specific knowledge and focus on a specific clinical domain instead of maximizing generalizability
The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/154448/1/sim8445_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/154448/2/sim8445.pd
Prospect patents, data markets, and the commons in data-driven medicine : openness and the political economy of intellectual property rights
Scholars who point to political influences and the regulatory function of patent courts in the USA have long questioned the courts’ subjective interpretation of what ‘things’ can be claimed as inventions. The present article sheds light on a different but related facet: the role of the courts in regulating knowledge production. I argue that the recent cases decided by the US Supreme Court and the Federal Circuit, which made diagnostics and software very difficult to patent and which attracted criticism for a wealth of different reasons, are fine case studies of the current debate over the proper role of the state in regulating the marketplace and knowledge production in the emerging information economy. The article explains that these patents are prospect patents that may be used by a monopolist to collect data that everybody else needs in order to compete effectively. As such, they raise familiar concerns about failure of coordination emerging as a result of a monopolist controlling a resource such as datasets that others need and cannot replicate. In effect, the courts regulated the market, primarily focusing on ensuring the free flow of data in the emerging marketplace very much in the spirit of the ‘free the data’ language in various policy initiatives, yet at the same time with an eye to boosting downstream innovation. In doing so, these decisions essentially endorse practices of personal information processing which constitute a new type of public domain: a source of raw materials which are there for the taking and which have become the most important inputs to commercial activity. From this vantage point, the legal interpretation of the private and the shared legitimizes a model of data extraction from individuals, the raw material of information capitalism, that will fuel the next generation of data-intensive therapeutics in the field of data-driven medicine.
Characterizing Long COVID: Deep Phenotype of a Complex Condition.
BACKGROUND: Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 (PASC or long COVID), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations. Patient-led studies are of particular importance for understanding the natural history of COVID-19, but integration is hampered because they often use different terms to describe the same symptom or condition. This significant disparity in patient versus clinical characterization motivated the proposed ontological approach to specifying manifestations, which will improve capture and integration of future long COVID studies.
METHODS: The Human Phenotype Ontology (HPO) is a widely used standard for exchange and analysis of phenotypic abnormalities in human disease but has not yet been applied to the analysis of COVID-19.
FINDINGS: We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to HPO terms. We present layperson synonyms and definitions that can be used to link patient self-report questionnaires to standard medical terminology. Long COVID clinical manifestations are not assessed consistently across studies, and most manifestations have been reported with a wide range of synonyms by different authors. Across at least 10 cohorts, authors reported 31 unique clinical features corresponding to HPO terms; the most commonly reported feature was Fatigue (median 45.1%) and the least commonly reported was Nausea (median 3.9%), but the reported percentages varied widely between studies.
INTERPRETATION: Translating long COVID manifestations into computable HPO terms will improve analysis, data capture, and classification of long COVID patients. If researchers, clinicians, and patients share a common language, then studies can be compared/pooled more effectively. Furthermore, mapping lay terminology to HPO will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, thereby improving the stratification, diagnosis, and treatment of long COVID.
FUNDING: U24TR002306; UL1TR001439; P30AG024832; GBMF4552; R01HG010067; UL1TR002535; K23HL128909; UL1TR002389; K99GM145411
Generating Reliable and Responsive Observational Evidence: Reducing Pre-analysis Bias
A growing body of evidence generated from observational data has demonstrated the potential to influence decision-making and improve patient outcomes. For observational evidence to be actionable, however, it must be generated reliably and in a timely manner. Large distributed observational data networks enable research on diverse patient populations at scale and develop new sound methods to improve reproducibility and robustness of real-world evidence. Nevertheless, the problems of generalizability, portability and scalability persist and compound. As analytical methods only partially address bias, reliable observational research (especially in networks) must address the bias at the design stage (i.e., pre-analysis bias) including the strategies for identifying patients of interest and defining comparators.
This thesis synthesizes and enumerates a set of challenges to addressing pre-analysis bias in observational studies and presents mixed-methods approaches and informatics solutions for overcoming a number of those obstacles. We develop frameworks, methods and tools for scalable and reliable phenotyping including data source granularity estimation, comprehensive concept set selection, index date specification, and structured data-based patient review for phenotype evaluation. We cover the research on potential bias in the unexposed comparator definition including systematic background rates estimation and interpretation, and definition and evaluation of the unexposed comparator.
We propose that the use of standardized approaches and methods as described in this thesis not only improves reliability but also increases responsiveness of observational evidence. To test this hypothesis, we designed and piloted a Data Consult Service - a service that generates new on-demand evidence at the bedside. We demonstrate that it is feasible to generate reliable evidence to address clinicians’ information needs in a robust and timely fashion and provide our analysis of the current limitations and future steps needed to scale such a service
A Systematic Review of Natural Language Processing for Knowledge Management in Healthcare
Driven by the visions of Data Science, recent years have seen a paradigm shift in Natural Language Processing (NLP). NLP has set the milestone in text processing and proved to be the preferred choice for researchers in the healthcare domain. The objective of this paper is to identify the potential of NLP, especially how NLP is used to support the knowledge management process in the healthcare domain, making data a critical and trusted component in improving health outcomes. This paper provides a comprehensive survey of the state-of-the-art NLP research with a particular focus on how knowledge is created, captured, shared, and applied in the healthcare domain. Our findings identify, first, the NLP techniques that support the knowledge extraction and knowledge capture processes in healthcare. Second, we propose a conceptual model for the knowledge extraction process through NLP. Finally, we discuss a set of issues, challenges, and proposed future research areas.