2,372 research outputs found

    Doctor of Philosophy

    Electronic Health Records (EHRs) provide a wealth of information for secondary uses. Methods are developed to improve the usefulness of free-text query and text processing, and to demonstrate the advantages of these methods for clinical research, specifically cohort identification and enhancement. Cohort identification is a critical early step in clinical research. Problems may arise when too few patients are identified, or when the cohort is a nonrepresentative sample. Methods of improving query formation through query expansion are described. Inclusion of free-text search in addition to structured data search is investigated to determine the incremental improvement of adding unstructured text search over structured data search alone. Query expansion using topic- and synonym-based expansion improved information retrieval performance; an ensemble method was not successful. The addition of free-text search, compared to structured data search alone, increased cohort size in all cases, with dramatic increases in some. Representation of patients in subpopulations that may otherwise have been underrepresented is also shown. We demonstrate clinical impact by showing that a serious clinical condition, scleroderma renal crisis, can be predicted by adding free-text search. A novel information extraction algorithm, Regular Expression Discovery for Extraction (REDEx), is developed and evaluated for cohort enrichment. The REDEx algorithm is demonstrated to accurately extract information from free-text clinical narratives. Temporal expressions as well as bodyweight-related measures are extracted. Using these extracted values, additional patients and additional measurement occurrences are identified that were not identifiable through structured data alone. The REDEx algorithm transfers the burden of machine learning training from annotators to domain experts. 
We developed automated query expansion methods that greatly improve the performance of keyword-based information retrieval. We also developed NLP methods for unstructured data and demonstrated that cohort size can be greatly increased, a much more complete population can be identified, and important clinical conditions can be detected that are otherwise often missed. Finally, we developed a novel machine learning algorithm for information extraction, REDEx, that efficiently extracts clinical values from unstructured clinical text, adding information and observations beyond what is available in structured data alone.
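REDEx learns regular expressions from annotated snippets rather than requiring them to be hand-written. As a minimal sketch of the kind of pattern such an algorithm might produce for bodyweight-related measures, the regex, note text, and function below are hand-written illustrations, not the published algorithm's output:

```python
import re

# Hand-written stand-in for a learned extraction pattern: matches mentions
# like "Weight: 82.5 kg" or "wt 85 kg" and captures the value and unit.
WEIGHT_PATTERN = re.compile(
    r"(?:weight|wt)\s*[:=]?\s*(\d{2,3}(?:\.\d+)?)\s*(kg|lbs?)",
    re.IGNORECASE,
)

def extract_weights(note: str) -> list:
    """Return (value, unit) pairs for weight mentions in a clinical note."""
    return [(float(v), u.lower()) for v, u in WEIGHT_PATTERN.findall(note)]

note = "Pt seen today. Weight: 82.5 kg, down from wt 85 kg last visit."
print(extract_weights(note))  # [(82.5, 'kg'), (85.0, 'kg')]
```

Values extracted this way can then be compared against structured fields to find measurement occurrences that structured data alone would miss.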

    Natural language processing (NLP) for clinical information extraction and healthcare research

    Introduction: Epilepsy is a common disease with multiple comorbidities. Routinely collected health care data have been successfully used in epilepsy research, but they lack the level of detail needed for in-depth study of complex interactions between the aetiology, comorbidities, and treatment that affect patient outcomes. The aim of this work is to use natural language processing (NLP) technology to create detailed disease-specific datasets derived from the free text of clinic letters in order to enrich the information that is already available. Method: An NLP pipeline for the extraction of epilepsy clinical text (ExECT) was redeveloped to extract a wider range of variables. A gold standard annotation set for epilepsy clinic letters was created for the validation of the ExECT v2 output. A set of clinic letters from the Epi25 study was processed and the datasets produced were validated against Swansea Neurology Biobank records. A data linkage study investigating genetic influences on epilepsy outcomes using GP and hospital records was supplemented with the seizure frequency dataset produced by ExECT v2. Results: The validation of ExECT v2 produced overall precision, recall, and F1 score of 0.90, 0.86, and 0.88, respectively. A method of uploading, annotating, and linking genetic variant datasets within the SAIL databank was established. No significant differences in the genetic burden of rare and potentially damaging variants were observed between the individuals with vs without unscheduled admissions, and between individuals on monotherapy vs polytherapy. No significant difference was observed in the genetic burden between people who were seizure free for over a year and those who experienced at least one seizure a year. Conclusion: This work presents successful extraction of epilepsy clinical information and explores how this information can be used in epilepsy research. 
The approach taken in the development of ExECT v2, and the research linking the NLP outputs, routinely collected health care data, and genetics, pave the way for wider research.
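The precision, recall, and F1 figures reported for ExECT v2 are standard set-based measures against a gold standard annotation set. A minimal sketch of their computation (the annotation identifiers below are invented for illustration):

```python
def prf1(gold: set, predicted: set):
    """Precision, recall, and F1 of predicted annotations vs a gold standard."""
    tp = len(gold & predicted)  # true positives: annotations found in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Toy example: 10 gold annotations, 9 recovered plus 1 spurious prediction.
gold = {f"ann{i}" for i in range(10)}
predicted = {f"ann{i}" for i in range(1, 10)} | {"spurious"}
print(prf1(gold, predicted))  # (0.9, 0.9, 0.9)
```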

    Unraveling ethnic disparities in antipsychotic prescribing among patients with psychosis: A retrospective cohort study based on electronic clinical records

    BACKGROUND: Previous studies have shown mixed evidence on ethnic disparities in antipsychotic prescribing among patients with psychosis in the UK, partly due to small sample sizes. This study aimed to examine the current state of antipsychotic prescription with respect to patient ethnicity among the entire population known to a large UK mental health trust with non-affective psychosis, adjusting for multiple potential risk factors. METHODS: This retrospective cohort study included all patients (N = 19,291) who were aged 18 years or over at their first diagnosis of non-affective psychosis (identified with the ICD-10 codes F20-F29) recorded in electronic health records (EHRs) at the South London and Maudsley NHS Trust until March 2021. The most recently recorded antipsychotic treatments and patient attributes were extracted from EHRs, including both structured fields and free-text fields processed using natural language processing applications. Multivariable logistic regression models were used to calculate odds ratios (OR) for antipsychotic prescription according to patient ethnicity, adjusted for multiple potential contributing factors, including demographic factors (age and gender), clinical factors (diagnoses, duration of illness, service use and history of cannabis use), socioeconomic factors (level of deprivation and own-group ethnic density in the area of residence) and temporal changes in clinical guidelines (date of prescription). RESULTS: The cohort consisted of 43.10 % White, 8.31 % Asian, 40.80 % Black, 2.64 % Mixed, and 5.14 % Other-ethnicity patients. Among them, 92.62 % had recorded antipsychotic receipt, of whom 24.05 % received depot antipsychotics and 81.72 % second-generation antipsychotic (SGA) medications. Most ethnic minority groups were not significantly different from White patients in receiving any antipsychotic. 
Among those receiving antipsychotic prescriptions, Black patients were more likely than White patients to be prescribed depot antipsychotics (adjusted OR 1.29, 95 % confidence interval (CI) 1.14-1.47), but less likely to receive SGA (adjusted OR 0.85, 95 % CI 0.74-0.97), olanzapine (OR 0.82, 95 % CI 0.73-0.92) and clozapine (adjusted OR 0.71, 95 % CI 0.60-0.85). All ethnic minority groups were less likely to be prescribed olanzapine than the White group. CONCLUSIONS: Black patients with psychosis had a distinct pattern of antipsychotic prescription, with less use of SGA, including olanzapine and clozapine, but more use of depot antipsychotics, even when adjusting for the effects of multiple demographic, clinical and socioeconomic factors. Further research is required to understand the sources of these ethnic disparities and eliminate care inequalities.
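The study reports adjusted odds ratios from multivariable logistic regression; as a simpler illustration of what an odds ratio and its confidence interval mean, the sketch below computes a crude (unadjusted) OR with a Woolf-type CI from a 2x2 table. The cell counts are invented, not the study's data:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and 95% CI from a 2x2 table.
    a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without."""
    or_ = (a * d) / (b * c)
    # Woolf standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: 20/100 in one group vs 10/100 in the other.
print(odds_ratio_ci(20, 80, 10, 90))
```

Adjusted ORs additionally condition on covariates (age, gender, deprivation, etc.), which a 2x2 table cannot do; that is why the multivariable model is needed in the study.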

    J Biomed Inform

    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP.

    Extracting and Structuring Drug Information to Improve e-Prescription and Streamline Medical Treatment

    Currently, physicians use the patient electronic health record (EHR) to support their practice. The Romanian healthcare system switched to electronic prescribing in 2012. Physicians use the electronic medical record and health card to access patient data whenever available. To improve care, we propose a tool that supports the prescription process by structuring and extracting important information from drug characteristics leaflets (prospectuses). The application processes data extracted from around 3,000 drug leaflets using several Romanian-language Web sources. The leaflet data is structured into sections: therapeutic action, contraindications, mode of administration, adverse reactions, etc. A stemming algorithm is applied to each section, extracting the root of each word for easier search. The result is a text in an *.xml file. After the structuring step, the application searches the structured file for the information needed to prescribe medication matched as closely as possible to the patient's state. The application suggests all drugs that match the patient's disease and are neither contraindicated nor in conflict with the patient's other diseases, treatments, or allergies, and the physician may select the best option for the given situation.
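A minimal sketch of the section-structuring and stemming pipeline described above. The section names follow the abstract, but the crude suffix-stripping "stemmer" and the sample text are illustrative stand-ins, not the Romanian stemming algorithm the tool actually uses:

```python
import xml.etree.ElementTree as ET

def toy_stem(word: str) -> str:
    """Crude suffix stripper, standing in for a real stemming algorithm."""
    for suffix in ("ation", "ions", "ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def leaflet_to_xml(drug: str, sections: dict) -> str:
    """Write stemmed leaflet sections to an XML string, one element per section."""
    root = ET.Element("drug", name=drug)
    for title, text in sections.items():
        sec = ET.SubElement(root, "section", title=title)
        sec.text = " ".join(toy_stem(w) for w in text.lower().split())
    return ET.tostring(root, encoding="unicode")

xml = leaflet_to_xml("aspirin", {"contraindications": "bleeding disorders"})
print(xml)
```

Searching the stemmed text lets a query like "bleed" hit leaflets mentioning "bleeding", which is the point of stemming each section before indexing.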

    Doctor of Philosophy

    Health information technology (HIT) in conjunction with quality improvement (QI) methodologies can promote higher-quality care at lower cost. Unfortunately, most inpatient hospital settings have been slow to adopt HIT and QI methodologies. Successful adoption requires close attention to workflow: the sequence of tasks and processes, and the set of people or resources needed for those tasks, that are necessary to accomplish a given goal. Assessing the impact on workflow is an important component of determining whether a HIT implementation will be successful, but little research has been conducted on the impact of eMeasure (electronic performance measure) implementation on workflow. One solution to implementation challenges such as the lack of attention to workflow is an implementation toolkit: an assembly of instruments such as checklists, forms, and planning documents. We developed an initial eMeasure Implementation Toolkit for the heart failure (HF) eMeasure to allow QI and information technology (IT) professionals and their teams to assess the impact of implementation on workflow. During the development phase of the toolkit, we undertook a literature review to determine the components of the toolkit. We conducted stakeholder interviews with HIT and QI key informants and subject matter experts (SMEs) at the US Department of Veterans Affairs (VA). Key informants provided a broad understanding of the context of workflow during eMeasure implementation. Using snowball sampling, we also interviewed other SMEs recommended by the key informants, who suggested tools and provided information essential to the toolkit development. The second phase involved evaluation of the toolkit for relevance and clarity by experts in non-VA settings. The experts evaluated the sections of the toolkit that contained the tools via a survey. 
The final toolkit provides a distinct set of resources and tools, iteratively developed during the research and available to users in a single source document. The research methodology provided a strong, unified overarching implementation framework in the form of the Promoting Action on Research Implementation in Health Services (PARIHS) model, in combination with a sociotechnical model of HIT that strengthened the overall design of the study.

    Practical data collection and extraction for big data applications in radiotherapy

    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/146459/1/mp12817.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/146459/2/mp12817_am.pd

    Feasibility study of hospital antimicrobial stewardship analytics using electronic health records

    Background: Hospital antimicrobial stewardship (AMS) programmes are multidisciplinary initiatives to optimise the use of antimicrobials. Most hospitals depend on time-consuming manual audits to monitor clinicians' prescribing, yet much of the information needed could be sourced from electronic health records (EHRs). Objectives: To develop an informatics methodology for analysing characteristics of hospital AMS practice using routine electronic prescribing and laboratory records. Methods: Feasibility study using electronic prescribing, laboratory and clinical coding records from adult patients admitted to six specialties at Queen Elizabeth Hospital, Birmingham, UK (September 2017–August 2018). The study involved: (1) a review of antimicrobial stewardship standards of care; (2) their translation into concepts measurable from commonly available EHRs; (3) pilot application in an EHR cohort study (n=61,679 admissions). Results: We developed data modelling methods to characterise the use of antimicrobials (antimicrobial therapy episode linkage methods, therapy table, therapy changes). Prescriptions were linked into antimicrobial therapy episodes (mean 2.4 prescriptions/episode; mean length of therapy 5.8 days), enabling production of several actionable findings. For example, only 22% of therapy episodes for low-severity community-acquired pneumonia were congruent with prescribing guidelines, with a tendency toward broader-spectrum antibiotics. Analysis of therapy changes revealed a delay in switching from intravenous to oral therapy of an average of 3.6 days [95% CI: 3.4; 3.7]. Microbial cultures were performed before treatment initiation in just 22% of antibacterial prescriptions. The proposed methods enabled fine-grained monitoring of AMS practice down to the level of specialties, wards, and individual clinical teams by case mix, enabling more meaningful peer comparison. 
Conclusions: It is feasible to use hospital EHRs to construct rapid, meaningful measures of prescribing quality with potential to support quality improvement interventions (audit/feedback to prescribers), engagement with front-line clinicians on optimising prescribing, and AMS impact evaluation studies.
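The core of the episode-linkage method is chaining consecutive antimicrobial prescriptions into one therapy episode when the gap between them is small. A minimal sketch of that idea; the 2-day gap threshold and sample dates are assumptions for illustration, not the study's actual linkage rule:

```python
from datetime import date, timedelta

def link_episodes(prescriptions, max_gap=timedelta(days=2)):
    """Group (start_date, end_date) prescriptions into therapy episodes.
    A prescription joins the current episode if it starts within max_gap
    of the previous prescription's end; otherwise it opens a new episode."""
    episodes = []
    for rx in sorted(prescriptions):
        if episodes and rx[0] - episodes[-1][-1][1] <= max_gap:
            episodes[-1].append(rx)
        else:
            episodes.append([rx])
    return episodes

rxs = [
    (date(2018, 1, 1), date(2018, 1, 3)),   # IV course
    (date(2018, 1, 4), date(2018, 1, 7)),   # oral switch next day -> same episode
    (date(2018, 2, 1), date(2018, 2, 5)),   # new infection -> new episode
]
print([len(ep) for ep in link_episodes(rxs)])  # [2, 1]
```

Once prescriptions are grouped this way, per-episode measures such as length of therapy and time to IV-to-oral switch fall out of simple date arithmetic on each group.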

    A Learning Health System for Radiation Oncology

    The proposed research aims to address the challenges faced by clinical data science researchers in radiation oncology in accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes. The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure. Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping. To improve the efficiency of retrieving information from large clinical datasets, a search engine based on ontology-based keyword searching and synonym-based term matching was developed. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes. 
Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented. The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated against clinical outcomes. Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing an LHS in other medical specialties, advancing personalized and data-driven medicine.
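The ontology-aware retrieval idea mentioned above, where a query for a parent class also returns records indexed under its descendant classes, can be sketched with a toy hierarchy. The ontology terms, index, and record IDs below are invented for illustration:

```python
# Toy ontology: each term maps to its direct child classes.
ONTOLOGY = {
    "neoplasm": ["lung neoplasm", "breast neoplasm"],
    "lung neoplasm": ["small cell lung carcinoma"],
}

def descendants(term, ontology):
    """Return the term plus all its transitive child classes."""
    result = {term}
    for child in ontology.get(term, []):
        result |= descendants(child, ontology)
    return result

def search(term, index, ontology):
    """index maps an ontology term to a set of patient record IDs.
    Searching a parent class retrieves records under all descendant classes too."""
    hits = set()
    for t in descendants(term, ontology):
        hits |= index.get(t, set())
    return hits

index = {"lung neoplasm": {1}, "small cell lung carcinoma": {2}, "breast neoplasm": {3}}
print(search("neoplasm", index, ONTOLOGY))  # {1, 2, 3}
```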

    How to translate therapeutic recommendations in clinical practice guidelines into rules for critiquing physician prescriptions? Methods and application to five guidelines

    <p>Abstract</p> <p>Background</p> <p>Clinical practice guidelines give recommendations about what to do in various medical situations, including therapeutic recommendations for drug prescription. An effective way to computerize these recommendations is to design critiquing decision support systems, <it>i.e</it>. systems that criticize the physician's prescription when it does not conform to the guidelines. These systems are commonly based on a list of "if conditions then criticism" rules. However, writing these rules from the guidelines is not a trivial task. The objective of this article is to propose methods that (1) simplify the implementation of guidelines' therapeutic recommendations in critiquing systems by automatically translating structured therapeutic recommendations into a list of "if conditions then criticize" rules, and (2) can generate an appropriate textual label to explain to the physician why his/her prescription is not recommended.</p> <p>Methods</p> <p>We worked on the therapeutic recommendations in five clinical practice guidelines concerning chronic diseases related to the management of cardiovascular risk. We evaluated the system using a test base of more than 2000 cases.</p> <p>Results</p> <p>Algorithms for automatically translating therapeutic recommendations into "if conditions then criticize" rules are presented. Eight generic recommendations are also proposed; they are guideline-independent, and can be used as default behaviour for handling various situations that are usually implicit in the guidelines, such as decreasing the dose of a poorly tolerated drug. Finally, we provide models and methods for generating a human-readable textual critique. The system was successfully evaluated on the test base.</p> <p>Conclusion</p> <p>We show that it is possible to criticize physicians' prescriptions starting from a structured clinical guideline, and to provide clear explanations. 
We are now planning a randomized clinical trial to evaluate the impact of the system on practices.</p>
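A minimal sketch of the "if conditions then criticize" rule shape described above, with the textual label attached so the physician sees why the prescription is not recommended. The condition, drug class, and critique text are hypothetical examples; the paper generates such rules automatically from structured guideline recommendations:

```python
def make_rule(condition, drug_class, critique_text):
    """Build a critiquing rule: returns the critique text when the patient
    matches the condition and the prescription is of the given drug class,
    otherwise None (no criticism)."""
    def rule(patient, prescription):
        if condition(patient) and prescription["class"] == drug_class:
            return critique_text
        return None
    return rule

# Hypothetical rule derived from a structured recommendation.
rule = make_rule(
    condition=lambda p: p.get("renal_failure", False),
    drug_class="NSAID",
    critique_text="NSAIDs are not recommended in patients with renal failure.",
)
print(rule({"renal_failure": True}, {"class": "NSAID"}))
```

Representing each rule as condition plus label is what lets the system both block a non-conforming prescription and explain the refusal in human-readable text.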