Statistical analysis and data mining of Medicare patients with diabetes.
The purpose of this dissertation is to find ways to decrease Medicare costs, to study the health outcomes of diabetes patients, and to investigate the influence of Medicare Part D since its introduction in 2006, using the CMS CCW (Chronic Condition Data Warehouse) data and the MEPS (Medical Expenditure Panel Survey) data. In this dissertation, we introduce pattern recognition analysis into the study of the medical and demographic characteristics of inpatients who have a higher readmission risk. We also broaden the cost-effectiveness analysis by including medical resource usage when investigating the effects of Medicare Part D. In addition, we apply several statistical linear models, such as the generalized linear model, and data mining techniques, such as the neural network model, to study the costs and outcomes of both inpatients and outpatients with diabetes in Medicare. Moreover, methods such as kernel density estimation and survival analysis are also employed. One important conclusion from these analyses is that diseases and procedures, rather than age, are the key factors in inpatients' mortality rate. Another important discovery is that, under the influence of Medicare Part D, insulin is the most efficient oral anti-diabetes drug treatment and that drug usage in 2006 is not as stable as in 2005. We also find that patients who are discharged to home or hospice are more likely to re-enter the hospital within 30 days of discharge. Two-way interaction effect analysis demonstrates that diabetes complications interact with each other, which makes healthcare costs and health outcomes differ between a case with one complication and a case with two complications. Accordingly, we propose some useful suggestions. For instance, to decrease Medicare payments for outpatients with diabetes, we suggest that patients monitor their blood glucose level often. We also recommend that inpatients with diabetes pay more attention to kidney disease and use prevention to avoid such diseases and so decrease costs.
Causal Pattern Mining in Highly Heterogeneous and Temporal EHRs Data
University of Minnesota Ph.D. dissertation. March 2017. Major: Computer Science. Advisor: Vipin Kumar. 1 computer file (PDF); ix, 112 pages. The World Health Organization (WHO) estimates that total healthcare spending in the U.S. was around 18% of its GDP in 2011. Even with such high per-capita expenditure, the quality of healthcare in the U.S. lags behind that of other industrialized countries. This inefficient state of the U.S. healthcare system is attributed to the current fee-for-service (FFS) model. Under the FFS model, healthcare providers (doctors, hospitals) receive payments for every hospital visit or service rendered. The lack of coordination between service providers and patient outcomes leads to an increase in the costs associated with healthcare management, as healthcare providers often recommend expensive treatments. Several pieces of legislation have been approved in the recent past to improve overall U.S. healthcare management while simultaneously reducing the associated costs. The HITECH Act proposes to spend close to $30 billion on creating a nationwide repository of Electronic Health Records (EHRs). Such a repository would consist of patient attributes such as demographics, laboratory test results, vital information and diagnosis codes. It is hoped that this EHR repository will be a platform to improve care coordination between service providers and patients' healthcare outcomes and to reduce health disparities, thereby improving the overall healthcare management system. Data collected and stored in the EHR (HITECH) and the need to improve care efficiency and outcome (ACT) would help to improve the current state of the U.S. healthcare system. Data mining techniques in conjunction with EHRs can be used to develop novel clinical decision-making tools, to analyze the prevalence and incidence of diseases, and to evaluate the efficacy of existing clinical and surgical interventions.
In this thesis we focus on two key aspects of EHR data: temporality and causation. The first matters because the temporal nature of EHR data has not been fully exploited, and increasing amounts of clinical evidence suggest that it is important for the development of clinical decision-making tools and techniques. The second matters because several research articles hint at the presence of antiquated clinical guidelines that are still in practice. In this dissertation, we first describe EHRs along with the following terms: temporality, causation and heterogeneity. Building on this, we describe methodologies for extracting non-causal patterns in the absence of longitudinal data. Further, we describe methods to extract non-causal patterns in the presence of longitudinal data. We describe these methodologies in the context of Type-2 Diabetes Mellitus (T2DM). Furthermore, we describe techniques to extract simple and complex causal patterns from longitudinal data in the context of sepsis and T2DM. Finally, we conclude the dissertation with a summary of our work and future directions.
A framework for trend mining with application to medical data
This thesis presents research work conducted in the field of knowledge discovery. It presents an integrated trend-mining framework and SOMA, which is the application of the trend-mining framework to diabetic retinopathy data. Trend mining is the process of identifying and analysing trends in the variation of support of the association/classification rules that have been extracted from longitudinal datasets. The integrated framework covers all major processes from data preparation to the extraction of knowledge. At the pre-processing stage, data are cleaned, transformed if necessary, and sorted into time-stamped datasets using logic rules. At the next stage, the time-stamped datasets are passed through the main processing, in which the association rule mining (ARM) matrix algorithm is applied to identify frequent rules with acceptable confidence. Mathematical conditions are applied to classify the sequences of support values into trends. Afterwards, interestingness criteria are applied to obtain interesting knowledge, and a visualization technique is proposed that maps how objects move from one time stamp to the next.
A validation and verification (external and internal validation) framework is described that aims to ensure that the results at the intermediate stages of the framework are correct and that the framework as a whole can yield results that demonstrate causality.
To evaluate the thesis, SOMA was developed.
The dataset is, in itself, also of interest, as it is very noisy (in common with other similar medical datasets) and does not feature a clear association between specific time stamps and subsets of the data. The Royal Liverpool University Hospital has been a major centre for retinopathy research since 1991. Retinopathy is a generic term used to describe damage to the retina of the eye, which can, in the long term, lead to visual loss. Diabetic retinopathy is used to evaluate the framework, to determine whether SOMA can extract knowledge that is already known to the medics. The results show that those datasets can be used to extract knowledge that can show causality between patients’ characteristics such as the age of patient at diagnosis, type of diabetes, duration of diabetes, and diabetic retinopathy
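The trend-classification step described in this abstract (support values of a rule tracked across time-stamped datasets, then labelled as a trend) can be illustrated with a minimal Python sketch. This is not the SOMA implementation: the function names, the tolerance `tol`, the four trend labels and the toy records are all assumptions made for illustration, and the abstract does not state which mathematical conditions the thesis actually uses.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: Sequence[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def classify_trend(supports: Sequence[float], tol: float = 0.01) -> str:
    """Label a sequence of support values (hypothetical rule, not from the thesis)."""
    # Differences between consecutive time stamps.
    diffs = [b - a for a, b in zip(supports, supports[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "constant"
    if all(d >= -tol for d in diffs):
        return "increasing"
    if all(d <= tol for d in diffs):
        return "decreasing"
    return "fluctuating"


# Toy, made-up records for three time stamps (not real patient data).
itemset = frozenset({"type2", "retinopathy"})
t1 = [frozenset({"type2"}), frozenset({"type2", "retinopathy"}),
      frozenset({"type1"}), frozenset({"type2"})]
t2 = [frozenset({"type2", "retinopathy"}), frozenset({"type2", "retinopathy"}),
      frozenset({"type1"}), frozenset({"type2"})]
t3 = [frozenset({"type2", "retinopathy"})] * 3 + [frozenset({"type1"})]

supports = [support(itemset, t) for t in (t1, t2, t3)]
trend = classify_trend(supports)
```

With these toy records the support of the itemset rises from one time stamp to the next, so the sketch labels the sequence as increasing. A real framework would also apply confidence and interestingness filters, which are omitted here.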
Hybridization Of Optimized Support Vector Machine And Artificial Neural Network For The Diabetic Retinopathy Classification Problem
Diabetic Retinopathy (DR) is one of the most threatening diseases causing blindness in
diabetic patients. With the increasing number of DR cases, diabetic eye screening has
become a challenging task for ophthalmologists, who need to diagnose a large number of
retinal images every day. Screening and early detection of DR play a vital role in reducing
the incidence of visual morbidity and vision loss. In most countries the screening task is
done manually, using a qualitative scale to detect abnormalities on the retina. Although
this approach is useful, the detection is not accurate. Previous researchers have made a
few attempts to propose automatic DR classification; however, it needs to be improved,
especially in terms of accuracy. A body of literature has shown that DR classification can
be performed using clinical features obtained from blood tests, such as glycated
haemoglobin, triglyceride, creatine and glucose values. Although this subject has been
studied previously, it remains the subject of ongoing research. Hence, this research aims
to obtain optimal or near-optimal performance in diabetic classification using supervised
machine learning. Many algorithms are available for classification, such as k-Nearest
Neighbour, k-Means, Support Vector Machine, Decision Tree, Artificial Neural Network and
Linear Discriminant Analysis. Because they have been applied successfully to many
classification problems, the k-Nearest Neighbour, Artificial Neural Network and Support
Vector Machine algorithms are used in this research.
Machine learning of structured and unstructured healthcare data
The widespread adoption of Electronic Health Records (EHR) systems in healthcare institutions in the United States makes machine learning based on large-scale, real-world clinical data feasible and affordable. Machine learning of healthcare data, or healthcare data analytics, has achieved numerous successes in various applications. However, there are still many challenges for machine learning of both structured and unstructured healthcare data. Longitudinal structured clinical data (e.g., lab test results, diagnoses, and medications) have an enormous variety of categories, are collected at irregularly spaced visits, and are sparsely distributed. Studies on analyzing longitudinal structured EHR data for tasks such as disease prediction and visualization are still limited. For unstructured clinical notes, existing studies mostly focus on disease prediction or cohort selection. Studies on mining clinical notes with the direct purpose of reducing costs for healthcare providers or institutions are limited. To fill these gaps, this dissertation has three research topics. The first topic is about developing state-of-the-art predictive models to detect diabetic retinopathy using longitudinal structured EHR data. Major deep-learning-based temporal models for disease prediction are studied, implemented, and evaluated. Experimental results on a large-scale dataset show that temporal deep learning models outperform non-temporal random forest models in terms of AUPRC and recall. The second topic is about clustering temporal disease networks to visualize comorbidity progression. We propose a clustering technique to outline comorbidity progression phases as well as a new disease clustering method to simplify the visualization. Two case studies on Clostridioides difficile and stroke show the methods are effective. The third topic is clinical information extraction for medical billing. We propose a framework that consists of two methods, one rule-based and one deep-learning-based, to extract patient history information directly from clinical notes to facilitate Evaluation and Management Services (E/M) billing. Initial results of the two prototype systems on an annotated dataset are promising and point us toward potential improvements.
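The first topic compares models by AUPRC and recall. As a reminder of what those two metrics measure, here is a minimal pure-Python sketch. It uses the step-wise "average precision" form of the area under the precision-recall curve, which is one common way to compute AUPRC and not necessarily the one used in the dissertation. The function names, the labels and the scores are made up for illustration, and tied scores are not treated specially.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the rank where this positive is retrieved.
            total += true_positives / rank
    return total / n_pos


def recall_at(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Share of true positives whose score reaches the decision threshold."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    return hits / n_pos


# Hypothetical labels (1 = retinopathy) and model scores, for illustration only.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]

auprc = average_precision(y_true, scores)
recall = recall_at(y_true, scores, threshold=0.75)
```

AUPRC summarises ranking quality across all thresholds and is often preferred over ROC-based metrics when positives are rare, while recall is tied to one chosen threshold, so the two are usually reported together.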
Automatic Screening and Classification of Diabetic Retinopathy Eye Fundus Image
Diabetic Retinopathy (DR) is a disorder of the retinal vasculature. It develops to some degree in nearly all patients with long-standing diabetes mellitus and can result in blindness. Screening of DR is essential for both early detection and early treatment. This thesis aims to investigate automatic methods for diabetic retinopathy detection and subsequently develop an effective system for the detection and screening of diabetic retinopathy.
The presented diabetic retinopathy research involves three development stages. Firstly, the thesis presents the development of a preliminary classification and screening system for diabetic retinopathy using eye fundus images. The research then focuses on the detection of the earliest signs of diabetic retinopathy, which are microaneurysms. The detection of microaneurysms at an early stage is vital and is the first step in preventing diabetic retinopathy. Finally, the thesis presents decision support systems for the detection of diabetic retinopathy and maculopathy in eye fundus images. The detection of maculopathy, which appears as yellow lesions near the macula, is essential as it will eventually cause loss of vision if the affected macula is not treated in time.
Accurate retinal screening is therefore required to help retinal screeners classify retinal images effectively, and highly efficient and accurate image processing techniques must be used to achieve it. In addition to the proposed diabetic retinopathy detection systems, this thesis presents a new dataset and highlights the dataset collection, the expert diagnosis process and the advantages of the new dataset compared to other publicly available eye fundus image datasets. The new dataset will be useful to researchers and practitioners working in retinal imaging and should encourage comparative studies in the field of diabetic retinopathy research. It is envisaged that the proposed decision support system for clinical screening will contribute to and assist the management and detection of diabetic retinopathy. It is also hoped that the developed automatic detection techniques will help clinicians diagnose diabetic retinopathy at an early stage.
Non-communicable Diseases, Big Data and Artificial Intelligence
This reprint includes 15 articles in the field of non-communicable diseases, big data, and artificial intelligence, giving an overview of the most recent advances in AI and their application potential in 3P medicine.
Evaluation of strategies for reducing the burden of COPD in the UK using Bayesian methods
Chronic obstructive pulmonary disease (COPD) is responsible for 5.3% of all deaths and 1.7% of all hospital admissions in the UK. This thesis focuses on strategies to reduce COPD burden by targeting three aspects across the public healthcare system: prevention, emergency treatment, and long-term management. Analyses were performed in a Bayesian framework to exploit its flexibility in modelling uncertainty and the incorporation of prior knowledge.
First, I assessed whether communication of personalised disease risk in primary care is an effective smoking cessation intervention, using cost-effectiveness and value of information analyses based on various data sources across the literature. The odds ratio for the effectiveness of communication of personalised disease risk was 1.48 (95%CrI:0.91-2.26). While I found a probability of cost-effectiveness of about 90%, further research up to a maximum of £27 million is justified to reduce the uncertainty around this estimate.
Secondly, I assessed whether case ascertainment affects the detection of poorly performing hospital trusts in the treatment of acute exacerbation of COPD (AECOPD) in secondary care, using data from the National Asthma and COPD Audit Programme. Case ascertainment was associated with 30-day mortality (OR:1.74; 1.25-2.41) and adjusting for it impacted the findings, with 5 trusts becoming outliers and 2 trusts no longer classified as outliers.
Finally, using general practice data from Clinical Practice Research Datalink, I assessed whether new guidelines suggesting triple therapy (long-acting beta-2 agonists, LABA + long-acting muscarinic antagonists, LAMA + inhaled corticosteroids, ICS) for the treatment of those with poorly-controlled COPD on LABA+LAMA dual therapy improves disease outcomes. Triple therapy was not associated with severe AECOPD (IRR:1.00; 0.93-1.07) or mortality (IRR:0.95; 0.86-1.06), but was associated with increased risk of pneumonia (IRR:1.19; 1.05-1.35).
This thesis applied sophisticated Bayesian methods to increase understanding of how COPD burden could be reduced in different areas of the public healthcare system.
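The "probability of cost-effectiveness" reported for the smoking cessation analysis is commonly computed as the share of posterior draws in which the intervention has a positive incremental net monetary benefit. The minimal Python sketch below shows that calculation under stated assumptions. The function name, the five toy draws and the willingness-to-pay value of 20,000 per QALY are hypothetical and are not taken from the thesis, which does not describe its computation in this abstract.

```python
from typing import Sequence


def prob_cost_effective(
    delta_effects: Sequence[float],
    delta_costs: Sequence[float],
    willingness_to_pay: float,
) -> float:
    """Share of posterior draws with positive incremental net monetary benefit."""
    if len(delta_effects) != len(delta_costs) or not delta_effects:
        raise ValueError("need equally sized, non-empty sequences of draws")
    # INMB for each draw: value of the health gain minus the extra cost.
    inmb = [
        willingness_to_pay * effect - cost
        for effect, cost in zip(delta_effects, delta_costs)
    ]
    return sum(1 for value in inmb if value > 0) / len(inmb)


# Hypothetical posterior draws (incremental QALYs and incremental costs).
delta_qalys = [0.02, 0.05, -0.01, 0.03, 0.04]
delta_costs = [300.0, 500.0, 200.0, 900.0, 100.0]

probability = prob_cost_effective(delta_qalys, delta_costs, willingness_to_pay=20_000.0)
```

In practice the draws would come from the fitted Bayesian model (thousands of them, not five), and the calculation is usually repeated over a range of willingness-to-pay values to produce a cost-effectiveness acceptability curve.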