
    Developing Artificial Intelligence tools to investigate the phenotypes and correlates of Chronic Kidney Disease patients in West Virginia

    Marzieh Amiri Shahbazi. Chronic kidney disease (CKD) disrupts the lives of 37 million people in the USA alone, which is about 1 in 7 adults. CKD results in a gradual loss of kidney function over time. Sometimes CKD doesn’t produce any significant symptoms until it reaches an advanced stage. Acute kidney injury (AKI), on the other hand, is a sudden decline in kidney function. As a result, the kidneys fail to filter waste materials from the blood and cause an increase in blood pressure. High blood pressure can cause heart disease and, in the long term, induce CKD. The literature to date indicates that AKI leads to long-term adverse kidney outcomes and is linked to CKD. AKI diagnosis, its severity, treatment, and recovery process have a major impact on the likelihood of a future diagnosis of CKD. This research attempts to understand the patient’s trajectory toward developing CKD after an AKI diagnosis, to identify the key triggers contributing to this trajectory, and ultimately to develop an artificial intelligence-based prognosis tool. To comprehend the role of AKI and previous hospitalization in the progress of CKD, three cohorts of CKD patients are created: i) AKI after hospitalization before CKD, ii) random AKI before CKD, and iii) no AKI before CKD. Prior comorbidities, medications, lab results, and pertinent procedures are considered, and for each cohort of patients, the most prevalent phenotypes are identified. The patient cohorts required for this analysis are generated from CKD patients residing in West Virginia. The data is provided by TriNetX, a global network platform. K-means clustering and latent class analysis (LCA) are used to identify and group the phenotypes of CKD for each cohort. The high-risk patient groups generated by the clustering algorithms are compared with each other.
These results will help clinicians to understand the risk factors of CKD and the overall trajectory of its development. This research suggests that a single method of care does not work for all patients, since phenotypes vary across distinct groups of patients. Categorizing patients into distinct groups allows different resources and care strategies to be allocated to each group. From this research, it is evident that patients’ risk profiles change over the years before developing CKD. There are also similarities as well as differences across the cohorts for each year, which suggests that CKD risk factors may be linked to prior AKI, hospitalization, or inpatient care.
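
The three cohort definitions in the abstract above can be illustrated with a small sketch. The abstract does not say how "AKI after hospitalization" was operationalised, so the 30-day window, the function name, and the cohort labels below are all assumptions made for illustration only, not the study's actual rule.

```python
from datetime import date, timedelta

# ASSUMPTION: an AKI counts as "after hospitalization" when it is diagnosed
# within this many days of a hospital admission. The abstract gives no rule.
POST_HOSPITALIZATION_WINDOW = timedelta(days=30)


def assign_cohort(ckd_date, aki_dates, admission_dates):
    """Place one CKD patient into one of the three cohorts (hypothetical labels)."""
    # Only AKI episodes diagnosed before the CKD diagnosis are relevant.
    prior_aki = sorted(d for d in aki_dates if d < ckd_date)
    if not prior_aki:
        return "no_aki_before_ckd"
    for aki in prior_aki:
        for admission in admission_dates:
            if admission <= aki <= admission + POST_HOSPITALIZATION_WINDOW:
                return "aki_after_hospitalization_before_ckd"
    # AKI occurred before CKD but not close to any hospital admission.
    return "random_aki_before_ckd"


# Made-up example patient: admitted on 1 March, AKI on 10 March, CKD a year later.
example = assign_cohort(
    ckd_date=date(2021, 3, 1),
    aki_dates=[date(2020, 3, 10)],
    admission_dates=[date(2020, 3, 1)],
)
print(example)
```

A rule like this would run once per patient before any clustering, so that phenotypes are identified separately within each cohort.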

    On the Road to Accurate Biomarkers for Cardiometabolic Diseases by Integrating Precision and Gender Medicine Approaches

    The need to facilitate the complex management of cardiometabolic diseases (CMD) has led to the detection of many biomarkers; however, there are no clear explanations of their role in the prevention, diagnosis or prognosis of these diseases. Molecules associated with disease pathways represent valid disease surrogates and well-fitted CMD biomarkers. To address this challenge, data from multiple omics types (genomics, epigenomics, transcriptomics, proteomics, metabolomics, microbiomics, and nutrigenomics), from human and animal models, have become available. However, individual omics types only provide data on a small part of the molecules involved in the complex CMD mechanisms, whereas, here, we propose that their integration leads to multidimensional data. Such data provide a better understanding of molecules related to CMD mechanisms and, consequently, increase the possibility of identifying well-fitted biomarkers. In addition, the application of gender medicine also helps to identify accurate biomarkers according to gender, facilitating differential CMD management. Accordingly, the impact of gender differences in CMD pathophysiology has been widely demonstrated, where gender refers to the complex interrelation and integration of sex (as a biological and functional marker of the human body) and psychological and cultural behavior (due to ethnic, social, and religious background). In this review, all these aspects are described and discussed, as well as potential limitations and future directions in this incipient field.

    An Optimisation-Driven Prediction Method for Automated Diagnosis and Prognosis

    This article presents a novel hybrid classification paradigm for medical diagnoses and prognoses prediction. The core mechanism of the proposed method relies on a centroid classification algorithm whose logic is exploited to formulate the classification task as a real-valued optimisation problem. A novel metaheuristic combining the algorithmic structure of Swarm Intelligence optimisers with the probabilistic search models of Estimation of Distribution Algorithms is designed to optimise such a problem, thus leading to high-accuracy predictions. This method is tested over 11 medical datasets and compared against 14 cherry-picked classification algorithms. Results show that the proposed approach is competitive and superior to the state-of-the-art on several occasions.
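
One way to read "formulating classification as a real-valued optimisation problem" is sketched below: the decision vector holds one centroid per class, and the objective is the training error of a nearest-centroid rule. This is an assumed reading, not the article's formulation. The optimiser is a plain random search, swapped in for the article's Swarm Intelligence and Estimation of Distribution hybrid, which the abstract does not describe in enough detail to reproduce. All names and the toy data are hypothetical.

```python
import random


def decode(vector, labels, dim):
    """Split a flat real-valued vector into one centroid per class label."""
    return {lab: vector[i * dim:(i + 1) * dim] for i, lab in enumerate(labels)}


def predict(centroids, x):
    """Return the label whose centroid is nearest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], x)),
    )


def error_rate(vector, X, y, labels):
    """Objective to minimise: fraction of samples the centroids misclassify."""
    centroids = decode(vector, labels, len(X[0]))
    wrong = sum(predict(centroids, x) != target for x, target in zip(X, y))
    return wrong / len(X)


def random_search(X, y, labels, steps=200, seed=0):
    """Stand-in optimiser: perturb the best vector and keep non-worse candidates."""
    rng = random.Random(seed)
    best = [rng.uniform(0.0, 1.0) for _ in range(len(X[0]) * len(labels))]
    best_err = error_rate(best, X, y, labels)
    for _ in range(steps):
        candidate = [v + rng.gauss(0.0, 0.3) for v in best]
        err = error_rate(candidate, X, y, labels)
        if err <= best_err:
            best, best_err = candidate, err
    return best, best_err


# Made-up two-class toy data with two features per sample.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]]
y = ["neg", "neg", "pos", "pos"]
labels = ["neg", "pos"]

best_vector, best_error = random_search(X, y, labels)
print(best_error)
```

Because the objective only needs to be evaluated, not differentiated, any derivative-free optimiser can be plugged in where `random_search` sits.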

    Combinatorial k-means clustering as a machine learning tool applied to diabetes mellitus type 2

    A new procedure based on k-means clustering is designed to find the clinical variables that most efficiently separate into groups similar patients diagnosed with diabetes mellitus type 2 (DMT2) and underlying diseases (arterial hypertonia (AH), ischemic heart disease (CHD), diabetic polyneuropathy (DPNP), and diabetic microangiopathy (DMA)). Clustering is a machine learning tool for discovering structures in datasets. Clustering has been proven to be efficient for pattern recognition based on clinical records. The considered combinatorial k-means procedure explores all possible k-means clusterings with a determined number of descriptors and groups. The predetermined conditions for the partitioning were as follows: every single group of patients included patients with DMT2 and one of the underlying diseases; each subgroup formed in such a way was subject to partitioning into three patterns (good health status, medium health status, and degenerated health status); and optimal descriptors were sought for each disease and group. The best clustering is selected through a parameter called global variance, defined as the sum of all variance values of all clinical variables of all the clusters. The best clinical parameters are found by minimizing this global variance. This methodology has to identify a set of variables that are assumed to separate each underlying disease efficiently into three different subgroups of patients. The hierarchical clustering obtained for these four underlying diseases could be used to build groups of patients with correlated clinical data. The proposed methodology gives surmised results from complex data based on a relationship with the health status of the group and draws a picture of the prediction rate of the ongoing health status.
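
The selection rule described above (try every subset of descriptors, cluster with k-means, keep the subset with the smallest global variance) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the seeding scheme, the function names, and the toy data are hypothetical, the variables are assumed to be on comparable scales, and two clusters are used instead of the three health-status patterns to keep the example small.

```python
from itertools import combinations


def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean_point(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels


def global_variance(points, labels, k):
    """Sum of the variances of every variable within every cluster."""
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            continue
        center = mean_point(members)
        for d in range(len(center)):
            total += sum((p[d] - center[d]) ** 2 for p in members) / len(members)
    return total


def best_descriptors(rows, n_desc, k):
    """Try every subset of n_desc variables; keep the lowest global variance."""
    best = None
    for cols in combinations(range(len(rows[0])), n_desc):
        points = [tuple(row[c] for c in cols) for row in rows]
        labels = kmeans(points, k)
        gv = global_variance(points, labels, k)
        if best is None or gv < best[0]:
            best = (gv, cols, labels)
    return best


# Made-up data: variables 0 and 1 separate two groups, variable 2 is noise.
rows = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 1.0], [0.0, 0.0, 0.5],
    [1.0, 1.0, 0.1], [0.9, 1.0, 0.9], [1.0, 0.9, 0.4],
]
gv, cols, labels = best_descriptors(rows, n_desc=2, k=2)
print(cols)  # (0, 1)
```

The number of subsets grows combinatorially with the number of candidate variables, so an exhaustive search of this kind is only practical for a modest number of descriptors.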

    Predictive modelling of hospital readmissions in diabetic patients clusters

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Diabetes is a global public health problem with increasing incidence over the past 10 years. This disease's social and economic impacts are widely assessed worldwide, showing a direct and gradual decrease in the individual's ability to work, a gradual loss in quality of life and a burden on personal finances. The recurrence of hospitalisation is one of the most significant indexes in measuring the quality of care and the opportunity to optimise resources. Numerous techniques, such as LACE and HOSPITAL, identify the patient who will need to be readmitted. The purpose of this study was to use a dataset related to the risk of hospital readmission in patients with diabetes, first to cluster subgroups by similarity and then to structure a predictive analysis with the main algorithms to identify the best-performing methodology. Numerous approaches were performed to prepare the dataset for these two interventions. The results found in the first phase were two clusters based on the total number of hospital recurrences and others on total administrative costs, with K=3. In the second phase, the best algorithm found was Neural Network 3, with a ROC of 0.68 and a misclassification rate of 0.37. When the same algorithm was applied within the clusters, there were no gains in the confidence of the indexes, suggesting that there are no substantial gains in the division of subpopulations, since the disease has the same behaviour and needs throughout its development.

    IMPROVING CORONARY HEART DISEASE PREDICTION BY OUTLIER ELIMINATION

    Nowadays, heart disease is the major cause of death globally. According to a survey conducted by the World Health Organization, almost 18 million people die of heart diseases (or cardiovascular diseases) every year. So, there should be a system for early detection and prevention of heart disease. Detection of heart disease mostly depends on a large volume of pathological and clinical data that is quite complex. So, researchers and other medical professionals are showing keen interest in accurate prediction of heart disease. Heart disease is a general term for a large number of medical conditions related to the heart, and one of them is coronary heart disease (CHD). Coronary heart disease is caused by the amassing of plaque on the artery walls. In this paper, various machine learning base and ensemble classifiers have been applied to a heart disease dataset for efficient prediction of coronary heart disease. The base classifiers employed include k-nearest neighbor, multilayer perceptron, multinomial naïve Bayes, logistic regression, decision tree, random forest and support vector machine classifiers. The ensemble classifiers used include majority voting, weighted average, bagging and boosting classifiers. The dataset used in this study is obtained from the Framingham Heart Study, which is a long-term, ongoing cardiovascular study of people from the city of Framingham in Massachusetts, USA. To evaluate the performance of the classifiers, various evaluation metrics including accuracy, precision, recall and F1 score have been used. According to our results, the best accuracy was achieved by logistic regression, random forest, majority voting, weighted average and bagging classifiers, but the highest accuracy among these was achieved using the weighted average ensemble classifier.
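
Majority voting and the four evaluation metrics named above can be shown on a toy example. The sketch below uses made-up binary predictions from three hypothetical base classifiers. It is not the paper's pipeline and uses none of the Framingham data.

```python
def majority_vote(*prediction_lists):
    """Combine binary predictions: label 1 wins when more than half vote for it."""
    n_models = len(prediction_lists)
    return [1 if sum(votes) * 2 > n_models else 0 for votes in zip(*prediction_lists)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels and predictions (1 = CHD, 0 = no CHD).
y_true = [1, 0, 1, 1, 0, 0]
preds_a = [1, 0, 0, 1, 0, 1]
preds_b = [1, 1, 1, 0, 0, 0]
preds_c = [0, 0, 1, 1, 0, 1]

voted = majority_vote(preds_a, preds_b, preds_c)
print(voted)  # [1, 0, 1, 1, 0, 1]
print(binary_metrics(y_true, voted))
```

On this toy data each base classifier makes two mistakes while the vote makes one, which is the effect an ensemble is hoping for. Nothing guarantees it on real data.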

    Doctor of Philosophy

    Temporal reasoning denotes the modeling of causal relationships between different variables across different instances of time, and the prediction of future events or the explanation of past events. Temporal reasoning helps in modeling and understanding interactions between human pathophysiological processes, and in predicting future outcomes such as response to treatment or complications. Dynamic Bayesian Networks (DBN) support modeling changes in patients' condition over time due to both diseases and treatments, using probabilistic relationships between different clinical variables, both within and across different points in time. We describe temporal reasoning and representation in general and DBN in particular, with special attention to DBN parameter learning and inference. We also describe the temporal data preparation techniques (aggregation, consolidation, and abstraction) applicable to medical data that were used in our research. We describe and evaluate various data discretization methods that are applicable to medical data. Projeny, an open-source probabilistic temporal reasoning toolkit developed as part of this research, is also described. We apply these methods, techniques, and algorithms to two disease processes modeled as Dynamic Bayesian Networks. The first test case is hyperglycemia due to severe illness in patients treated in the Intensive Care Unit (ICU). We model the patients' serum glucose and insulin drip rates using Dynamic Bayesian Networks, and recommend insulin drip rates to maintain the patients' serum glucose within a normal range. The model's safety and efficacy are proven by comparing it to the current gold standard. The second test case is the early prediction of sepsis in the emergency department. Sepsis is an acute life-threatening condition that requires timely diagnosis and treatment.
We present various DBN models and data preparation techniques that detect sepsis with very high accuracy within two hours after the patients' admission to the emergency department. We also discuss factors affecting the computational tractability of the models and appropriate optimization techniques. In this dissertation, we present a guide to temporal reasoning, an evaluation of various data preparation, discretization, learning and inference methods, proofs using two test cases with real clinical data, and an open-source toolkit, and we recommend methods and techniques for temporal reasoning in medicine.

    Statistical analysis and data mining of Medicare patients with diabetes.

    The purpose of this dissertation is to find ways to decrease Medicare costs and to study health outcomes of diabetes patients, as well as to investigate the influence of Medicare Part D since its introduction in 2006, using the CMS CCW (Chronic Condition Data Warehouse) data and the MEPS (Medical Expenditure Panel Survey) data. In this dissertation, we introduce pattern recognition analysis into the study of medical characteristics and demographic characteristics of the inpatients who have a higher readmission risk. We also broaden the cost-effectiveness analysis by including medical resources usage when investigating the effects of Medicare Part D. In addition, we apply several statistical linear models such as the generalized linear model and data mining techniques such as the neural network model to study the costs and outcomes of both inpatients and outpatients with diabetes in Medicare. Moreover, some descriptive statistics such as kernel density estimation and survival analysis are also employed. One important conclusion from these analyses is that only diseases and procedures, rather than age, are key factors in inpatients' mortality rate. Another important discovery is that under the influence of Medicare Part D, insulin is the most efficient oral anti-diabetes drug treatment and that the drug usage in 2006 is not as stable as that in 2005. We also find that the patients who are discharged to home or hospice are more likely to re-enter the hospital within 30 days after discharge. Two-way interaction effect analysis demonstrates that diabetes complications interact with each other, which makes healthcare costs and health outcomes different between a case with one complication and a case with two complications. Accordingly, we propose some useful suggestions. For instance, to decrease Medicare payments for outpatients with diabetes, we suggest that the patients should often monitor their blood glucose level.
We also recommend that inpatients with diabetes should pay more attention to their kidney disease, and use prevention to avoid such diseases to decrease the costs.

    Detecting Heart Attacks Using Learning Classifiers

    Cardiovascular diseases (CVDs) have emerged as a critical global threat to human life. The diagnosis of these diseases presents a complex challenge, particularly for inexperienced doctors, as their symptoms can be mistaken for signs of aging or similar conditions. Early detection of heart disease can help prevent heart failure, making it crucial to develop effective diagnostic techniques. Machine Learning (ML) techniques have gained popularity among researchers for identifying new patients based on past data. While various forecasting techniques have been applied to different medical datasets, accurate detection of heart attacks in a timely manner remains elusive. This article presents a comprehensive comparative analysis of various ML techniques, including Decision Tree, Support Vector Machines, Random Forest, Extreme Gradient Boosting (XGBoost), Adaptive Boosting, Multilayer Perceptron, Gradient Boosting, K-Nearest Neighbor, and Logistic Regression. These classifiers are implemented and evaluated in Python using data from over 300 patients obtained from the Kaggle cardiovascular repository in CSV format. The classifiers categorize patients into two groups: those with a heart attack and those without. Performance evaluation metrics such as recall, precision, accuracy, and the F1-measure are employed to assess the classifiers’ effectiveness. The results of this study highlight the XGBoost classifier as a promising tool in the medical domain for accurate diagnosis, demonstrating the highest predictive accuracy (95.082%) with a computation time of 0.07995 s on the dataset compared to other classifiers.