3,053 research outputs found

    A Novel Clustering-Based Algorithm for Continuous and Non-invasive Cuff-Less Blood Pressure Estimation

    Extensive research has been performed on continuous, non-invasive, cuffless blood pressure (BP) measurement using artificial intelligence algorithms. This approach involves extracting certain features from physiological signals such as ECG, PPG, ICG, and BCG as independent variables, extracting features from Arterial Blood Pressure (ABP) signals as dependent variables, and then using machine learning algorithms to develop a blood pressure estimation model from these data. The greatest challenge in this field is the insufficient accuracy of estimation models. This paper proposes a novel blood pressure estimation method with a clustering step for accuracy improvement. The proposed method involves extracting Pulse Transit Time (PTT), PPG Intensity Ratio (PIR), and Heart Rate (HR) features from Electrocardiogram (ECG) and Photoplethysmogram (PPG) signals as the inputs of clustering and regression, extracting Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) features from ABP signals as dependent variables, and finally developing regression models by applying Gradient Boosting Regression (GBR), Random Forest Regression (RFR), and Multilayer Perceptron Regression (MLP) on each cluster. The method was implemented using the MIMIC-II dataset, with the silhouette criterion used to determine the optimal number of clusters. The results showed that, because of the inconsistency, high dispersion, and multi-trend behavior of the extracted feature vectors, accuracy can be significantly improved by running a clustering algorithm, developing a regression model on each cluster, and finally taking a weighted average of the results based on the error of each cluster. When implemented with 5 clusters and GBR, this approach yielded an MAE of 2.56 for SBP estimates and 2.23 for DBP estimates, which were significantly better than the best results without clustering (DBP: 6.27, SBP: 6.36).
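
The abstract does not spell out how the per-cluster results are weighted. As a minimal sketch of one plausible reading, the snippet below computes the mean absolute error (MAE) and combines per-cluster predictions with weights inversely proportional to each cluster's error. The function names and the inverse-error rule are assumptions made for illustration, not the authors' implementation.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_error_weights(cluster_errors, eps=1e-9):
    """Turn per-cluster errors into weights that sum to 1.

    Assumption: a cluster with lower error gets a proportionally
    higher weight (weight ~ 1 / error). eps avoids division by zero.
    """
    inverse = [1.0 / (e + eps) for e in cluster_errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine_predictions(cluster_predictions, cluster_errors):
    """Weighted average of one prediction per cluster model."""
    if len(cluster_predictions) != len(cluster_errors):
        raise ValueError("need one error value per cluster prediction")
    weights = inverse_error_weights(cluster_errors)
    return sum(w * p for w, p in zip(weights, cluster_predictions))


# Hypothetical values: two cluster models predict SBP for the same sample.
combined = combine_predictions([120.0, 128.0], [1.0, 3.0])
```

With these hypothetical inputs the lower-error model receives three times the weight of the other, so the combined estimate sits closer to its prediction.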

    Predicting diabetes-related hospitalizations based on electronic health records

    OBJECTIVE: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested, and a new method was developed that discovers hidden patient clusters in the positive class (hospitalized) while, at the same time, deriving sparse linear support vector machine classifiers to separate positive samples from negative ones (non-hospitalized). The convergence of the new method was established, and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety net hospital in New England. The new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters that can help interpret the classification results, thus increasing physicians' trust in the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and a lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant, as it has been estimated that in the USA alone about $5.8 billion is spent each year on diabetes-related hospitalizations that could be prevented.
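
Since the abstract reports accuracy as area under the ROC curve, a short sketch of that metric may help. The function below uses the standard rank interpretation of AUC: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. The function name and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive (e.g. hospitalized), 0 for negative.
    scores: classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based computation would be the usual choice at scale.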

    Detecting Heart Attacks Using Learning Classifiers

    Cardiovascular diseases (CVDs) have emerged as a critical global threat to human life. Diagnosing these diseases is a complex challenge, particularly for inexperienced doctors, as their symptoms can be mistaken for signs of aging or similar conditions. Early detection of heart disease can help prevent heart failure, making it crucial to develop effective diagnostic techniques. Machine Learning (ML) techniques have gained popularity among researchers for identifying new patients based on past data. While various forecasting techniques have been applied to different medical datasets, accurate and timely detection of heart attacks remains elusive. This article presents a comprehensive comparative analysis of various ML techniques, including Decision Tree, Support Vector Machines, Random Forest, Extreme Gradient Boosting (XGBoost), Adaptive Boosting, Multilayer Perceptron, Gradient Boosting, K-Nearest Neighbor, and Logistic Regression. These classifiers are implemented and evaluated in Python using data from over 300 patients obtained from the Kaggle cardiovascular repository in CSV format. The classifiers categorize patients into two groups: those with a heart attack and those without. Performance evaluation metrics such as recall, precision, accuracy, and the F1-measure are employed to assess the classifiers’ effectiveness. The results of this study highlight the XGBoost classifier as a promising tool in the medical domain for accurate diagnosis, demonstrating the highest predictive accuracy (95.082%) with a calculation time of 0.07995 sec on the dataset compared to other classifiers.
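
The four evaluation metrics named in the abstract all derive from the binary confusion matrix. The sketch below shows the standard definitions; the function name and the toy labels are illustrative assumptions and do not reproduce the study's data or code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels.

    1 is treated as the positive class (e.g. heart attack).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example with 8 patients: 3 TP, 3 TN, 1 FP, 1 FN.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                         [1, 1, 0, 0, 0, 1, 0, 1])
```

Reporting recall alongside accuracy matters in a setting like this one, because a missed heart attack (false negative) is typically costlier than a false alarm.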

    Explainable Artificial Intelligence Paves the Way in Precision Diagnostics and Biomarker Discovery for the Subclass of Diabetic Retinopathy in Type 2 Diabetics

    Diabetic retinopathy (DR), a common ocular microvascular complication of diabetes, contributes significantly to diabetes-related vision loss. This study addresses the imperative need for early diagnosis of DR and precise treatment strategies based on the explainable artificial intelligence (XAI) framework. The study integrated clinical, biochemical, and metabolomic biomarkers associated with the following classes: non-DR (NDR), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR) in type 2 diabetes (T2D) patients. To create machine learning (ML) models, 10% of the data was divided into validation sets and 90% into discovery sets. The validation dataset was used for hyperparameter optimization and feature selection stages, while the discovery dataset was used to measure the performance of the models. A 10-fold cross-validation technique was used to evaluate the performance of ML models. Biomarker discovery was performed using minimum redundancy maximum relevance (mRMR), Boruta, and explainable boosting machine (EBM). The proposed predictive framework compares the results of eXtreme Gradient Boosting (XGBoost), natural gradient boosting for probabilistic prediction (NGBoost), and EBM models in determining the DR subclass. The hyperparameters of the models were optimized using Bayesian optimization. Combining EBM feature selection with XGBoost, the optimal model achieved (91.25 ± 1.88) % accuracy, (89.33 ± 1.80) % precision, (91.24 ± 1.67) % recall, (89.37 ± 1.52) % F1-Score, and (97.00 ± 0.25) % area under the ROC curve (AUROC). According to the EBM explanation, the six most important biomarkers in determining the course of DR were tryptophan (Trp), phosphatidylcholine diacyl C42:2 (PC.aa.C42.2), butyrylcarnitine (C4), tyrosine (Tyr), hexadecanoyl carnitine (C16) and total dimethylarginine (DMA). The identified biomarkers may provide a better understanding of the progression of DR, paving the way for more precise and cost-effective diagnostic and treatment strategies.
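
The 10-fold cross-validation and the "mean ± spread" reporting mentioned in the abstract can be sketched as follows. This is a generic illustration using only the standard library: the function names, the shuffling seed, and the use of the sample standard deviation are assumptions, since the abstract does not say how the ± values were computed.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Every sample appears in exactly one test fold.
    """
    if k < 2 or n_samples < k:
        raise ValueError("need k >= 2 and at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def summarize(fold_scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


# Hypothetical usage: 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
mean_score, spread = summarize([0.90, 0.92, 0.94])  # made-up fold scores
```

In practice each `(train, test)` pair would be used to fit a model on the training indices and score it on the test indices, and the per-fold scores would then be passed to `summarize`.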

    Data Mining

    Data mining is a branch of computer science used to automatically extract meaningful, useful knowledge and previously unknown, hidden, interesting patterns from large amounts of data to support the decision-making process. This book presents recent theoretical and practical advances in the field of data mining. It discusses a number of data mining methods, including classification, clustering, and association rule mining. This book brings together many different successful data mining studies in various areas such as health, banking, education, software engineering, animal science, and the environment.

    Sales forecasting in fashion retail chain: A case study in Vietnam

    Sales forecasting is the key to the success of a supply chain, especially in the fashion industry. Vietnam is a major apparel supplier for international brands and has a dynamic, fast-growing market. The lack of complete forecasting systems in such a market motivates our development of sales forecast procedures featuring quantifiable results and practical implementation. Both statistical and machine learning methods were tested using actual data. We found that Random Forest consistently yields the lowest error metrics, followed closely by XGBoost. Meanwhile, clustering did not provide conclusive evidence of improved accuracy. Because the study’s data come entirely from one company, other firms can apply these procedures to their own data without external data sources.