2,197 research outputs found

    Health data driven on continuous blood pressure prediction based on gradient boosting decision tree algorithm

    Diseases related to blood pressure problems are becoming a major threat to human health. With the development of telemedicine monitoring applications, a growing number of corresponding devices are being marketed, such as remote monitors intended to increase the autonomy of the elderly and thus encourage a healthier and longer health span. Using machine learning algorithms to estimate blood pressure continuously is a feasible way to model and analyse telemedicine monitoring data and to predict blood pressure. In this paper, we applied the gradient boosting decision tree (GBDT) to predict blood pressure from human physiological data collected by the EIMO device, whose signal acquisition includes ECG and PPG. To avoid over-fitting, the optimal parameters are selected via cross-validation. Our method achieves a higher accuracy rate and a better mean absolute error than the traditional least squares method, ridge regression, lasso regression, ElasticNet, SVR, and the KNN algorithm. When predicting the blood pressure of a single individual, GBDT achieves an accuracy above 70% for systolic pressure and above 64% for diastolic pressure, with a prediction time of less than 0.1 s. In conclusion, GBDT is the best of the compared methods for predicting the blood pressure of multiple individuals: including data such as age, body fat ratio, and height improves accuracy, which indicates that the inclusion of new features aids prediction performance.
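The approach described in the abstract can be sketched as follows: a GBDT regressor whose hyperparameters are chosen by cross-validation, scored by mean absolute error. The data, feature count, and parameter grid below are illustrative stand-ins for the ECG/PPG-derived features the paper uses, not the authors' actual setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                                     # stand-in ECG/PPG features
y = 120 + X @ rng.normal(size=8) + rng.normal(scale=2, size=500)  # synthetic "systolic BP"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Cross-validated hyperparameter search to avoid over-fitting, as in the paper.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="neg_mean_absolute_error",
    cv=5,
)
search.fit(X_tr, y_tr)

# Evaluate with the paper's metric: mean absolute error on held-out data.
mae = mean_absolute_error(y_te, search.predict(X_te))
print(f"test MAE: {mae:.2f}")
```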

    Application of Machine Learning in Medical Diagnostics in Obstetrics

    Background Improvements in computational capacity and new algorithmic approaches to data analysis have created enormous opportunities to improve conventional diagnostics in the hospital in recent years. Obstetrics in particular, a speciality with high-dimensional data and limited performance of conventional diagnostic approaches for many adverse outcomes in pregnancy, stands to benefit greatly from the application of machine learning. This dissertation presents our own work, which predicts the occurrence of adverse outcomes in preeclampsia high-risk pregnancies, and contextualises it with the current state of research on the application of machine learning in preeclampsia as well as other obstetric/gynaecologic conditions. Methods The presented study is based on a collective of 1647 patients who presented to the obstetric department of the Charité Universitätsmedizin Berlin between July 2010 and March 2019. We determined the predictive performance of different machine-learning algorithms (gradient boosted trees, random forest) for adverse outcomes commonly associated with preeclampsia and compared them to models based on the laboratory and vital-parameter cutoffs used in the clinic (blood pressure, sFlt-1/PlGF ratio, and their combination with proteinuria measurements). Dataset splitting was performed per patient in randomised fashion using a 90-10 split, and evaluation used a 10x10-fold cross-validation approach. Results Our study showed gains in predictive performance when using machine-learning models. Accuracy for gradient boosted trees was 87 ± 3 %, while blood pressure cutoffs achieved only 65 ± 4 % and a cutoff of 38 applied to the sFlt-1/PlGF ratio yielded an accuracy of 68 ± 5 %.
The positive predictive value especially improved, from 33 ± 9 % for the blood pressure cutoffs to 82 ± 10 % for the gradient boosted trees classifier, with the "full clinical model" consisting of blood pressure, sFlt-1/PlGF ratio, and proteinuria achieving 44 ± 9 % PPV. Overall, we found that machine-learning methods lead to great improvements in all assessed performance metrics, with potential for further enhancement by optimising the cutoffs applied to the algorithms' output probabilities. Conclusions Machine learning greatly improves the diagnostic capabilities for preeclampsia and, as shown by many other works discussed in this dissertation, for obstetrics/gynaecology and medicine in general. This could represent a starting point for further research leading to more sophisticated diagnostic or decision-support tools.
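The study's comparison can be sketched as follows: a single-threshold "clinical cutoff" classifier versus a gradient-boosted-trees classifier, evaluated on accuracy and positive predictive value (PPV) with a 90-10 split. The data, features, and cutoff value below are synthetic illustrations, not the dissertation's clinical variables.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))  # stand-in clinical features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# 90-10 split, as in the study (per-patient randomisation omitted here).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=1)

# Cutoff model: threshold a single feature (analogous to a blood pressure cutoff).
cutoff_pred = (X_te[:, 0] > 0).astype(int)

# Gradient boosted trees trained on all features.
gbt = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
gbt_pred = gbt.predict(X_te)

for name, pred in [("cutoff", cutoff_pred), ("GBT", gbt_pred)]:
    # precision_score is the PPV: TP / (TP + FP)
    print(name, "accuracy:", accuracy_score(y_te, pred),
          "PPV:", precision_score(y_te, pred))
```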

    Machine learning for the classification of atrial fibrillation utilizing seismo- and gyrocardiogram

    A significant number of deaths worldwide are attributed to cardiovascular diseases (CVDs), accounting for approximately one-third of the total mortality in 2019, with an estimated 18 million deaths. The prevalence of CVDs has risen due to the increasing elderly population and improved life expectancy. Consequently, there is an escalating demand for higher-quality healthcare services. Technological advancements, particularly the use of wearable devices for remote patient monitoring, have significantly improved the diagnosis, treatment, and monitoring of CVDs. Atrial fibrillation (AFib), an arrhythmia associated with severe complications and potential fatality, necessitates prolonged monitoring of heart activity for accurate diagnosis and severity assessment. Remote heart monitoring, facilitated by ECG Holter monitors, has become a popular approach in many cardiology clinics. However, in the absence of an ECG Holter monitor, other remote and widely available technologies can prove valuable. The seismo- and gyrocardiogram signals (SCG and GCG) provide information about the mechanical function of the heart, enabling AFib monitoring within or outside clinical settings. SCG and GCG signals can be conveniently recorded using smartphones, which are affordable and ubiquitous in most countries. This doctoral thesis investigates the utilization of signal processing, feature engineering, and supervised machine learning techniques to classify AFib using short SCG and GCG measurements captured by smartphones. Multiple machine learning pipelines are examined, each designed to address specific objectives. The first objective (O1) involves evaluating the performance of supervised machine learning classifiers in detecting AFib using measurements conducted by physicians in a clinical setting. The second objective (O2) is similar to O1, but this time utilizing measurements taken by patients themselves. 
The third objective (O3) explores the performance of machine learning classifiers in detecting acute decompensated heart failure (ADHF) using the same measurements as O1, which were primarily collected for AFib detection. Lastly, the fourth objective (O4) delves into the application of deep neural networks for automated feature learning and classification of AFib. These investigations have shown that AFib detection is achievable by capturing a joint SCG and GCG recording and applying machine learning methods, yielding satisfactory performance. The examined approaches primarily encompassed (1) feature engineering coupled with supervised classification, and (2) automated end-to-end feature learning and classification using deep convolutional-recurrent neural networks. The key finding from these studies is that SCG and GCG signals reliably capture the heart's beating pattern, irrespective of the operator. This allows for the detection of irregular rhythm patterns, making this technology suitable for monitoring AFib episodes outside of hospital settings as a remote monitoring solution for individuals suspected to have AFib. This thesis demonstrates the potential of smartphone-based AFib detection using built-in inertial sensors. Notably, a short recording duration of 10 to 60 seconds yields clinically relevant results. However, the results for ADHF did not match state-of-the-art achievements, due to the limited availability of ADHF data combined with arrhythmias and the lack of a cardiopulmonary exercise test in the measurement setting. Finally, it is important to recognize that SCG and GCG are not intended to replace clinical ECG measurements or long-term ambulatory Holter ECG recordings; instead, within the scope of our current understanding, they should be regarded as complementary and supplementary technologies for cardiovascular monitoring.
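Pipeline (1) above, feature engineering plus a supervised classifier, can be sketched as follows. The hand-crafted features (simple signal statistics standing in for rhythm-irregularity measures) and the synthetic "recordings" are illustrative assumptions; the actual thesis derives features from beat-segmented SCG/GCG signals.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

def extract_features(signal):
    """Stand-in features: overall spread and beat-to-beat irregularity."""
    diffs = np.diff(signal)
    return [signal.std(), np.abs(diffs).mean(), diffs.std()]

# Synthetic short recordings: "AFib-like" signals carry extra jitter.
signals, labels = [], []
for i in range(200):
    afib = i % 2
    base = np.sin(np.linspace(0, 20 * np.pi, 600))      # regular beating pattern
    noise = rng.normal(scale=0.8 if afib else 0.1, size=600)
    signals.append(base + noise)
    labels.append(afib)

X = np.array([extract_features(s) for s in signals])
y = np.array(labels)

# Supervised classification on the engineered features.
clf = RandomForestClassifier(random_state=2).fit(X[:150], y[:150])
acc = clf.score(X[150:], y[150:])
print(f"held-out accuracy: {acc:.2f}")
```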

    Predicting serum levels of lithium-treated patients: A supervised machine learning approach

    Routine monitoring of lithium levels is common clinical practice because the prediction strategies developed by previous studies still offer insufficient performance. We therefore used machine learning approaches to predict lithium concentration in a large real-world dataset. Real-world data from multicenter electronic medical records were used with different machine learning algorithms to predict (1) whether the serum level was 0.6-1.2 mmol/L or 0.0-0.6 mmol/L (binary prediction), and (2) its concentration value (continuous prediction). We developed models from 1505 samples through 5-fold cross-validation and used 204 independent samples to test their performance by evaluating their accuracy. Moreover, we ranked the most important clinical features in the different models and reconstructed three reduced models with fewer clinical features. For binary and continuous predictions, the average accuracy of these models was 0.70-0.73 and 0.68-0.75, respectively. Seven features were listed as important for serum lithium levels of 0.6-1.2 mmol/L or higher: older age, lower systolic blood pressure, higher daily and last doses of lithium prescription, concomitant psychotropic drugs with valproic acid and -pine drugs, and comorbid substance-related disorders. After reducing the features, the three new binary or continuous models still had an average accuracy of 0.67-0.74. Machine learning can process complex clinical data and provides a potential tool for predicting lithium concentration, which may help in clinical decision-making and reduce the frequency of serum level monitoring.
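The two tasks and the feature-reduction step described above can be sketched as follows: a binary classifier for the therapeutic range, a continuous regressor for the concentration, and a reduced model refit on the top-ranked features. The data and feature semantics are invented placeholders, not the study's clinical variables.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 10))  # stand-in clinical features
level = 0.9 + 0.2 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(scale=0.05, size=600)
in_range = ((level >= 0.6) & (level <= 1.2)).astype(int)  # therapeutic range

# (1) Binary prediction, evaluated with 5-fold cross-validation as in the study.
clf = RandomForestClassifier(random_state=3)
acc = cross_val_score(clf, X, in_range, cv=5).mean()

# (2) Continuous prediction of the concentration itself.
reg = RandomForestRegressor(random_state=3).fit(X, level)

# Rank features by importance and refit a reduced model on the top 3.
top = np.argsort(reg.feature_importances_)[::-1][:3]
reduced = RandomForestRegressor(random_state=3).fit(X[:, top], level)
print("binary CV accuracy:", round(acc, 2), "top features:", top)
```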

    Predictive analytics framework for electronic health records with machine learning advancements : optimising hospital resources utilisation with predictive and epidemiological models

    The primary aim of this thesis was to investigate the feasibility and robustness of predictive machine-learning models in the context of improving hospital resource utilisation with data-driven approaches and predicting hospitalisation with hospital quality-assessment metrics such as length of stay. The length of stay predictions include validating the proposed methodological predictive framework on each hospital's electronic health records data source. In this thesis, we relied on electronic health records (EHRs) to drive a data-driven predictive inpatient length of stay (LOS) research framework that suits the most demanding hospital facilities. The thesis focused on the viability of the methodological predictive length of stay approaches in dynamic and demanding healthcare facilities and hospital settings such as intensive care units and emergency departments. While hospital length of stay predictions assess (internal) inpatient outcomes from admission to discharge, the thesis also considered (external) factors outside hospital control, such as forecasting future hospitalisations from the spread of infectious communicable disease during pandemics. The internal and external splits are the thesis's main contributions. The thesis therefore evaluated public health measures during events of uncertainty (e.g. pandemics) and measured the effect of non-pharmaceutical interventions during outbreaks on future hospitalised cases. To the best of our knowledge, this approach is the first in the literature to use simulation models of epidemiological curves to project future hospitalisations and their strong potential to impact hospital bed availability and to stress hospital workflows and workers.
The main research commonality between chapters is the usefulness of ensemble learning models in the context of LOS prediction for hospital resource utilisation. Ensemble learning models anticipate better predictive performance by combining several base models to produce an optimal predictive model. These predictive models explored internal LOS for various chronic and acute conditions using data-driven approaches to determine the most accurate and powerful predicted outcomes, ultimately helping hospital professionals working in hospital settings achieve desired outcomes.
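The ensemble idea described above, combining several base models into one predictor, can be sketched with a stacking ensemble. The stacking design, base learners, and synthetic length-of-stay data below are illustrative assumptions, not the thesis's actual model choices.

```python
import numpy as np
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 6))                                       # stand-in admission features
los = np.exp(0.3 * X[:, 0]) + 2 + rng.normal(scale=0.2, size=400)   # skewed synthetic "LOS" in days

X_tr, X_te, y_tr, y_te = train_test_split(X, los, random_state=4)

# Several base models are combined; a final estimator learns how to weight
# their predictions, aiming for better performance than any single model.
ensemble = StackingRegressor(
    estimators=[("tree", DecisionTreeRegressor(max_depth=4, random_state=4)),
                ("forest", RandomForestRegressor(n_estimators=50, random_state=4))],
    final_estimator=Ridge(),
)
ensemble.fit(X_tr, y_tr)
r2 = ensemble.score(X_te, y_te)
print(f"ensemble R^2 on held-out data: {r2:.2f}")
```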

    A New Scalable, Portable, and Memory-Efficient Predictive Analytics Framework for Predicting Time-to-Event Outcomes in Healthcare

    Time-to-event outcomes are prevalent in medical research. To handle these outcomes, as well as censored observations, statistical and survival regression methods are widely used based on assumptions of linear association; however, clinicopathological features often exhibit nonlinear correlations. Machine learning (ML) algorithms have recently been adapted to handle nonlinear correlations effectively. One drawback of ML models is that they can model idiosyncratic features of a training dataset: due to this overlearning, ML models perform well on the training data but less well on test data. The features we choose indirectly influence the performance of ML prediction models, and with the expansion of big data in biomedical informatics, appropriate feature engineering and feature selection are vital to ML success. An ensemble learning algorithm also helps decrease bias and variance by combining the predictions of multiple models. In this study, we constructed a new scalable, portable, and memory-efficient predictive analytics framework that fits four components (feature engineering, survival analysis, feature selection, and ensemble learning) together. Our framework first employs feature engineering techniques, such as binarization, discretization, transformation, and normalization, on the raw dataset. The normalized feature set is then applied to Cox survival regression, which produces features highly correlated with the outcome. The resultant feature set is deployed to the eXtreme Gradient Boosting (XGBoost) ensemble learning and Recursive Feature Elimination algorithms. XGBoost uses a gradient boosting decision tree algorithm in which new models are created sequentially to predict the residuals of prior models and are then added together to make the final prediction. In our experiments, we analyzed a cohort of cardiac surgery patients drawn from a multi-hospital academic health system.
The model evaluated 72 perioperative variables that impact readmission within 30 days of discharge, derived 48 significant features, and demonstrated optimum predictive ability with feature sets ranging from 16 to 24. The area under the receiver operating characteristic curve observed for the feature set of 16 was 0.8816 and 0.9307 at the 35th and 151st iterations, respectively. Our model showed improved performance compared to state-of-the-art models and could be useful for decision support in clinical settings.
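The residual-fitting mechanism attributed to XGBoost above can be shown by hand: each new tree is fit to the residuals of the running prediction, and the trees' scaled outputs are summed. This is a toy illustration of the boosting principle on synthetic data, not the framework's actual implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

prediction = np.zeros(300)  # the ensemble starts from a zero prediction
learning_rate = 0.3
for _ in range(50):
    residuals = y - prediction                     # what the ensemble still misses
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)  # add the new tree's correction

mse = np.mean((y - prediction) ** 2)
print(f"training MSE after boosting: {mse:.4f}")
```

Because every round targets the remaining error, the summed predictions converge toward the signal; libraries such as XGBoost add regularization and second-order gradient information on top of this basic scheme.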