15 research outputs found

    An Improved Model Ensembled of Different Hyper-parameter Tuned Machine Learning Algorithms for Fetal Health Prediction

    Full text link
    Fetal health is a critical concern during pregnancy, as it can impact the well-being of both the mother and the baby. Regular monitoring and timely interventions are necessary to ensure the best possible outcomes. While there are various methods to monitor fetal health in the mother's womb, the use of artificial intelligence (AI) can improve the accuracy, efficiency, and speed of diagnosis. In this study, we propose a robust ensemble model, the ensemble of tuned Support Vector Machine and ExtraTrees (ETSE), for predicting fetal health. Initially, we employed various data preprocessing techniques such as outlier rejection, missing value imputation, data standardization, and data sampling. Then, seven machine learning (ML) classifiers were implemented: Support Vector Machine (SVM), XGBoost (XGB), Light Gradient Boosting Machine (LGBM), Decision Tree (DT), Random Forest (RF), ExtraTrees (ET), and K-Nearest Neighbors. These models were evaluated and then optimized by hyperparameter tuning using the grid search technique. Finally, we analyzed the performance of our proposed ETSE model. The performance analysis of each model revealed that our proposed ETSE model outperformed the other models with 100% precision, 100% recall, 100% F1-score, and 99.66% accuracy. This indicates that the ETSE model can effectively predict fetal health, which can aid in timely interventions and improve outcomes for both the mother and the baby. Comment: 23 pages, 6 tables, 5 figures
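    As a rough illustration of the pipeline the abstract describes (grid-search tuning of individual classifiers, then combining the tuned SVM and ExtraTrees), here is a minimal scikit-learn sketch. The parameter grids, the soft-voting combination rule and the synthetic data are assumptions for illustration, not the paper's exact configuration.

        # Sketch: grid-search-tuned SVM and ExtraTrees combined by soft voting.
        # Grids, voting rule and synthetic data are illustrative assumptions.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import ExtraTreesClassifier, VotingClassifier
        from sklearn.model_selection import GridSearchCV, train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=2000, n_features=21, random_state=0)  # stand-in for CTG features
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        # Tune each base learner separately with grid search, as the abstract describes.
        svm = GridSearchCV(make_pipeline(StandardScaler(), SVC(probability=True)),
                           {"svc__C": [1, 10, 100], "svc__gamma": ["scale", 0.01]},
                           cv=5).fit(X_tr, y_tr)
        et = GridSearchCV(ExtraTreesClassifier(random_state=0),
                          {"n_estimators": [200, 500], "max_depth": [None, 10]},
                          cv=5).fit(X_tr, y_tr)

        # Combine the tuned models; soft voting averages their class probabilities.
        ensemble = VotingClassifier([("svm", svm.best_estimator_), ("et", et.best_estimator_)],
                                    voting="soft").fit(X_tr, y_tr)
        print(f"held-out accuracy: {ensemble.score(X_te, y_te):.4f}")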

    Prediction of the mode of delivery using artificial intelligence algorithms

    Get PDF
    Background and objective: Mode of delivery is one of the issues that most concerns obstetricians. The caesarean section rate has increased progressively in recent years, exceeding the limit recommended by health institutions. Obstetricians generally lack the necessary technology to help them decide whether a caesarean delivery is appropriate based on antepartum and intrapartum conditions. Methods: In this study, we tested the suitability of three popular artificial intelligence algorithms, Support Vector Machines, Multilayer Perceptron, and Random Forest, for developing a clinical decision support system that predicts the mode of delivery according to three categories: caesarean section, eutocic vaginal delivery, and instrumental vaginal delivery. For this purpose, we used a comprehensive clinical database of 25,038 records with 48 attributes of women who attended to give birth at the Service of Obstetrics and Gynaecology of the University Clinical Hospital "Virgen de la Arrixaca" in the Murcia Region (Spain) from January 2016 to January 2019. The women involved were patients with singleton pregnancies who attended the emergency room in active labour or underwent a planned induction of labour for medical reasons. Results: The three implemented algorithms showed similar performance, all reaching an accuracy equal to or above 90% in the classification between caesarean and vaginal deliveries, and somewhat lower, around 87%, between instrumental and eutocic deliveries. Conclusions: The results validate the use of these algorithms to build a clinical decision support system that helps gynaecologists predict the mode of delivery.
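    A minimal sketch of the comparison the abstract describes, using scikit-learn implementations of the three algorithms on a synthetic three-class stand-in for the (non-public) hospital records; the hyperparameters are library defaults, not the paper's settings.

        # Sketch: benchmarking SVM, MLP and Random Forest on a synthetic 3-class
        # delivery-mode problem (caesarean / eutocic / instrumental).
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=5000, n_features=48, n_informative=12,
                                   n_classes=3, random_state=0)  # 48 attributes, as in the study

        models = {
            "SVM": make_pipeline(StandardScaler(), SVC()),
            "MLP": make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=0)),
            "RF":  RandomForestClassifier(random_state=0),
        }
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
            print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")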

    Prediction of Bradycardia using Decision Tree Algorithm and Comparing the Accuracy with Support Vector Machine

    Get PDF
    This study compares the accuracy of a Support Vector Machine (SVM) classifier and a Decision Tree (DT) classifier in predicting a bradycardia diagnosis. Materials and Methods: The dataset used for this investigation contains 7,500 records; 40 records are used in the test to obtain a 95% confidence level in accuracy with a 1% margin of error. Each record has 12 attributes or features. Bradycardia is detected using the Decision Tree and the SVM. Results: According to the statistical analysis, the accuracy of the Decision Tree classifier was 92.62% and the accuracy of the SVM was 87.5%. The p-value was calculated as 0.001 (p < 0.05, independent-samples t-test), indicating a statistically significant difference in the accuracy rates of the two algorithms (SVM and DT). Conclusion: In the bradycardia prediction task, the Decision Tree classifier (92.62%) exhibited a significant improvement over the SVM (87.5%), as demonstrated by the findings of the present study.
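    The reported significance test can be reproduced in outline as an independent-samples t-test over repeated accuracy measurements. The resampling scheme below (repeated train/test splits over synthetic data) is an assumption, since the abstract does not state how the accuracy samples were obtained.

        # Sketch: independent-samples t-test comparing DT and SVM accuracies
        # collected over repeated random train/test splits (assumed scheme).
        import numpy as np
        from scipy.stats import ttest_ind
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=1000, n_features=12, random_state=0)

        def accuracies(model, n_runs=20):
            out = []
            for seed in range(n_runs):
                X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
                out.append(model.fit(X_tr, y_tr).score(X_te, y_te))
            return np.array(out)

        acc_dt = accuracies(DecisionTreeClassifier(random_state=0))
        acc_svm = accuracies(SVC())
        t, p = ttest_ind(acc_dt, acc_svm)  # two-sample t-test on the accuracy samples
        print(f"DT {acc_dt.mean():.3f} vs SVM {acc_svm.mean():.3f}, p = {p:.4f}")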

    Machine learning on cardiotocography data to classify fetal outcomes: A scoping review

    Get PDF
    Introduction: Uterine contractions during labour constrict maternal blood flow and oxygen delivery to the developing baby, causing transient hypoxia. While most babies are physiologically adapted to withstand such intrapartum hypoxia, those exposed to severe hypoxia or with poor physiological reserves may experience neurological injury or death during labour. Cardiotocography (CTG) monitoring was developed to identify babies at risk of hypoxia by detecting changes in fetal heart rate (FHR) patterns. CTG monitoring is in widespread use in intrapartum care for the detection of fetal hypoxia, but its clinical utility is limited by the relatively poor positive predictive value (PPV) of an abnormal CTG and by significant inter- and intra-observer variability in CTG interpretation. Clinical risk and human factors may impact the quality of CTG interpretation. Misclassification of CTG traces may lead to either under-treatment (with the risk of fetal injury or death) or over-treatment (which may include unnecessary operative interventions that put both mother and baby at risk of complications). Machine learning (ML) has been applied to this problem since the early 2000s and has shown potential to predict fetal hypoxia more accurately than visual interpretation of CTG alone. To consider how these tools might be translated to clinical practice, we conducted a review of ML techniques already applied to CTG classification and identified research gaps requiring investigation in order to progress towards clinical implementation. Materials and methods: We used identified keywords to search for relevant publications in PubMed, EMBASE and IEEE Xplore, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). Titles, abstracts and full texts were screened according to the inclusion criteria. Results: We included 36 studies that used signal processing and ML techniques to classify CTG. Most studies used an open-access CTG database and predominantly used fetal metabolic acidosis as the benchmark for hypoxia, with varying pH levels. Various methods were used to process CTG signals and extract features, and several ML algorithms were used to classify CTG. We identified significant concerns over the practicality of using varying pH levels as the CTG classification benchmark. Furthermore, generalisability was limited, as most studies used the same database, which contains a low number of subjects for an ML study. Conclusion: ML studies demonstrate potential in predicting fetal hypoxia from CTG. However, more diverse datasets, standardisation of hypoxia benchmarks and enhancement of algorithms and features are needed for future clinical implementation.
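    A toy illustration of the benchmark concern raised above: the "hypoxia" label, and hence every downstream metric, shifts with the chosen pH cut-off. The cut-offs and synthetic pH values below are illustrative only, not recommendations.

        # Toy illustration: the hypoxia label, and therefore class balance,
        # depends entirely on the chosen umbilical-artery pH cut-off.
        import numpy as np

        rng = np.random.default_rng(0)
        ph = rng.normal(7.25, 0.08, size=552)  # synthetic cord-pH values

        for cutoff in (7.05, 7.10, 7.15, 7.20):
            positives = int((ph <= cutoff).sum())
            print(f"pH <= {cutoff:.2f}: {positives} 'hypoxic' labels ({positives / ph.size:.1%})")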

    From Theory to Practice: A Data Quality Framework for Classification Tasks

    Get PDF
    Data preprocessing is an essential step in knowledge discovery projects. Experts affirm that preprocessing tasks take between 50% and 70% of the total time of the knowledge discovery process, and several authors consider data cleaning one of the most cumbersome and critical tasks. Failure to provide high data quality in the preprocessing stage will significantly reduce the accuracy of any data analytic project. In this paper, we propose DQF4CT, a framework to address data quality issues in classification tasks. Our approach is composed of: (i) a conceptual framework to guide the user on how to deal with data problems in classification tasks; and (ii) an ontology that represents the knowledge in data cleaning and suggests the proper data cleaning approaches. We present two case studies using real datasets: physical activity monitoring (PAM) and occupancy detection of an office room (OD). To evaluate our proposal, the datasets cleaned by DQF4CT were used to train the same algorithms used in the classification tasks by the authors of PAM and OD. Additionally, we evaluated DQF4CT on datasets from the Repository of Machine Learning Databases of the University of California, Irvine (UCI); 84% of the models trained on the datasets cleaned by DQF4CT performed better than the models of the datasets' authors. This work has also been supported by the project "Red de formación de talento humano para la innovación social y productiva en el Departamento del Cauca InnovAcción Cauca" (Convocatoria 03-2018, Publicación de artículos en revistas de alto impacto); the project "Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT - ID 4633", financed by Convocatoria 04C-2018 "Banco de Proyectos Conjuntos UEES-Sostenibilidad" of the project "Red de formación de talento humano para la innovación social y productiva en el Departamento del Cauca InnovAcción Cauca"; and the Spanish Ministry of Economy, Industry and Competitiveness (projects TRA2015-63708-R and TRA2016-78886-C3-1-R).
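    DQF4CT itself is a conceptual framework plus an ontology rather than a code library, but the kind of cleaning it guides can be sketched. The concrete choices below (duplicate removal, median imputation, a robust IQR-based outlier filter) are assumptions for illustration.

        # Sketch of cleaning steps a framework like DQF4CT would recommend for a
        # classification dataset; the specific rules here are assumptions.
        import numpy as np
        import pandas as pd

        def clean(df: pd.DataFrame, z_max: float = 3.0) -> pd.DataFrame:
            df = df.drop_duplicates()
            df = df.fillna(df.median(numeric_only=True))   # impute missing values
            iqr = df.quantile(0.75) - df.quantile(0.25)
            z = (df - df.median()) / iqr                   # robust, outlier-insensitive score
            return df[(z.abs() <= z_max).all(axis=1)]      # drop rows with gross outliers

        raw = pd.DataFrame({"hr":   [72, 75, np.nan, 300, 71],
                            "spo2": [98, 97, 96, 95, np.nan]})
        print(clean(raw))  # the implausible hr = 300 row is filtered out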

    Cardiotocography Signal Abnormality Detection based on Deep Unsupervised Models

    Full text link
    Cardiotocography (CTG) is a key element in monitoring fetal well-being. Obstetricians use it to observe the fetal heart rate (FHR) and the uterine contractions (UC). The goal is to determine how the fetus reacts to the contractions and whether it is receiving adequate oxygen. If a problem occurs, the physician can then respond with an intervention. Unfortunately, the interpretation of CTGs is highly subjective, and there is a low inter- and intra-observer agreement rate among practitioners. This can lead to unnecessary medical intervention that represents a risk to both the mother and the fetus. Recently, computer-assisted diagnosis techniques, especially those based on artificial intelligence models (mostly supervised), have been proposed in the literature. However, many of these models lack generalization to unseen/test data samples due to overfitting. Moreover, unsupervised models have been applied only to a very small portion of the CTG samples, where the normal and abnormal classes are highly separable. In this work, deep unsupervised learning approaches, trained in a semi-supervised manner, are proposed for anomaly detection in CTG signals. The GANomaly framework, modified to capture the underlying distribution of data samples, is used as our main model and is applied to the CTU-UHB dataset. Unlike recent studies, all CTG data samples, without any specific preferences, are used in our work. The experimental results show that our modified GANomaly model outperforms the state of the art. This study confirms the superiority of deep unsupervised models over supervised ones in CTG abnormality detection.
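    The encoder-decoder-encoder idea behind GANomaly can be sketched compactly. Below is a minimal, illustrative PyTorch version for 1-D signal windows: it trains on normal segments only and scores anomalies by the distance between the two latent codes. The layer sizes, the training loop and the omission of GANomaly's adversarial discriminator are simplifying assumptions, not the paper's modified architecture.

        # Minimal GANomaly-style sketch: encode, reconstruct, re-encode; the
        # anomaly score is the latent-space gap between the two encodings.
        import torch
        import torch.nn as nn

        class GANomalyLite(nn.Module):
            def __init__(self, sig_len=256, latent=32):
                super().__init__()
                self.enc1 = nn.Sequential(nn.Linear(sig_len, 128), nn.ReLU(), nn.Linear(128, latent))
                self.dec  = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, sig_len))
                self.enc2 = nn.Sequential(nn.Linear(sig_len, 128), nn.ReLU(), nn.Linear(128, latent))

            def forward(self, x):
                z = self.enc1(x)        # latent code of the input
                x_hat = self.dec(z)     # reconstruction
                z_hat = self.enc2(x_hat)  # latent code of the reconstruction
                return x_hat, z, z_hat

        model = GANomalyLite()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        normal = torch.randn(512, 256)  # stand-in for normal FHR segments

        for _ in range(50):  # train on normal data only (semi-supervised setting)
            x_hat, z, z_hat = model(normal)
            loss = nn.functional.mse_loss(x_hat, normal) + nn.functional.mse_loss(z_hat, z)
            opt.zero_grad(); loss.backward(); opt.step()

        with torch.no_grad():
            x_hat, z, z_hat = model(torch.randn(4, 256))    # unseen test segments
            score = (z - z_hat).pow(2).mean(dim=1)          # higher = more anomalous
        print(score)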

    Classification of Caesarean Section and Normal Vaginal Deliveries Using Foetal Heart Rate Signals and Advanced Machine Learning Algorithms

    Get PDF
    ABSTRACT – Background: Visual inspection of cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high, with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and, in severe cases, death. This paper shows that machine learning applied to foetal heart rate signals provides direct information about the foetal state and, when used as a decision support tool, helps to filter the subjective opinions of medical practitioners. The primary aim is to provide a proof of concept demonstrating how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required, and to help avoid preventable perinatal deaths. Methodology: This is evidenced using an open dataset that comprises 506 controls (normal vaginal deliveries) and 46 cases (caesarean due to pH ≤ 7.05 and pathological risk). Several machine learning algorithms are trained and validated using binary classifier performance measures. Results: The findings show that deep learning classification achieves Sensitivity = 94%, Specificity = 91%, Area under the Curve = 99%, F-Score = 100%, and Mean Square Error = 1%. Conclusions: The results demonstrate that machine learning significantly improves the detection of caesarean section and normal vaginal deliveries using foetal heart rate signals, compared with obstetrician and midwife predictions and with systems reported in previous studies.
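    The reported binary metrics can be computed as sketched below, with roughly the abstract's class balance (506 controls vs 46 cases); the logistic-regression classifier is a placeholder, not the paper's deep learning model.

        # Sketch: sensitivity, specificity and AUC on an imbalanced
        # caesarean-vs-vaginal task with a placeholder classifier.
        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import confusion_matrix, roc_auc_score
        from sklearn.model_selection import train_test_split

        # Roughly the abstract's class balance: 506 controls vs 46 cases.
        X, y = make_classification(n_samples=552, weights=[506 / 552], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        prob = clf.predict_proba(X_te)[:, 1]
        tn, fp, fn, tp = confusion_matrix(y_te, prob >= 0.5).ravel()
        print("sensitivity:", tp / (tp + fn))
        print("specificity:", tn / (tn + fp))
        print("AUC:", roc_auc_score(y_te, prob))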

    Applications Of Machine Learning In Biology And Medicine

    Get PDF
    Machine learning as a field is defined as the set of computational algorithms that improve their performance by assimilating data. As such, the field has found applications in many diverse disciplines, from robotics and communication in engineering to economics and finance, as well as biology and medicine. It should not come as a surprise that many popular methods in use today have completely different origins. Despite this heterogeneity, different methods can be divided into standard tasks, such as supervised, unsupervised, semi-supervised and reinforcement learning. Although machine learning as a field can be formalized as methods trying to solve certain standard tasks, applying these tasks to datasets from different fields comes with certain caveats and is sometimes fraught with challenges. In this thesis, we develop general procedures and novel solutions for practical problems that arise when modeling biological and medical data. Cost-sensitive learning is an important area of research in machine learning which addresses the widespread and practical problem of dealing with different costs during the learning and deployment of classification algorithms. In many applications, such as credit fraud detection, network intrusion detection and especially medical diagnosis, prior class distributions are highly skewed, which makes the training examples highly unbalanced. Combined with uneven misclassification costs, this renders standard machine learning approaches unable to learn an acceptable decision function. We experimentally show the benefits and shortcomings of various methods that convert cost-blind learning algorithms to cost-sensitive ones. Using the results and best practices found for cost-sensitive learning, we design and develop a machine learning approach to ontology mapping. Next, we present a novel approach to deal with uncertainty in classification when costs are unknown or otherwise hard to assign. Support Vector Machines (SVM) are considered to be among the most successful approaches for classification. However, the prediction of instances near the decision boundary depends more on the specific parameter selection or noise in the data than on a clear difference in features. In many applications, such as medical diagnosis, these regions should be labeled as uncertain rather than assigned to any particular class. Furthermore, instances may belong to novel disease subtypes that are not from any previously known class. In such applications, declining to make a prediction can be beneficial when more powerful but expensive tests are available. We develop a novel approach for optimal selection of the threshold and show its successful application on three biological and medical datasets. The last part of this thesis provides novel solutions for handling high-dimensional data. Although high-dimensional data is ubiquitous in many disciplines, current life science research almost always involves high-dimensional genomics/proteomics data. These "omics" data provide a wealth of information and have changed the research landscape in biology and medicine. However, they are plagued with noise, redundancy and collinearity, which makes the discovery process difficult and costly. Any method that can accurately detect irrelevant and noisy variables in omics data would be highly valuable. We present Robust Feature Selection (RFS), a randomized feature selection approach dedicated to low-sample, high-dimensional data. RFS combines an embedded feature selection method with a randomization procedure for stability. Recent advances in sparse recovery and estimation methods have provided efficient and asymptotically consistent feature selection algorithms. However, these methods lack finite-sample error control due to instability. Furthermore, the chances of correct recovery diminish with more collinearity among features. To overcome these difficulties, RFS uses a randomization procedure to provide an accurate and stable feature selection method. We thoroughly evaluate RFS by comparing it to a number of popular univariate and multivariate feature selection methods and show a marked improvement in the prediction accuracy of a diagnostic signature, while preserving good stability.
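    The reject-option idea described above (labeling instances near the SVM decision boundary as uncertain) can be sketched in a few lines; the fixed margin threshold below is an illustrative stand-in for the thesis's optimal threshold-selection procedure.

        # Sketch: decline to predict when the SVM margin is small. The fixed
        # threshold is an assumption; the thesis selects it optimally.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=1000, flip_y=0.1, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        clf = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
        margin = clf.decision_function(X_te)  # signed distance to the boundary

        threshold = 0.5  # assumed reject threshold
        pred = np.where(np.abs(margin) < threshold, -1, (margin > 0).astype(int))  # -1 = "uncertain"
        covered = pred != -1
        print(f"coverage {covered.mean():.1%}, "
              f"accuracy on covered {(pred[covered] == y_te[covered]).mean():.3f}")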

    Using Stacked Generalization for Anomaly Detection

    Get PDF
    Anomaly Detection is an important research topic nowadays, in which the intention is to find patterns in data that do not conform to expected behavior. This concept is applicable in a large number of different domains and contexts, such as intrusion detection, fraud detection, medical research and social network analysis. The techniques that have been addressed within this topic are diverse, based on different assumptions about how anomalies manifest themselves within the data, and can have different outputs (i.e. a numeric score or a labeled classification). Because of this heterogeneity, every technique is specialized in specific characteristics of the data and may only provide a limited insight into what anomalies exist in a given dataset. Ensemble Learning is a process that tries to incorporate the opinions of different learners in order to make a more considered decision. This process has been successfully applied in the past to supervised learning problems, and improvements in performance have been observed empirically. Stacked Generalization is one of these methods, in which a learning algorithm is used to combine the outputs of the different learners. The intention of this thesis is to research the application of Stacked Generalization to current state-of-the-art Anomaly Detection techniques and determine whether this method can lead to better overall performance. These methods will then be evaluated on well-known, publicly available datasets used for benchmarking throughout the Anomaly Detection literature.
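    A minimal sketch of stacked generalization applied to anomaly detectors: base detectors are fitted on (mostly) normal data, and a level-1 learner combines their scores using a small labeled set. The detector choice, synthetic data and meta-learner are assumptions for illustration, not the thesis's exact setup.

        # Sketch: stacked generalization over anomaly detectors. Base detectors
        # emit scores; a logistic regression is the level-1 combiner.
        import numpy as np
        from sklearn.ensemble import IsolationForest
        from sklearn.linear_model import LogisticRegression
        from sklearn.neighbors import LocalOutlierFactor
        from sklearn.svm import OneClassSVM

        rng = np.random.default_rng(0)
        def sample(n_norm, n_anom):
            X = np.vstack([rng.normal(0, 1, (n_norm, 5)), rng.normal(4, 1, (n_anom, 5))])
            y = np.r_[np.zeros(n_norm), np.ones(n_anom)]  # 1 = anomaly
            return X, y

        X_tr, _ = sample(1000, 0)        # detectors are fitted on normal data
        X_val, y_val = sample(450, 50)   # labeled data to train the level-1 combiner

        detectors = [IsolationForest(random_state=0).fit(X_tr),
                     OneClassSVM(nu=0.05).fit(X_tr),
                     LocalOutlierFactor(novelty=True).fit(X_tr)]

        def score(X):  # one column per detector; higher = more anomalous
            return np.column_stack([-d.decision_function(X) for d in detectors])

        meta = LogisticRegression().fit(score(X_val), y_val)  # stacking step
        X_te, y_te = sample(90, 10)
        print(meta.predict_proba(score(X_te))[:, 1].round(2))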