4,684 research outputs found

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

    Sensor-Assisted Weighted Average Ensemble Model for Detecting Major Depressive Disorder

    Current methods of diagnosing depression depend entirely on self-report ratings or clinical interviews. These traditional methods are subjective, since individuals may not answer the questions candidly. In this paper, data were collected using both self-report ratings and electronic smartwatches. This study aims to develop a weighted average ensemble machine learning model to predict major depressive disorder (MDD) with superior accuracy. The data were pre-processed, and the essential features were selected using a correlation-based feature selection method. With the selected features, machine learning approaches such as Logistic Regression, Random Forest, and the proposed Weighted Average Ensemble Model are applied. To assess the performance of the proposed model, the area under the Receiver Operating Characteristic curve is used. The results demonstrate that the proposed Weighted Average Ensemble model achieves better accuracy than the Logistic Regression and Random Forest approaches.
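
A minimal sketch of the weighted average ensemble idea described in this abstract, using only the Python standard library. The abstract does not give the weights, the base-model outputs, or how AUC was computed, so the probabilities, the weights (0.4 and 0.6), and the function names below are illustrative assumptions rather than the authors' implementation.

```python
def weighted_average_ensemble(prob_lists, weights):
    """Combine per-model positive-class probabilities with a weighted average."""
    if len(prob_lists) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(prob_lists[0])
    combined = []
    for i in range(n_samples):
        score = sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        combined.append(score / total)  # normalise so weights need not sum to 1
    return combined


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical probabilities from two base models (made-up values, not study data).
logreg_probs = [0.9, 0.4, 0.6, 0.2]
forest_probs = [0.7, 0.6, 0.8, 0.1]
labels = [1, 0, 1, 0]

ensemble_probs = weighted_average_ensemble([logreg_probs, forest_probs], [0.4, 0.6])
ensemble_auc = roc_auc(labels, ensemble_probs)
```

In practice the weights would typically be chosen on a validation set, for example in proportion to each base model's validation score; that choice is not stated in the abstract.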

    Application and Analysis of Machine Learning Algorithms on Pima and Early Diabetes Datasets for Diabetes Prediction

    Diabetes is a chronic condition that affects how your body turns food into energy. Much of the food you consume is converted by your body into sugar (glucose), which is then released into your bloodstream. Your pancreas releases insulin when your blood sugar levels rise. Over the years, several scholars have sought to create reliable diabetes prediction models. Due to a lack of adequate data sets and prediction techniques, this discipline still faces many unsolved research issues, which has led researchers to apply big data analytics and ML-based methodology. The study uses four distinct machine learning algorithms to perform healthcare predictive analytics and address these issues. In this investigation, the Pima and Early detection datasets were employed. We applied the Decision Tree, MLP, Naive Bayes, and Random Forest algorithms to these datasets and evaluated accuracy and F-Measure. The goal of this research is to develop a system that can more precisely predict a patient's risk of developing diabetes.
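
The two evaluation metrics named in this abstract, accuracy and F-Measure, can be sketched as follows. This is a stdlib-only illustration that assumes binary labels with 1 as the positive (diabetic) class and the F1 form of the F-Measure; the label vectors are made up, and the function name is hypothetical.

```python
def accuracy_and_f_measure(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, f1


# Made-up labels and predictions for eight patients (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
acc, f1 = accuracy_and_f_measure(y_true, y_pred)
```

Reporting both metrics matters on datasets where the positive class is the minority: accuracy alone can look high for a model that rarely predicts the positive class, while F-Measure penalises missed positives.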

    A Comparative Performance Analysis of Hybrid and Classical Machine Learning Method in Predicting Diabetes

    Diabetes mellitus is one of medical science’s most important research topics because of the disease’s severe consequences. It is characterized by high blood glucose levels. Machine learning techniques make early detection of diabetes possible by predicting the disease accurately enough to help prevent its complications. Therefore, this study aims to find a machine learning approach that can more accurately predict diabetes. This study compares the performance of various classical machine learning models with the hybrid machine learning approach. The hybrid approach includes homogeneous models, comprising Random Forest, AdaBoost, XGBoost, Extra Trees, and Gradient Boosting, and a heterogeneous model that uses stacking ensemble methods. The stacking ensemble, or stacked generalization, approach uses a meta-classifier to combine the predictions of multiple learners. The performance of the homogeneous hybrid models, Stacked Generalization, and classical machine learning methods such as Naive Bayes, Multilayer Perceptron, k-Nearest Neighbour, and support vector machine is compared. The experimental analysis using the Pima Indians and early-stage diabetes datasets demonstrates that the hybrid models achieve higher accuracy in diagnosing diabetes than the classical models. Among all the hybrid models, the heterogeneous model using the Stacked Generalization approach outperformed the other models by achieving 83.9% and 98.5%. Doi: 10.28991/ESJ-2023-07-01-08

    Early Prediction of Gestational Diabetes with Parameter-Tuned K-Nearest Neighbor Classifier

    Diabetes is one of the fastest-spreading chronic diseases and causes health complications such as diabetic retinopathy, kidney failure, and cardiovascular disease. Recently, machine learning techniques have been widely applied to develop models for the early prediction of diabetes. Due to its simplicity and generalization capability, K-nearest neighbor (KNN) has been one of the most widely employed machine learning techniques for diabetes prediction. Early diabetes prediction plays a significant role in managing and preventing these complications. However, predicting diabetes at an early stage has remained challenging because of the limited accuracy and reliability of the KNN model. Thus, grid search hyperparameter optimization is employed to tune the K value of the KNN model and improve its effectiveness in predicting diabetes. The developed hyperparameter-tuned KNN model was tested on the diabetes dataset collected from the UCI machine learning data repository. The dataset contains 768 instances and 8 features. The study applied min-max scaling to the data before fitting the KNN model. The results revealed that KNN model performance improves when the hyperparameter is tuned. With hyperparameter tuning, the accuracy of KNN improves by 5.29%, reaching 82.5% overall accuracy for predicting diabetes at an early stage. Therefore, the developed KNN model can be applied to clinical decision-making for predicting diabetes at an early stage. The early identification of diabetes could aid in early intervention, personalized treatment plans, and reduced healthcare costs, and could lower associated risks such as retinopathy, kidney disease, and cardiovascular disease.
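
The pipeline described in this abstract (min-max scaling, then a grid search over K for a KNN classifier) can be sketched with the Python standard library alone. The abstract does not state the validation scheme, the candidate K values, or the distance metric, so leave-one-out validation, the grid `[1, 3, 5, 7]`, Euclidean distance, and the tiny two-feature dataset below are all assumptions made for illustration; they are not the study's data or settings.

```python
import math
from collections import Counter


def min_max_scale(rows):
    """Rescale each feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k):
    """Hold out each sample in turn and predict it from the remaining samples."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return hits / len(xs)


def grid_search_k(xs, ys, candidate_ks):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: leave_one_out_accuracy(xs, ys, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)  # first maximum wins on ties
    return best_k, scores


# Made-up (glucose, BMI) pairs with binary labels; not taken from the dataset.
raw = [[85, 22], [90, 24], [95, 23], [100, 25],
       [150, 33], [160, 35], [155, 31], [170, 36]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scaled = min_max_scale(raw)
best_k, scores = grid_search_k(scaled, labels, [1, 3, 5, 7])
```

Scaling matters for KNN because features with larger numeric ranges would otherwise dominate the distance. Note that this sketch scales the full dataset before validation for brevity; a stricter setup would fit the scaling bounds on the training portion of each split only.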