A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to its popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: The most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction has shown promising results. Ground truth (GT) selection for AI-based training, along with scientific and clinical validation, is very important for CVD stratification to avoid RoB. It was observed that the most popular classification paradigm is multiclass, followed by ensemble and multi-label. The use of deep learning techniques in CVD risk stratification is at a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. The choice of GT is most vital in AI-based models to prevent RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
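The difference between the multiclass and multi-label paradigms named in this abstract can be illustrated by how the training targets are encoded. The sketch below is only an illustration: the risk classes, outcome labels, and function names are hypothetical and are not taken from the review.

```python
# Hypothetical risk classes and outcome labels, chosen for illustration only.
RISK_CLASSES = ["low", "moderate", "high"]
OUTCOME_LABELS = ["coronary_artery_disease", "stroke", "heart_failure"]


def encode_multiclass(risk_class):
    """One-hot target: exactly one risk class is active per patient."""
    if risk_class not in RISK_CLASSES:
        raise ValueError(f"unknown risk class: {risk_class!r}")
    return [1 if c == risk_class else 0 for c in RISK_CLASSES]


def encode_multilabel(outcomes):
    """Multi-hot target: any number of outcome labels may be active."""
    unknown = set(outcomes) - set(OUTCOME_LABELS)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return [1 if label in outcomes else 0 for label in OUTCOME_LABELS]


if __name__ == "__main__":
    print(encode_multiclass("high"))
    print(encode_multilabel({"stroke", "heart_failure"}))
```

A multiclass model predicts one position of the first vector, whereas a multi-label model predicts each position of the second vector independently.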
Multi-Channel Stochastic Variational Inference for the Joint Analysis of Heterogeneous Biomedical Data in Alzheimer's Disease
The joint analysis of biomedical data in Alzheimer's Disease (AD) is
important for better clinical diagnosis and to understand the relationship
between biomarkers. However, jointly accounting for heterogeneous measures
poses important challenges related to the modeling of the variability and the
interpretability of the results. These issues are here addressed by proposing a
novel multi-channel stochastic generative model. We assume that a latent
variable generates the data observed through different channels (e.g., clinical
scores, imaging, ...) and describe an efficient way to estimate jointly the
distribution of both latent variable and data generative process. Experiments
on synthetic data show that the multi-channel formulation allows superior data
reconstruction as opposed to the single channel one. Moreover, the derived
lower bound of the model evidence represents a promising model selection
criterion. Experiments on AD data show that the model parameters can be used
for unsupervised patient stratification and for the joint interpretation of the
heterogeneous observations. Because of its general and flexible formulation, we
believe that the proposed method can find important applications as a general
data fusion technique. Comment: accepted for presentation at MLCN 2018 workshop, in conjunction with
MICCAI 2018, September 20, Granada, Spain
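For orientation, one way a lower bound of the kind mentioned in this abstract can be written is sketched here. The notation is an assumption introduced for illustration and does not come from the abstract: $C$ channels $\mathbf{x} = \{\mathbf{x}_1, \dots, \mathbf{x}_C\}$, a shared latent variable $\mathbf{z}$ with prior $p(\mathbf{z})$, channels conditionally independent given $\mathbf{z}$, and one approximate posterior $q(\mathbf{z} \mid \mathbf{x}_c)$ per channel.

```latex
% Each channel-specific posterior yields a valid evidence lower bound,
% because KL( q(z | x_c) || p(z | x) ) >= 0 for every channel c.
\ln p(\mathbf{x}) \;\ge\;
  \mathbb{E}_{q(\mathbf{z} \mid \mathbf{x}_c)}
    \Big[ \sum_{i=1}^{C} \ln p(\mathbf{x}_i \mid \mathbf{z}) \Big]
  - \mathrm{KL}\big( q(\mathbf{z} \mid \mathbf{x}_c) \,\|\, p(\mathbf{z}) \big),
  \qquad c = 1, \dots, C.

% Averaging the C bounds gives a single objective that is still a lower bound.
\mathcal{L} \;=\; \frac{1}{C} \sum_{c=1}^{C}
  \Big(
    \mathbb{E}_{q(\mathbf{z} \mid \mathbf{x}_c)}
      \Big[ \sum_{i=1}^{C} \ln p(\mathbf{x}_i \mid \mathbf{z}) \Big]
    - \mathrm{KL}\big( q(\mathbf{z} \mid \mathbf{x}_c) \,\|\, p(\mathbf{z}) \big)
  \Big)
  \;\le\; \ln p(\mathbf{x}).
```

Under these assumptions, the latent code inferred from any single channel is asked to reconstruct every channel, which is what ties the channels together.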
Decision trees and multi-level ensemble classifiers for neurological diagnostics
Cardiac autonomic neuropathy (CAN) is a well-known complication of diabetes that impairs the regulation of blood pressure and heart rate and increases the risk of cardiac-associated mortality in diabetes patients. The neurological diagnostics of CAN progression is an important problem that is being actively investigated. This paper uses data collected as part of a large and unique Diabetes Screening Complications Research Initiative (DiScRi) in Australia, with data from numerous tests related to diabetes, to classify CAN progression. The present paper is devoted to recent experimental investigations of the effectiveness of decision trees, ensemble classifiers and multi-level ensemble classifiers for neurological diagnostics of CAN. We present the results of experiments comparing the effectiveness of ADTree, J48, NBTree, RandomTree, REPTree and SimpleCart decision tree classifiers. Our results show that SimpleCart was the most effective for the DiScRi data set in classifying CAN. We also investigated and compared the effectiveness of AdaBoost, Bagging, MultiBoost, Stacking, Decorate, Dagging, and Grading, based on Ripple Down Rules as examples of ensemble classifiers. Further, we investigated the effectiveness of these ensemble methods as a function of the base classifiers, and determined that Random Forest performed best as a base classifier, and AdaBoost, Bagging and Decorate achieved the best outcomes as meta-classifiers in this setting. Finally, we investigated the meta-classifiers that performed best in their ability to enhance the performance further within the framework of a multi-level classification paradigm. Experimental results show that the multi-level paradigm performed best when Bagging and Decorate were combined in the construction of a multi-level ensemble classifier.
Efficient Feature Selection and ML Algorithm for Accurate Diagnostics
Machine learning algorithms have been deployed in numerous optimization, prediction and classification problems. This has made them attractive for application in fields such as computer networks and medical diagnosis. Although these machine learning algorithms achieve convincing results in these fields, they face numerous challenges when deployed on imbalanced datasets. Consequently, these algorithms are often biased towards the majority class and hence unable to generalize the learning process. In addition, they are unable to effectively deal with high-dimensional datasets. Moreover, the utilization of conventional feature selection techniques based on attribute significance renders them ineffective for the majority of diagnosis applications. In this paper, feature selection is executed using the more effective Neighbour Components Analysis (NCA). During the classification process, an ensemble classifier comprising K-Nearest Neighbours (KNN), Naive Bayes (NB), Decision Tree (DT) and Support Vector Machine (SVM) is built, trained and tested. Finally, cross validation is carried out to evaluate the developed ensemble model. The results show that the proposed classifier has the best performance in terms of precision, recall, F-measure and classification accuracy.
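The general pattern described in this abstract, a voting ensemble evaluated with cross validation, can be sketched in plain Python. This is not the paper's pipeline: the three base learners below (1-nearest-neighbour, nearest centroid, and a majority-class baseline) are simple stand-ins for KNN, NB, DT and SVM, the NCA feature selection step is omitted, and the toy data and all function names are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_1nn(X, y):
    """Predictor that copies the label of the closest training point."""
    def predict(x):
        dists = [sq_dist(x, xi) for xi in X]
        return y[dists.index(min(dists))]
    return predict


def fit_centroid(X, y):
    """Predictor that picks the class whose mean vector is nearest."""
    centroids = {}
    for label in set(y):
        rows = [xi for xi, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def fit_majority(X, y):
    """Baseline predictor that always returns the most common training label."""
    most_common = Counter(y).most_common(1)[0][0]
    return lambda x: most_common


def fit_ensemble(X, y, fitters=(fit_1nn, fit_centroid, fit_majority)):
    """Hard-voting ensemble: each member casts one vote per sample."""
    members = [fit(X, y) for fit in fitters]
    def predict(x):
        votes = Counter(member(x) for member in members)
        return votes.most_common(1)[0][0]
    return predict


def cross_val_accuracy(fit, X, y, k=4):
    """k-fold cross validation with interleaved folds; returns overall accuracy."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        model = fit(X_train, y_train)
        correct += sum(model(X[i]) == y[i] for i in test_idx)
    return correct / n


if __name__ == "__main__":
    # Toy, well-separated two-class data (illustrative only).
    X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
         [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.1, 5.0]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(cross_val_accuracy(fit_ensemble, X, y, k=4))
```

With an odd number of voters and two classes, no vote can tie; with more classes, a tie-breaking rule would need to be chosen explicitly.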
Data-driven battery aging diagnostics and prognostics
Lithium-ion (Li-ion) batteries play a pivotal role in transforming the transportation sector from heavily relying on fossil fuels to a low-carbon solution. But, as an electrochemical device, a battery will inevitably undergo irreversible degradation over time. Therefore, accurate and reliable aging diagnostics and prognostics become indispensable for safe and efficient battery usage during operation. However, diverse aging mechanisms, stochastic usage patterns, and cell-to-cell variations impose significant challenges. With the ever-increasing awareness of the importance of vehicle operating data, more and more automotive companies have started to collect field data. Meanwhile, the rapid advancement in computational power has drawn tremendous attention to using machine learning algorithms to solve complex and challenging tasks. In this thesis, recent data-driven modeling techniques, using both field data collected during vehicle operation and laboratory cycling data, are applied to improve the overall performance of battery aging diagnostics and prognostics. A series of data-driven methods are proposed ranging from battery state of health estimation, future aging trajectory prediction, and remaining useful life prediction. The algorithms are extensively evaluated with various data sources of different battery kinds. The evaluation results indicate that the developed methods are accurate and robust, but more importantly, they are applicable to the harsh conditions encountered in real-world vehicle operations
Bayesian Spatial Binary Regression for Label Fusion in Structural Neuroimaging
Many analyses of neuroimaging data involve studying one or more regions of
interest (ROIs) in a brain image. In order to do so, each ROI must first be
identified. Since every brain is unique, the location, size, and shape of each
ROI varies across subjects. Thus, each ROI in a brain image must either be
manually identified or (semi-) automatically delineated, a task referred to as
segmentation. Automatic segmentation often involves mapping a previously
manually segmented image to a new brain image and propagating the labels to
obtain an estimate of where each ROI is located in the new image. A more recent
approach to this problem is to propagate labels from multiple manually
segmented atlases and combine the results using a process known as label
fusion. To date, most label fusion algorithms either employ voting procedures
or impose prior structure and subsequently find the maximum a posteriori
estimator (i.e., the posterior mode) through optimization. We propose using a
fully Bayesian spatial regression model for label fusion that facilitates
direct incorporation of covariate information while making accessible the
entire posterior distribution. We discuss the implementation of our model via
Markov chain Monte Carlo and illustrate the procedure through both simulation
and application to segmentation of the hippocampus, an anatomical structure
known to be associated with Alzheimer's disease. Comment: 24 pages, 10 figures
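The voting procedures that this abstract contrasts with its Bayesian model can be illustrated with a minimal plurality-vote fusion. The sketch assumes the atlas labels have already been propagated to the target image and flattened to equally long sequences; the function name and the tie-breaking rule are hypothetical choices made for illustration, and this is not the method proposed in the paper.

```python
from collections import Counter


def majority_vote_fusion(atlas_labels):
    """Fuse per-voxel labels from several propagated atlases by plurality vote.

    atlas_labels: list of equally long label sequences, one per atlas.
    Ties are broken in favour of the smallest label value, which is a
    deterministic but arbitrary choice.
    """
    if not atlas_labels:
        raise ValueError("at least one atlas is required")
    n_voxels = len(atlas_labels[0])
    if any(len(labels) != n_voxels for labels in atlas_labels):
        raise ValueError("all atlases must label the same number of voxels")

    fused = []
    for votes in zip(*atlas_labels):  # one tuple of votes per voxel
        counts = Counter(votes)
        best = max(counts.values())
        fused.append(min(label for label, c in counts.items() if c == best))
    return fused


if __name__ == "__main__":
    # Three atlases voting on four voxels (1 = inside the ROI, 0 = outside).
    atlases = [[0, 1, 1, 0],
               [0, 1, 0, 0],
               [1, 1, 1, 0]]
    print(majority_vote_fusion(atlases))
```

A vote of this kind returns only a single label per voxel, whereas the abstract's stated aim is to make the entire posterior distribution accessible.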