1,163 research outputs found

    Machine Learning for Multiclass Classification and Prediction of Alzheimer's Disease

    Alzheimer's disease (AD) is an irreversible neurodegenerative disorder and a common form of dementia. This research aims to develop machine learning algorithms that diagnose and predict the progression of AD from multimodal heterogeneous biomarkers, with a focus on early diagnosis. To meet this goal, several machine learning-based methods, each with its own characteristics for feature extraction and automated classification, prediction, and visualization, have been developed to discern subtle progression trends and predict the trajectory of disease progression. The methodology aims to enhance both multiclass classification accuracy and prediction outcomes by effectively modeling the interplay between the multimodal biomarkers, handling the missing data challenge, and extracting the relevant features that will be fed into the machine learning framework, all in order to understand the subtle changes that occur in the different stages of the disease. This research also investigates the notion of multitasking to discover how the two processes of multiclass classification and prediction relate to one another in terms of the features they share, and whether they could learn from one another to optimize multiclass classification and prediction accuracy. This research also delves into predicting cognitive scores of specific tests over time, using multimodal longitudinal data. The intent is to improve the analysis of the interplay between the multimodal features used in the input space and the predicted cognitive scores. Moreover, the power of modality fusion, kernelization, and tensorization has also been investigated to efficiently extract important features hidden in the lower-dimensional feature space without being distracted by those deemed irrelevant.
With the adage that a picture is worth a thousand words, this dissertation introduces a unique color-coded visualization system with a fully integrated machine learning model for the enhanced diagnosis and prognosis of Alzheimer's disease. The aim here is to show that, through visualization, the challenges imposed by both the variability and interrelatedness of the multimodal features can be overcome. Ultimately, this form of visualization via machine learning sheds light on the challenges faced in multiclass classification and adds insight into the decision-making process for diagnosis and prognosis.

    Deep Learning for Multiclass Classification, Predictive Modeling and Segmentation of Disease Prone Regions in Alzheimer’s Disease

    One of the challenges facing accurate diagnosis and prognosis of Alzheimer’s Disease (AD) is identifying the subtle changes that define the early onset of the disease. This dissertation investigates three of the main challenges confronted when such subtle changes are to be identified in the most meaningful way. These are (1) the missing data challenge, (2) longitudinal modeling of disease progression, and (3) the segmentation and volumetric calculation of disease-prone brain areas in medical images. The scarcity of sufficient data, compounded by the missing data challenge in many longitudinal samples, exacerbates the problem as we seek statistical meaningfulness in multiclass classification and regression analysis. Although there are many participants in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, many of the observations have numerous missing features, which often leads to the exclusion of potentially valuable data points that could add significant meaning to many ongoing experiments. Motivated by the necessity of examining all participants, even those with missing tests or imaging modalities, multiple techniques for handling missing data in this domain have been explored. Specific attention was given to the Gradient Boosting (GB) algorithm, which has an inherent capability of addressing missing values. Prior to applying state-of-the-art classifiers such as Support Vector Machine (SVM) and Random Forest (RF), the impact of imputing data in common datasets with numerical techniques has also been investigated and compared with the GB algorithm. Furthermore, to discriminate among AD subjects, healthy control individuals, and subjects with Mild Cognitive Impairment (MCI), longitudinal multimodal heterogeneous data was modeled using recurrent neural networks (RNNs). In the segmentation and volumetric calculation challenge, this dissertation focuses on one of the most relevant disease-prone areas in many neurological and neurodegenerative diseases, the hippocampus region.
Changes in hippocampus shape and volume are considered significant biomarkers for AD diagnosis and prognosis. Thus, a two-stage model based on integrating the Vision Transformer and Convolutional Neural Network (CNN) is developed to automatically locate, segment, and estimate the hippocampus volume from the brain 3D MRI. The proposed architecture was trained and tested on a dataset containing 195 brain MRIs from the 2019 Medical Segmentation Decathlon Challenge against the manually segmented regions provided therein and was deployed on 326 MRIs from our own data collected through Mount Sinai Medical Center as part of the 1Florida Alzheimer Disease Research Center (ADRC).
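
The volumetric step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the segmentation stage yields a binary 3D mask and that the voxel spacing is known from the scan header; the function name, the toy mask, and the spacing values are hypothetical and are not taken from the dissertation.

```python
from itertools import chain

def mask_volume_mm3(mask, spacing_mm):
    """Volume of a binary 3D mask: foreground voxel count times voxel volume."""
    sx, sy, sz = spacing_mm
    voxels = chain.from_iterable(chain.from_iterable(mask))
    voxel_count = sum(1 for v in voxels if v)
    return voxel_count * sx * sy * sz

# Toy 2x2x2 mask with three foreground voxels (illustrative values only).
toy_mask = [[[1, 0], [0, 1]], [[0, 0], [1, 0]]]
volume = mask_volume_mm3(toy_mask, (1.0, 1.0, 1.5))  # spacing in millimetres
```

In practice the mask would come from the segmentation network and the spacing from the image metadata; this sketch only shows the arithmetic.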

    An Information Theoretic Approach For Feature Selection And Segmentation In Posterior Fossa Tumors

    A posterior fossa (PF) tumor is a type of brain tumor located in or near the brain stem and cerebellum. About 55%-70% of pediatric brain tumors arise in the posterior fossa, compared with only 15%-20% of adult tumors. Segmenting PF tumors requires features that capture the characteristics of the tumors. In the literature, different types of texture features such as Fractal Dimension (FD) and Multifractional Brownian Motion (mBm) have been exploited for measuring the randomness associated with brain and tumor tissue structures, and the varying appearance of tissues in magnetic resonance images (MRI). For selecting the best features, techniques such as neural networks and boosting methods have been exploited. However, a neural network cannot describe the properties of texture features. We explore information theoretic methods, which can perform feature selection based on the properties of texture features. The primary contribution of this dissertation is investigating the efficacy of different image features such as intensity, fractal texture, and level-set shape in the segmentation of PF tumors for pediatric patients. We explore the effectiveness of four different feature selection techniques and three different segmentation techniques in discriminating tumor regions from normal tissue in multimodal brain MRI. Our research suggests that the Kullback-Leibler Divergence (KLD) measure for feature ranking and selection and the Expectation Maximization (EM) algorithm for feature fusion and tumor segmentation offer the best performance for the patient data in this study. To improve segmentation accuracy, we need to consider abnormalities such as cyst, edema and necrosis which surround tumors. In this work, we exploit features that describe the properties of cysts and a technique that can be used to segment them.
To achieve this goal, we extend the two-class KLD techniques to multiclass feature selection techniques, so that we can effectively select features for tumor, cyst and non-tumor tissues. We compute segmentation accuracy as the ratio of the number of segmented pixels to the total number of pixels for the best features. For an automated process, we integrate the inhomogeneity correction, feature selection using KLD, and segmentation in an integrated EM framework. To validate the results, we have used similarity coefficients for computing the robustness of the segmented tumor and cyst.
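
As a rough illustration of KLD-based feature ranking, the sketch below scores each feature by the symmetric Kullback-Leibler divergence between its histograms in two tissue classes. It is a two-class sketch under stated assumptions, not the dissertation's multiclass extension; the binning, the smoothing constant `eps`, the feature names, and the sample values are all hypothetical.

```python
import math

def histogram(values, lo, hi, bins, eps=1e-6):
    """Normalized histogram; eps keeps every bin nonzero so log() stays finite."""
    counts = [eps] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def symmetric_kld(p, q):
    """KL(p||q) + KL(q||p) for two discrete distributions."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def rank_features(class_a, class_b, bins=8):
    """Rank features (name -> list of values per class) by class separation."""
    scores = {}
    for name in class_a:
        pooled = class_a[name] + class_b[name]
        lo, hi = min(pooled), max(pooled)
        if lo == hi:
            scores[name] = 0.0  # a constant feature carries no class information
            continue
        p = histogram(class_a[name], lo, hi, bins)
        q = histogram(class_b[name], lo, hi, bins)
        scores[name] = symmetric_kld(p, q)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative values only: intensity separates the classes, texture overlaps.
tumor = {"intensity": [0.80, 0.90, 0.85, 0.95], "texture": [0.40, 0.50, 0.60, 0.50]}
normal = {"intensity": [0.10, 0.20, 0.15, 0.25], "texture": [0.45, 0.50, 0.55, 0.60]}
ranking = rank_features(tumor, normal)
```

Features whose class-conditional histograms barely overlap receive the largest divergence and are ranked first.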

    Multiclass Classification of Brain MRI through DWT and GLCM Feature Extraction with Various Machine Learning Algorithms

    This study focuses on accurately classifying brain tumors to support informed clinical decisions and improve patient outcomes. Employing a diverse set of machine learning algorithms, the paper addresses the challenge of multiclass brain tumor classification. The investigation uses two distinct datasets: the BraTS dataset, encompassing cases of High-Grade Glioma (HGG) and Low-Grade Glioma (LGG), and the Sartaj dataset, comprising instances of Glioma, Meningioma, and No Tumor. Discrete Wavelet Transform (DWT) and Gray-Level Co-occurrence Matrix (GLCM) features are combined with Support Vector Machines (SVM), k-nearest Neighbors (KNN), Decision Trees (DT), Random Forest, and Gradient Boosting algorithms to explore avenues for achieving precise tumor classification. Before classification, the datasets undergo pre-processing and feature extraction, which yields frequency-domain characteristics from the DWT and texture descriptors from the GLCM. A detailed description of the selected algorithms is then provided, including their pertinent hyperparameters. The results reveal notable performance differences across algorithms and datasets. SVM and Random Forest achieve commendable accuracy on the BraTS dataset, while Gradient Boosting performs best on the Sartaj dataset. The evaluation covers precision, recall, and F1-score, providing a comprehensive assessment of the classification performance of the employed algorithms.
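
To make the GLCM texture features concrete, here is a minimal pure-Python sketch that builds a normalized co-occurrence matrix for one pixel offset and derives two common descriptors from it. The offset, the four gray levels, and the toy image are assumptions for illustration; the paper's actual feature set and parameters are not given in the abstract.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for a single pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def contrast(p):
    """Weights each co-occurrence by the squared gray-level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))

def angular_second_moment(p):
    """Sum of squared matrix entries; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)

# Toy 4x4 image quantized to 4 gray levels (illustrative values only).
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
p = glcm(toy, levels=4)          # horizontal neighbor, distance 1
features = [contrast(p), angular_second_moment(p)]
```

Such descriptors, computed for several offsets, would typically be concatenated with the DWT coefficients before being passed to a classifier.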

    Ensemble of random forests One vs. Rest classifiers for MCI and AD prediction using ANOVA cortical and subcortical feature selection and partial least squares.

    Background: Alzheimer’s disease (AD) is the most common cause of dementia in the elderly and affects approximately 30 million individuals worldwide. Mild cognitive impairment (MCI) is very frequently a prodromal phase of AD, and existing studies have suggested that people with MCI tend to progress to AD at a rate of about 10% to 15% per year. However, predicting AD at an early stage from MRI biomarkers remains a challenging problem for both clinicians and machine learning systems, and solving it could have a great impact on improving treatments. Method: The proposed system, developed by the SiPBA-UGR team for this challenge, is based on feature standardization, ANOVA feature selection, partial least squares feature dimension reduction and an ensemble of one vs. rest random forest classifiers. With the aim of improving its performance when discriminating healthy controls (HC) from MCI, a second binary classification level was introduced that reconsiders the HC and MCI predictions of the first level. Results: The system was trained and evaluated on an ADNI dataset that consists of T1-weighted MRI morphological measurements from HC, stable MCI, converter MCI and AD subjects. The proposed system yields a 56.25% classification score on the test subset, which consists of 160 real subjects. Comparison with Existing Method(s): The classifier yielded the best performance when compared to: i) One vs. One (OvO), One vs. Rest (OvR) and error correcting output codes (ECOC) as strategies for reducing the multiclass classification task to multiple binary classification problems, ii) support vector machines, gradient boosting classifier and random forest as base binary classifiers, and iii) bagging ensemble learning.
Conclusions: A robust method has been proposed for the international challenge on MCI prediction based on MRI data.
This work was supported by the MINECO/FEDER under TEC2015-64718-R project, the Consejería de Economía, Innovacion, Ciencia, y Empleo of the Junta de Andalucía under the P11-TIC-7103 Excellence Project and the Salvador de Madariaga Mobility Grants 2017.
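
The ANOVA feature selection step named in the Method can be sketched as follows: compute a one-way F statistic per feature across the diagnostic groups and keep the highest-scoring features. This is only a sketch of that single step under assumed inputs; the feature names and values are hypothetical, and the partial least squares reduction and the random forest ensemble are not shown.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for one feature; groups holds per-class values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")  # groups are perfectly separated constants
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def select_top(features, top_k):
    """Keep the top_k feature names ranked by F statistic."""
    ranked = sorted(features, key=lambda name: anova_f(features[name]), reverse=True)
    return ranked[:top_k]

# Illustrative values for three groups (for example HC, MCI, AD).
features = {
    "feature_a": [[1, 2, 3], [2, 3, 4], [5, 6, 7]],  # group means differ
    "feature_b": [[1, 5, 3], [2, 4, 3], [3, 1, 5]],  # identical group means
}
f_a = anova_f(features["feature_a"])
selected = select_top(features, top_k=1)
```

A feature whose group means differ strongly relative to its within-group spread gets a large F value and survives the selection.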

    Categorical classifiers in multiclass classification with imbalanced datasets

    This paper discusses, in a multiclass classification setting, the choice of the so-called categorical classifier, which is the procedure or criterion that transforms the probabilities produced by a probabilistic classifier into a single category or class. The standard choice is the Bayes Classifier (BC), but it has limitations with rare classes. This paper compares the classification performance of the BC with that of two alternatives, the Max Difference Classifier (MDC) and the Max Ratio Classifier (MRC), through an extensive simulation and some case studies. The results show that both MDC and MRC are preferable to BC in a multiclass setting with imbalanced data.
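
The three decision rules can be contrasted with a small sketch. The abstract does not define MDC and MRC, so the code assumes one plausible reading: each predicted probability is compared with a per-class reference value (here the class prevalence), by difference for MDC and by ratio for MRC. The function names and numbers are hypothetical.

```python
def bayes_classifier(probs):
    """BC: pick the class with the largest predicted probability."""
    return max(range(len(probs)), key=lambda k: probs[k])

def max_difference_classifier(probs, priors):
    """MDC (assumed form): largest gain of probability over the class reference."""
    return max(range(len(probs)), key=lambda k: probs[k] - priors[k])

def max_ratio_classifier(probs, priors):
    """MRC (assumed form): largest ratio of probability to the class reference."""
    return max(range(len(probs)), key=lambda k: probs[k] / priors[k])

# Imbalanced three-class setting: class 2 is rare (illustrative values only).
priors = [0.70, 0.25, 0.05]
probs = [0.55, 0.25, 0.20]   # output of some probabilistic classifier

bc = bayes_classifier(probs)
mdc = max_difference_classifier(probs, priors)
mrc = max_ratio_classifier(probs, priors)
```

Here BC returns the majority class, while the two reference-based rules return the rare class because its probability is four times its prevalence.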

    Deep ensemble multitask classification of emergency medical call incidents combining multimodal data improves emergency medical dispatch

    The objective of this work was to develop a predictive model to help non-clinical dispatchers classify emergency medical call incidents by their life-threatening level (yes/no), admissible response delay (undelayable, minutes, hours, days) and emergency system jurisdiction (emergency system/primary care) in real time. We used a total of 1 244 624 independent incidents from the Valencian emergency medical dispatch service in Spain, compiled retrospectively from 2009 to 2012, including clinical features, demographics, circumstantial factors and free text dispatcher observations. Based on them, we designed and developed DeepEMC2, a deep ensemble multitask model integrating four subnetworks: three specialized to context, clinical and text data, respectively, and another to ensemble the former. The four subnetworks are composed in turn of multi-layer perceptron modules, bidirectional long short-term memory units and a bidirectional encoder representations from transformers module. DeepEMC2 showed a macro F1-score of 0.759 in life-threatening classification, 0.576 in admissible response delay and 0.757 in emergency system jurisdiction. These results show a substantial performance increase of 12.5 %, 17.5 % and 5.1 %, respectively, with respect to the current in-house triage protocol of the Valencian emergency medical dispatch service. In addition, DeepEMC2 significantly outperformed a set of baseline machine learning models, including naive Bayes, logistic regression, random forest and gradient boosting (α = 0.05). Hence, DeepEMC2 is able to: 1) capture information present in emergency medical calls not considered by the existing triage protocol, and 2) model complex data dependencies not feasible for the tested baseline models. Likewise, our results suggest that most of this unconsidered information is present in the free text dispatcher observations.
To our knowledge, this study describes the first deep learning model undertaking emergency medical call incident classification. Its adoption in medical dispatch centers would potentially improve emergency dispatch processes, resulting in a positive impact on patient wellbeing and health services sustainability.
This work has been supported by the Valencian agency for security and emergency response project A1800173041, the Ministry of Science, Innovation and Universities of Spain program FPU18/06441 and the EU Horizon 2020 project InAdvance 825750.
Ferri-Borredà, P.; Sáez Silvestre, C.; Felix-De Castro, A.; Juan-Albarracín, J.; Blanes-Selva, V.; Sánchez-Cuesta, P.; Garcia-Gomez, JM. (2021). Deep ensemble multitask classification of emergency medical call incidents combining multimodal data improves emergency medical dispatch. Artificial Intelligence in Medicine. 117:1-13. https://doi.org/10.1016/j.artmed.2021.102088
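
Since the results above are reported as macro F1-scores, a short sketch of that metric may help: the F1-score is computed per class and then averaged with equal weight, so rare classes count as much as frequent ones. The labels reuse the response-delay categories from the abstract, but the example predictions are invented for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Invented example: one "hours" incident is misclassified as "minutes".
y_true = ["undelayable", "minutes", "hours", "days", "minutes", "hours"]
y_pred = ["undelayable", "minutes", "minutes", "days", "minutes", "hours"]
score = macro_f1(y_true, y_pred)
```

Per class this gives 1.0 for "days" and "undelayable", 2/3 for "hours" and 0.8 for "minutes", which are then averaged.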

    PAC-Bayesian Majority Vote for Late Classifier Fusion

    A lot of attention has been devoted to multimedia indexing over the past few years. In the literature, two kinds of fusion schemes are often considered: early fusion and late fusion. In this paper we focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate MinCq, a recent and well-founded quadratic program from PAC-Bayesian machine learning theory. MinCq looks for the weighted combination, over a set of real-valued functions seen as voters, leading to the lowest misclassification rate, while making use of the voters' diversity. We provide evidence that this method is naturally adapted to the late fusion procedure. We propose an extension of MinCq that adds an order-preserving pairwise loss for ranking, which helps improve the Mean Average Precision measure. We confirm the good behavior of the MinCq-based fusion approaches with experiments on a real image benchmark.
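
The decision-level combination described here can be sketched as a weighted majority vote over per-modality scores. Note the assumption: the weights below are fixed by hand purely for illustration, whereas MinCq would learn them by solving its quadratic program, which is not reproduced here. The function name and values are hypothetical.

```python
def weighted_majority_vote(scores, weights):
    """Fuse real-valued voter outputs in [-1, 1] into a label and a margin."""
    margin = sum(w * s for w, s in zip(weights, scores))
    label = 1 if margin >= 0 else -1
    return label, margin

# Three modality-specific voters scoring one sample (illustrative values only).
scores = [0.9, -0.2, -0.3]
weights = [0.5, 0.3, 0.2]   # hand-picked; a learned weighting would replace this

label, margin = weighted_majority_vote(scores, weights)
unweighted_label, _ = weighted_majority_vote(scores, [1 / 3] * 3)
```

With these weights the confident first voter outweighs the two weakly negative ones, so the fused label is positive.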