
    Clustering-based Source-aware Assessment of True Robustness for Learning Models

    We introduce a novel validation framework to measure the true robustness of learning models for real-world applications by creating source-inclusive and source-exclusive partitions in a dataset via clustering. We develop a robustness metric derived from source-aware lower and upper bounds of model accuracy even when data source labels are not readily available. We clearly demonstrate that even on a well-explored dataset like MNIST, challenging training scenarios can be constructed under the proposed assessment framework for two separate yet equally important applications: i) more rigorous learning model comparison and ii) dataset adequacy evaluation. In addition, our findings not only promise a more complete identification of trade-offs between model complexity, accuracy and robustness but can also help researchers optimize their efforts in data collection by identifying the less robust and more challenging class labels. Comment: Submitted to UAI 201
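    As a rough illustration of the idea (not the authors' implementation), the sketch below clusters a dataset into pseudo-sources, then compares a source-inclusive split against source-exclusive splits to obtain upper- and lower-bound style accuracy estimates. The dataset, classifier, and number of clusters are arbitrary placeholders.

```python
# Illustrative sketch (not the authors' code): use clustering to define
# pseudo-sources, then compare source-inclusive vs source-exclusive accuracy.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

# Pseudo-source labels obtained by clustering the inputs.
sources = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

def accuracy(train_idx, test_idx):
    clf = LogisticRegression(max_iter=2000).fit(X[train_idx], y[train_idx])
    return clf.score(X[test_idx], y[test_idx])

# Source-inclusive split (upper-bound style estimate): every pseudo-source
# appears in both the training and the test data.
tr, te = train_test_split(np.arange(len(X)), test_size=0.2, random_state=0,
                          stratify=sources)
upper = accuracy(tr, te)

# Source-exclusive splits (lower-bound style estimate): hold out one whole
# pseudo-source at a time and average the resulting accuracies.
lower = np.mean([accuracy(np.where(sources != s)[0], np.where(sources == s)[0])
                 for s in np.unique(sources)])

print(f"source-inclusive accuracy ~ {upper:.3f}, source-exclusive ~ {lower:.3f}")
```

    The gap between the two numbers plays the role of a robustness indicator: the larger it is, the more the model depends on seeing every source during training.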

    Machine learning model selection with multi-objective Bayesian optimization and reinforcement learning

    A machine learning system, including one used for reinforcement learning, is usually fed with only limited data while being expected to train a model with good predictive performance that generalizes to an underlying data distribution. Within certain hypothesis classes, model selection chooses a model based on selection criteria calculated from the available data, which usually serve as estimators of the model's generalization performance. One major challenge for model selection that has drawn increasing attention is the discrepancy between the data distribution from which training data are sampled and the data distribution at deployment. The model can overfit the training distribution and fail to extrapolate to unseen deployment distributions, which can greatly harm the reliability of a machine learning system. Such distribution shift can become even more pronounced in high-dimensional data types like gene expression data, functional data, and image data, especially in decentralized learning scenarios. Another challenge for model selection is efficient search in the hypothesis space. Since training a machine learning model usually takes a fair amount of resources, searching for an appropriate model with favorable configurations is inherently an expensive process, calling for efficient optimization algorithms. To tackle the challenge of distribution shift, novel resampling methods for evaluating the robustness of neural networks are proposed, as well as a domain generalization method that uses multi-objective Bayesian optimization in a decentralized learning scenario and variational inference in a domain-unsupervised manner. To tackle the expensive model search problem, Bayesian optimization and reinforcement learning are combined in an interleaved manner for efficient search in a hierarchical conditional configuration space. Additionally, the effectiveness of multi-objective Bayesian optimization for model search in decentralized learning scenarios is proposed and verified. A model selection perspective on reinforcement learning is proposed, with associated contributions to tackling exploration in high-dimensional state-action spaces and sparse rewards, and connections between statistical inference and control are summarized. Finally, contributions are made to open-source software development in related machine learning sub-topics, such as feature selection and functional data analysis with advanced tuning methods and extensive benchmarking.
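    As a rough illustration of one ingredient of this line of work (not the thesis's implementation), the following is a minimal sketch of multi-objective Bayesian optimization via random scalarization for a toy model selection problem. The dataset, the two objectives (validation error and a normalized model size), the single n_trees hyperparameter, and all ranges are assumptions made for the example.

```python
# Minimal multi-objective Bayesian optimization via random scalarization:
# each iteration draws random weights, combines the two objectives, fits a
# GP surrogate, and proposes the next configuration with an LCB acquisition.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

def objectives(n_trees):
    """Return (validation error, normalized model size) for a configuration."""
    clf = RandomForestClassifier(n_estimators=int(n_trees), random_state=0)
    err = 1.0 - cross_val_score(clf, X, y, cv=3).mean()
    return err, n_trees / 200.0

# Initial random design over the 1-D configuration space n_trees in [5, 200].
configs = rng.integers(5, 200, size=5).astype(float)
evals = np.array([objectives(c) for c in configs])

for _ in range(10):
    w = rng.dirichlet([1.0, 1.0])              # random scalarization weights
    scalarized = evals @ w                     # combine the two objectives
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(configs.reshape(-1, 1), scalarized)

    # Lower-confidence-bound acquisition over a candidate grid.
    cand = np.linspace(5, 200, 100)
    mu, sd = gp.predict(cand.reshape(-1, 1), return_std=True)
    nxt = cand[np.argmin(mu - 1.0 * sd)]

    configs = np.append(configs, nxt)
    evals = np.vstack([evals, objectives(nxt)])

best = configs[np.argmin(evals[:, 0])]
print(f"lowest-error configuration found: n_trees = {int(best)}")
```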

    Deep Convolutional Correlation Particle Filter for Visual Tracking

    In this dissertation, we explore the advantages and limitations of applying sequential Monte Carlo methods to visual tracking, a challenging computer vision problem. We propose six visual tracking models, each of which integrates a particle filter, a deep convolutional neural network, and a correlation filter. In our first model, we generate an image patch corresponding to each particle and use a convolutional neural network (CNN) to extract features from the corresponding image region. A correlation filter then computes the correlation response maps corresponding to these features, which are used to determine the particle weights and estimate the state of the target. We then introduce a particle filter that extends the target state by incorporating its size information. This model also utilizes a new adaptive correlation filtering approach that generates multiple target models to account for potential model update errors. We build upon that strategy to devise an adaptive particle filter that can decrease the number of particles in simple frames, in which there are no challenging scenarios and the target model closely reflects the current appearance of the target. This strategy allows us to reduce the computational cost of the particle filter without negatively impacting its performance. This tracker also improves the likelihood model by generating multiple target models using varying model update rates based on the high-likelihood particles. We also propose a novel likelihood particle filter for CNN-correlation visual trackers. Our method uses correlation response maps to estimate likelihood distributions and employs these likelihoods as proposal densities to sample particles. Additionally, our particle filter searches for multiple modes in the likelihood distribution using a Gaussian mixture model. We further introduce an iterative particle filter that performs iterations to decrease the distance between particles and the peaks of their correlation maps, leaving a few more accurate particles at the end of the iterations. Applying K-means clustering to the remaining particles determines the number of clusters, which is used in the evaluation step to find the target state. Our approach ensures consistent support for the posterior distribution, so we do not need to perform resampling at every video frame, improving the utilization of prior distribution information. Finally, we introduce a novel framework that calculates the confidence score of the tracking algorithm at each video frame based on the correlation response maps of the particles. Our framework applies different model update rules according to the calculated confidence score, reducing tracking failures caused by model drift. The benefits of each of the proposed techniques are demonstrated through experiments on publicly available benchmark datasets.
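    As a toy illustration of how particle weights can be derived from correlation responses (not the dissertation's trackers), the sketch below weights each particle by the peak of a stand-in correlation response, estimates the target state as the weighted mean, and resamples. The Gaussian response function and all parameters are placeholders.

```python
# Toy CNN-correlation particle filter step: weight particles by the peak of
# their (stand-in) correlation response map, estimate the state, resample.
import numpy as np

rng = np.random.default_rng(0)
true_pos = np.array([60.0, 40.0])                 # toy ground-truth target centre
particles = rng.uniform(0, 100, size=(200, 2))    # candidate (x, y) states
weights = np.full(len(particles), 1.0 / len(particles))

def correlation_peak(pos):
    """Stand-in for: crop patch at pos -> CNN features -> correlation filter
    response map -> peak of the map. Here a Gaussian of the distance to the
    true target plays that role."""
    return np.exp(-np.sum((pos - true_pos) ** 2) / (2 * 15.0 ** 2))

# Measurement update: weight particles by their correlation peak.
weights *= np.array([correlation_peak(p) for p in particles])
weights /= weights.sum()

# State estimate as the weighted mean of the particles.
estimate = weights @ particles
print("estimated target position:", estimate)

# Systematic resampling keeps the support on high-likelihood regions.
positions = (np.arange(len(particles)) + rng.uniform()) / len(particles)
indices = np.searchsorted(np.cumsum(weights), positions)
particles = particles[indices]
```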

    The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection

    Machine Learning (ML) has emerged as a promising approach in healthcare, outperforming traditional statistical techniques. However, to establish ML as a reliable tool in clinical practice, adherence to best practices regarding data handling, experimental design, and model evaluation is crucial. This work summarizes and strictly observes such practices to ensure reproducible and reliable ML. Specifically, we focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare. We investigate the impact of different data augmentation techniques and model complexity on overall performance. We consider MRI data from the ADNI dataset and address a classification problem using a 3D Convolutional Neural Network (CNN). The experiments are designed to compensate for data scarcity and initial random parameters by utilizing cross-validation and multiple training trials. Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures that vary in the number of convolutional layers. Specifically, the augmentation strategies are based on affine transformations, such as zoom, shift, and rotation, applied either concurrently or separately. The combined effect of data augmentation and model complexity leads to a variation in prediction performance of up to 10% in accuracy. When the affine transformations are applied separately, the model is more accurate, independently of the adopted architecture. For all strategies, model accuracy follows a concave behavior as the number of convolutional layers increases, peaking at an intermediate number of layers. The best model (8 CL, (B)) is the most stable across cross-validation folds and training trials, reaching excellent performance both on the testing set and on an external test set.
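    The two augmentation regimes compared in the study can be illustrated as follows: the sketch applies zoom, shift, and rotation to a synthetic 3D volume either one at a time or chained together. The parameter ranges and volume size are assumptions for illustration, not the paper's settings.

```python
# Illustrative affine augmentation for a 3D volume: transforms applied
# separately (one per sample) vs concurrently (all chained on one sample).
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
volume = rng.normal(size=(64, 64, 64))   # stand-in for a preprocessed MRI volume

def random_zoom(vol):
    f = rng.uniform(0.9, 1.1)
    zoomed = ndimage.zoom(vol, f, order=1)
    # Crop or zero-pad back to the original shape so batch dimensions stay fixed.
    out = np.zeros_like(vol)
    s = tuple(slice(0, min(a, b)) for a, b in zip(vol.shape, zoomed.shape))
    out[s] = zoomed[s]
    return out

def random_shift(vol):
    return ndimage.shift(vol, rng.uniform(-4, 4, size=3), order=1)

def random_rotation(vol):
    return ndimage.rotate(vol, rng.uniform(-10, 10), axes=(0, 1),
                          reshape=False, order=1)

def augment_separately(vol):
    """Apply exactly one randomly chosen affine transform per sample."""
    transforms = [random_zoom, random_shift, random_rotation]
    return transforms[rng.integers(len(transforms))](vol)

def augment_concurrently(vol):
    """Chain all three transforms on the same sample."""
    return random_rotation(random_shift(random_zoom(vol)))

sample_a = augment_separately(volume)
sample_b = augment_concurrently(volume)
print(sample_a.shape, sample_b.shape)
```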

    MRI Artefact Augmentation: Robust Deep Learning Systems and Automated Quality Control

    Quality control (QC) of magnetic resonance imaging (MRI) is essential to establish whether a scan or dataset meets a required set of standards. In MRI, many potential artefacts must be identified so that problematic images can either be excluded or accounted for in further image processing or analysis. To date, the gold standard for identifying these issues is visual inspection by experts. A primary source of MRI artefacts is patient movement, which can affect clinical diagnosis and impact the accuracy of deep learning systems. In this thesis, I present a method to simulate motion artefacts from artefact-free images in order to augment convolutional neural networks (CNNs), increasing the appearance variability seen during training and the robustness to motion artefacts. I show that models trained with artefact augmentation generalise better and are more robust to real-world artefacts, with negligible cost to performance on clean data. I argue that it is often better to optimise frameworks end-to-end with artefact augmentation than to learn to retrospectively remove artefacts, thus enforcing robustness to artefacts at the feature-level representation of the data. The labour-intensive and subjective nature of QC has increased interest in automated methods. To address this, I approach MRI quality estimation as the uncertainty in performing a downstream task, using probabilistic CNNs to predict segmentation uncertainty as a function of the input data. Extending this framework, I introduce a novel decoupled uncertainty model, enabling separate uncertainty predictions for different types of image degradation. Trained with an extended k-space artefact augmentation pipeline, the model provides informative measures of uncertainty on problematic real-world scans classified by QC raters and enables sources of segmentation uncertainty to be identified. Suitable quality for algorithmic processing may differ from an image's perceptual quality. Exploring this, I pose MRI visual quality assessment as an image restoration task. Using Bayesian CNNs to recover clean images from noisy data, I show that the uncertainty indicates the possible recoverability of an image. A multi-task network combining uncertainty-aware artefact recovery with tissue segmentation highlights the distinction between visual and algorithmic quality, implying that, depending on the downstream task, less data should be discarded for purely visual quality reasons.
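    As a simplified illustration of this kind of k-space motion-artefact augmentation (assuming a 2D slice and line-by-line Cartesian acquisition; not the thesis's pipeline), the sketch below injects random in-plane translations as phase ramps on a subset of k-space lines and reconstructs the corrupted image by inverse FFT.

```python
# Simplified motion-artefact simulation: a translation in image space is a
# linear phase ramp in k-space, applied here to a random subset of lines.
import numpy as np

rng = np.random.default_rng(0)
clean = rng.normal(size=(128, 128))      # stand-in for an artefact-free slice

kspace = np.fft.fftshift(np.fft.fft2(clean))
ny, nx = kspace.shape
freq_x = np.fft.fftshift(np.fft.fftfreq(nx))

# Corrupt a random quarter of the phase-encode lines with random shifts.
corrupted = kspace.copy()
lines = rng.choice(ny, size=ny // 4, replace=False)
for line in lines:
    dx = rng.uniform(-3, 3)              # simulated in-plane motion in pixels
    corrupted[line, :] *= np.exp(-2j * np.pi * freq_x * dx)

with_artefact = np.abs(np.fft.ifft2(np.fft.ifftshift(corrupted)))
print("augmented slice shape:", with_artefact.shape)
```

    Because the corruption is generated from an artefact-free image, the clean image can serve as the training target or as the reference label for artefact-robust training.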