
    Frequency Dropout: Feature-Level Regularization via Randomized Filtering

    Deep convolutional neural networks have shown remarkable performance on various computer vision tasks, yet they are susceptible to picking up spurious correlations from the training signal. So-called 'shortcuts' can occur during learning, for example, when specific frequencies present in the image data correlate with the output predictions. Both high and low frequencies can be characteristic of the underlying noise distribution caused by the image acquisition rather than of the task-relevant image content. Models that learn features related to this characteristic noise will not generalize well to new data. In this work, we propose a simple yet effective training strategy, Frequency Dropout, to prevent convolutional neural networks from learning frequency-specific imaging features. We employ randomized filtering of feature maps during training, which acts as a feature-level regularization. In this study, we consider common image processing filters such as Gaussian smoothing, Laplacian of Gaussian, and Gabor filtering. Our training strategy is model-agnostic and can be used for any computer vision task. We demonstrate the effectiveness of Frequency Dropout on a range of popular architectures and multiple tasks, including image classification, domain adaptation, and semantic segmentation, using both computer vision and medical imaging datasets. Our results suggest that the proposed approach not only improves predictive accuracy but also improves robustness against domain shift.
    Comment: 15 pages
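    The abstract describes randomized filtering of feature maps as a feature-level regularizer. Below is a minimal PyTorch sketch of that idea; the kernel size, sigma range, drop probability, and restriction to Gaussian and Laplacian-of-Gaussian filters (Gabor omitted for brevity) are illustrative assumptions, not the authors' implementation.

        # A minimal sketch of feature-level regularization via randomized filtering,
        # in the spirit of Frequency Dropout; kernel sizes, the sigma range, and the
        # filter set are assumptions, not the paper's exact implementation.
        import random
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def gaussian_kernel(sigma, size=5):
            """2-D Gaussian smoothing kernel (low-pass)."""
            ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
            xx, yy = torch.meshgrid(ax, ax, indexing="ij")
            k = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
            return k / k.sum()

        def log_kernel(sigma, size=5):
            """Laplacian-of-Gaussian kernel (band-pass), normalized to zero mean."""
            ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
            xx, yy = torch.meshgrid(ax, ax, indexing="ij")
            r2 = xx ** 2 + yy ** 2
            k = (r2 - 2 * sigma ** 2) / sigma ** 4 * torch.exp(-r2 / (2 * sigma ** 2))
            return k - k.mean()

        class FrequencyDropout(nn.Module):
            """With probability p, convolve every channel of the feature map with a
            randomly drawn low- or band-pass kernel; otherwise pass it through."""
            def __init__(self, p=0.5, size=5):
                super().__init__()
                self.p, self.size = p, size

            def forward(self, x):
                if not self.training or random.random() > self.p:
                    return x
                sigma = random.uniform(0.5, 2.0)          # random cut-off frequency
                kernel = random.choice([gaussian_kernel, log_kernel])(sigma, self.size)
                c = x.shape[1]
                weight = kernel.to(x).expand(c, 1, -1, -1).contiguous()
                # depthwise convolution: the same random kernel filters each channel
                return F.conv2d(x, weight, padding=self.size // 2, groups=c)

        # Usage: drop it between existing blocks, e.g.
        # block = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), FrequencyDropout(0.5))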

    Learning scale-variant and scale-invariant features for deep image classification

    Convolutional Neural Networks (CNNs) require large image corpora to be trained on classification tasks. Variation in image resolution, in the sizes of depicted objects and patterns, and in image scale hampers CNN training and performance, because the task-relevant information varies over spatial scales. Previous work attempting to deal with such scale variations focused on encouraging scale-invariant CNN representations. However, scale-invariant representations are incomplete representations of images, because images contain scale-variant information as well. This paper addresses the combined development of scale-invariant and scale-variant representations. We propose a multi-scale CNN method to encourage the recognition of both types of features and evaluate it on a challenging image classification task involving task-relevant characteristics at multiple scales. The results show that our multi-scale CNN outperforms a single-scale CNN, leading to the conclusion that encouraging the combined development of scale-invariant and scale-variant representations in CNNs is beneficial to image recognition performance.
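    The abstract describes a multi-scale CNN that develops both scale-invariant and scale-variant features. The PyTorch sketch below shows one plausible reading: a weight-shared column applied to several rescaled inputs (encouraging scale invariance) next to per-scale columns (retaining scale-variant cues); the layer sizes, scales, and fusion by concatenation are assumptions, not the paper's exact architecture.

        # A sketch of a multi-scale CNN: a weight-shared column applied to rescaled
        # inputs encourages scale-invariant features, while per-scale columns keep
        # scale-variant ones; sizes, scales, and concatenation fusion are assumptions.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def conv_column(in_ch=3, out_ch=64):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, out_ch, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),          # pool to a fixed-length descriptor
            )

        class MultiScaleCNN(nn.Module):
            def __init__(self, scales=(1.0, 0.5, 0.25), num_classes=10):
                super().__init__()
                self.scales = scales
                self.shared = conv_column()       # weights shared across scales (scale-invariant path)
                self.per_scale = nn.ModuleList(conv_column() for _ in scales)  # scale-variant paths
                self.fc = nn.Linear(64 * 2 * len(scales), num_classes)

            def forward(self, x):
                feats = []
                for s, column in zip(self.scales, self.per_scale):
                    xs = F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
                    feats.append(self.shared(xs).flatten(1))   # same weights at every scale
                    feats.append(column(xs).flatten(1))        # scale-specific weights
                return self.fc(torch.cat(feats, dim=1))

        # Usage: logits = MultiScaleCNN()(torch.rand(2, 3, 64, 64))  # -> shape (2, 10)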

    Hand gesture recognition based on signals cross-correlation


    Hopfield Networks in Relevance and Redundancy Feature Selection Applied to Classification of Biomedical High-Resolution Micro-CT Images

    We study filter-based feature selection methods for the classification of biomedical images. For feature selection, we use two filters: a relevance filter, which measures the usefulness of individual features for target prediction, and a redundancy filter, which measures the similarity between features. As a selection method that combines relevance and redundancy, we use a Hopfield network. We experimentally compare selection based on the unitary relevance and redundancy filters against a greedy algorithm with redundancy thresholds [9], min-redundancy max-relevance integration [8,23,36], and our Hopfield network selection. We conclude that, on the whole, Hopfield selection was one of the most successful methods, outperforming min-redundancy max-relevance when more features are selected.
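    The abstract combines a relevance filter and a redundancy filter through a Hopfield network. The NumPy sketch below shows one way such a selection could look: neurons correspond to features, relevance acts as an excitatory bias, and pairwise redundancy provides inhibitory couplings. The correlation-based relevance and redundancy measures, the weight lam, and the update schedule are assumptions, not the paper's exact formulation.

        # A sketch of Hopfield-style relevance/redundancy feature selection: neurons
        # are features, relevance is an excitatory bias, pairwise redundancy gives
        # inhibitory couplings. The correlation-based measures, the weight lam, and
        # the asynchronous update schedule are assumptions.
        import numpy as np

        def hopfield_select(X, y, lam=1.0, n_iter=50, seed=0):
            """X: (n_samples, n_features), y: (n_samples,). Returns a boolean mask."""
            rng = np.random.default_rng(seed)
            n_feat = X.shape[1]
            # Relevance filter: absolute correlation between each feature and the target.
            relevance = np.abs([np.corrcoef(X[:, i], y)[0, 1] for i in range(n_feat)])
            # Redundancy filter: absolute pairwise feature correlations, zero diagonal.
            redundancy = np.abs(np.corrcoef(X, rowvar=False))
            np.fill_diagonal(redundancy, 0.0)
            state = rng.integers(0, 2, size=n_feat)        # random initial selection
            for _ in range(n_iter):                        # asynchronous neuron updates
                for i in rng.permutation(n_feat):
                    field = relevance[i] - lam * redundancy[i] @ state
                    state[i] = 1 if field > 0 else 0       # keep feature i if relevance outweighs redundancy
            return state.astype(bool)

        # Usage (toy data): mask = hopfield_select(np.random.rand(100, 20), np.random.rand(100))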

    Multiscale Discriminant Saliency for Visual Attention

    Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between center and surround classes. The discriminant power of features for this classification is measured as the mutual information between the features and the two class distributions. The estimated discrepancy between the two feature classes depends strongly on the considered scale levels; multi-scale structure and discriminant power are therefore integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT). With the wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label structures are constructed and used in the maximum a posteriori (MAP) estimation of hidden class variables at the corresponding dyadic sub-squares. Then, the saliency value for each dyadic square at each scale level is computed from the discriminant power principle and the MAP estimate. Finally, the saliency maps across multiple scales are integrated into the final saliency map by an information maximization rule. Both standard quantitative tools such as NSS, LCC, and AUC and qualitative assessments are used to evaluate the proposed multiscale discriminant saliency method (MDIS) against the well-known information-based saliency method AIM on the Bruce database with eye-tracking data. Simulation results are presented and analyzed to verify the validity of MDIS as well as to point out its disadvantages for further research directions.
    Comment: 16 pages, ICCSA 2013 - BIOCA session
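    The abstract outlines a pipeline of wavelet features, a Hidden Markov Tree, MAP labeling, and cross-scale integration. The NumPy sketch below is a heavily simplified illustration of the underlying discriminant idea only: it scores each location by the mutual information between a feature's value and the center/surround label, estimated from histograms, and combines a few window scales by a simple maximum. The window sizes, histogram estimator, and max-based integration are assumptions, and the wavelet/HMT machinery is omitted entirely.

        # A heavily simplified sketch of discriminant center-surround saliency: each
        # location is scored by the mutual information between a feature (intensity)
        # and the center/surround label, estimated from histograms, and a few window
        # scales are combined by a simple maximum. Window sizes, the histogram
        # estimator, and max integration are assumptions; the paper's wavelet/HMT
        # machinery is omitted.
        import numpy as np

        def mi_center_surround(patch_c, patch_s, bins=16):
            """I(feature; class) for the binary center/surround class, equal priors."""
            lo = min(patch_c.min(), patch_s.min())
            hi = max(patch_c.max(), patch_s.max())
            pc, _ = np.histogram(patch_c, bins=bins, range=(lo, hi))
            ps, _ = np.histogram(patch_s, bins=bins, range=(lo, hi))
            pc, ps = pc / pc.sum(), ps / ps.sum()
            m = 0.5 * (pc + ps)                           # feature marginal over both classes

            def kl(p, q):
                nz = p > 0
                return np.sum(p[nz] * np.log2(p[nz] / q[nz]))

            return 0.5 * kl(pc, m) + 0.5 * kl(ps, m)      # Jensen-Shannon form of the MI

        def discriminant_saliency(img, scales=((4, 12), (8, 24), (16, 48)), step=8):
            """img: 2-D float array. Returns a saliency map sampled every `step` pixels."""
            h, w = img.shape
            sal = np.zeros((h, w))
            for c, s in scales:                           # (center, surround) half-widths per scale
                for y in range(s, h - s, step):
                    for x in range(s, w - s, step):
                        center = img[y - c:y + c, x - c:x + c]
                        surround = img[y - s:y + s, x - s:x + s]   # includes the center, for simplicity
                        sal[y, x] = max(sal[y, x], mi_center_surround(center, surround))
            return sal

        # Usage: sal = discriminant_saliency(np.random.rand(128, 128))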