139 research outputs found

    Wavelet Integrated CNNs for Noise-Robust Image Classification

    Convolutional Neural Networks (CNNs) are generally sensitive to noise, i.e., small image noise can cause drastic changes in the output. To suppress the effect of noise on the final prediction, we enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with the Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets such as Haar, Daubechies, and Cohen, and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into low-frequency and high-frequency components during down-sampling. The low-frequency component stores the main information, including the basic object structures, and is transmitted to the subsequent layers to extract robust high-level features. The high-frequency components, which contain most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions. Comment: CVPR accepted paper
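
    The down-sampling step described in this abstract can be illustrated with a minimal sketch, assuming a one-level orthonormal Haar transform applied to a single 2D feature map stored as nested lists. The names `haar_dwt2` and `wavelet_downsample` are hypothetical and are not the paper's DWT/IDWT layers; the sketch only shows how a map splits into one low-frequency band and three detail bands, and how keeping the low-frequency band halves the resolution.

```python
def haar_dwt2(x):
    """One-level 2D orthonormal Haar DWT of a nested list with even height and width.

    Returns (low, detail_a, detail_b, detail_c), each half the input size.
    'low' is the low-frequency band; the other three are high-frequency bands.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    low, det_a, det_b, det_c = [], [], [], []
    for i in range(0, h, 2):
        row_low, row_a, row_b, row_c = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            row_low.append((a + b + c + d) / 2)  # local average (scaled)
            row_a.append((a - b + c - d) / 2)    # difference across columns
            row_b.append((a + b - c - d) / 2)    # difference across rows
            row_c.append((a - b - c + d) / 2)    # diagonal difference
        low.append(row_low)
        det_a.append(row_a)
        det_b.append(row_b)
        det_c.append(row_c)
    return low, det_a, det_b, det_c


def wavelet_downsample(x):
    """Down-sample by keeping only the low-frequency band (assumed simplification)."""
    return haar_dwt2(x)[0]
```

    For example, `wavelet_downsample([[1, -1], [-1, 1]])` returns `[[0.0]]`: a pure checkerboard pattern lands entirely in the diagonal detail band and is discarded.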

    Statistical Methods for Image Registration and Denoising

    This dissertation describes research into image processing techniques that enhance military operational and support activities. The research extends existing work on image registration by introducing a novel method that exploits local correlations to improve the performance of projection-based image registration algorithms. The dissertation also extends the bounds on image registration performance for both projection-based and full-frame image registration algorithms and extends the Barankin bound from the one-dimensional case to the problem of two-dimensional image registration. It is demonstrated that in some instances, the Cramer-Rao lower bound is an overly optimistic predictor of image registration performance and that under some conditions, the Barankin bound is a better predictor of shift estimator performance. The research also looks at the related problem of single-frame image denoising using block-based methods. The research introduces three algorithms that operate by identifying regions of interest within a noise-corrupted image and then generating noise-free estimates of the regions as averages of similar regions in the image.

    Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM

    We introduce a new rotationally invariant viewing angle classification method for identifying, among a large number of Cryo-EM projection images, similar views without prior knowledge of the molecule. Our rotationally invariant features are based on the bispectrum. Each image is denoised and compressed using steerable principal component analysis (PCA) such that rotating an image is equivalent to phase shifting the expansion coefficients. Thus, we are able to extend the theory of the bispectrum of 1D periodic signals to 2D images. The randomized PCA algorithm is then used to efficiently reduce the dimensionality of the bispectrum coefficients, enabling fast computation of the similarity between any pair of images. The nearest neighbors provide an initial classification of similar viewing angles. In this way, rotational alignment is only performed between images and their nearest neighbors. The initial nearest neighbor classification and alignment are further improved by a new classification method called vector diffusion maps. Our pipeline for viewing angle classification and alignment is experimentally shown to be faster and more accurate than reference-free alignment with rotationally invariant K-means clustering, MSA/MRA 2D classification, and their modern approximations.
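
    The invariance that this abstract builds on can be checked in the 1D periodic case with a minimal sketch. It assumes the standard definition B(k1, k2) = X(k1) X(k2) conj(X(k1 + k2)), where X is the discrete Fourier transform of the signal; the function names are hypothetical, and the sketch does not reproduce the steerable PCA or the 2D extension. A cyclic shift multiplies each X(k) by a phase factor, and those factors cancel in the triple product.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(x):
    """Bispectrum B[k1][k2] = X[k1] * X[k2] * conj(X[(k1 + k2) mod n])."""
    n = len(x)
    X = dft(x)
    return [
        [X[k1] * X[k2] * X[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


def max_abs_difference(b1, b2):
    """Largest entry-wise absolute difference between two bispectra."""
    return max(abs(u - v) for r1, r2 in zip(b1, b2) for u, v in zip(r1, r2))
```

    Comparing `bispectrum(x)` with `bispectrum(x[s:] + x[:s])` through `max_abs_difference` should give a value at the level of floating-point round-off for any integer shift `s`.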

    Systematic approach to nonlinear filtering associated with aggregation operators. Part 2. Frechet MIMO-filters

    Median filtering has been widely used in scalar-valued image processing as an edge-preserving operation. The basic idea is that the pixel value is replaced by the median of the pixels contained in a window around it. In this work, this idea is extended to vector-valued images. The extension is based on the fact that the median is also the value that minimizes the sum of distances to all grey-level pixels in the window. The Frechet median of a discrete set of vector-valued pixels in a metric space is the point minimizing the sum of metric distances to all sample pixels. In this paper, we extend the notion of the Frechet median to the general Frechet median, which minimizes the Frechet cost function (FCF) in the form of an aggregation function of metric distances, instead of the ordinary sum. Moreover, we propose using an aggregation distance instead of the classical metric distance. We use the generalized Frechet median for constructing new nonlinear Frechet MIMO-filters for multispectral image processing. © 2017 The Authors. Published by Elsevier Ltd. This work was supported by RFBR grants No 17-07-00886 and No 17-29-03369 and by Ural State Forest University Engineering's Center of Excellence in "Quantum and Classical Information Technologies for Remote Sensing Systems".
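
    The cost-minimization view of the median described in this abstract can be sketched as follows. This is a simplified illustration, not the paper's filter: it assumes the minimizer is searched only among the pixels in the window (the classical vector-median restriction), uses the Euclidean metric by default, and exposes the aggregation function as a hypothetical `aggregate` parameter so that the ordinary sum can be swapped for another aggregation such as `max`.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length pixel vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def frechet_median(pixels, metric=euclidean, aggregate=sum):
    """Return the window pixel that minimizes the aggregated distance to all pixels.

    Assumption: candidates are restricted to the sample pixels themselves.
    With aggregate=sum this is the classical vector median; other aggregation
    functions give a generalized cost in the spirit of the abstract.
    """
    def cost(candidate):
        return aggregate([metric(candidate, p) for p in pixels])

    return min(pixels, key=cost)


# A window of three-channel pixels with one outlier.
window = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (100, 100, 100)]
median_sum = frechet_median(window)                 # ordinary sum of distances
median_max = frechet_median(window, aggregate=max)  # max as the aggregation
```

    With this window, the sum-based cost selects `(2, 2, 2)`, while the `max` aggregation selects `(3, 3, 3)`, which shows how the choice of aggregation function changes the filter output.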

    Automatic Look-Up Table Based Real-Time Phase Unwrapping for Phase Measuring Profilometry and Optimal Reference Frequency Selection

    For temporal phase unwrapping in phase measuring profilometry, it has recently been reported that two phases with co-prime frequencies can be absolutely unwrapped using a look-up table; however, frequency selection and table construction have been performed manually without optimization. In this paper, a universal phase unwrapping method is proposed to unwrap phase flexibly and automatically by using geometric analysis, so that a one-dimensional or two-dimensional look-up table can be built programmatically for any two co-prime frequencies to correctly unwrap phases in real time. Moreover, a phase error model related to the defocus effect is derived to determine an optimal reference frequency co-prime to the principal frequency. Experimental results verify the correctness and computational efficiency of the proposed method.
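
    The look-up-table idea summarized in this abstract can be sketched for a one-dimensional table under stated assumptions: both wrapped phases lie in [0, 2π), the absolute phases satisfy Φ1 = 2π·f1·x and Φ2 = 2π·f2·x for a normalized position x in [0, 1), and the measurements are noise-free or nearly so. Under these assumptions (f2·φ1 − f1·φ2) / 2π equals the integer f1·k2 − f2·k1, which identifies the fringe orders (k1, k2) uniquely when f1 and f2 are co-prime. The function names are hypothetical, and this is not the paper's geometric construction or its error model.

```python
import math

TWO_PI = 2 * math.pi


def build_lut(f1, f2):
    """Map the integer key f1*k2 - f2*k1 to the fringe-order pair (k1, k2)."""
    if math.gcd(f1, f2) != 1:
        raise ValueError("frequencies must be co-prime")
    lut = {}
    n_cells = f1 * f2
    # Fringe orders only change at multiples of 1/(f1*f2), so sampling the
    # midpoint of every such cell visits each (k1, k2) pair that can occur.
    for n in range(n_cells):
        k1 = (2 * n + 1) * f1 // (2 * n_cells)  # floor(f1 * x) at the midpoint
        k2 = (2 * n + 1) * f2 // (2 * n_cells)  # floor(f2 * x) at the midpoint
        lut[f1 * k2 - f2 * k1] = (k1, k2)
    return lut


def unwrap(phi1, phi2, f1, f2, lut):
    """Return the absolute phases (Phi1, Phi2) for wrapped inputs in [0, 2*pi)."""
    key = round((f2 * phi1 - f1 * phi2) / TWO_PI)  # rounding absorbs small noise
    k1, k2 = lut[key]
    return phi1 + TWO_PI * k1, phi2 + TWO_PI * k2
```

    The table holds f1 + f2 − 1 entries, and unwrapping a pixel costs one multiply-subtract, one rounding, and one table access, which is what makes this kind of scheme attractive for real-time use.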