10,018 research outputs found

    DPW-SDNet: Dual Pixel-Wavelet Domain Deep CNNs for Soft Decoding of JPEG-Compressed Images

    JPEG is one of the most widely used lossy compression methods. JPEG-compressed images usually suffer from compression artifacts, including blocking and blurring, especially at low bit-rates. Soft decoding is an effective way to improve the quality of compressed images without changing the codec or introducing extra coding bits. Inspired by the excellent performance of deep convolutional neural networks (CNNs) on both low-level and high-level computer vision problems, we develop a dual pixel-wavelet domain deep CNN-based soft decoding network for JPEG-compressed images, namely DPW-SDNet. The pixel-domain deep network stacks the four downsampled versions of the compressed image into a 4-channel input and outputs a pixel-domain prediction, while the wavelet-domain deep network stacks the 1-level discrete wavelet transform (DWT) coefficients into a 4-channel input to produce a DWT-domain prediction. The pixel-domain and wavelet-domain estimates are combined to generate the final soft-decoded result. Experimental results demonstrate the superiority of the proposed DPW-SDNet over several state-of-the-art compression artifact reduction algorithms. Comment: CVPRW 201
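
    The two branch inputs described above are easy to reproduce. Below is a minimal sketch of how they might be formed, assuming NumPy and PyWavelets (pywt); the deep network bodies themselves are not reproduced, and the Haar wavelet choice here is illustrative.

```python
import numpy as np
import pywt

def pixel_domain_input(img):
    """Stack the four polyphase (downsampled) versions of a decoded JPEG
    image into a 4-channel input, as described for the pixel branch."""
    # img: 2-D array with even height and width
    return np.stack([img[0::2, 0::2], img[0::2, 1::2],
                     img[1::2, 0::2], img[1::2, 1::2]], axis=0)

def wavelet_domain_input(img, wavelet="haar"):
    """Stack the 1-level DWT subbands (LL, LH, HL, HH) into a 4-channel input."""
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(np.float64), wavelet)
    return np.stack([cA, cH, cV, cD], axis=0)

img = np.random.rand(64, 64)  # stand-in for a decoded JPEG luminance plane
print(pixel_domain_input(img).shape, wavelet_domain_input(img).shape)
# (4, 32, 32) (4, 32, 32)
```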

    A Review of Feature and Data Fusion with Medical Images

    Techniques that combine multiple feature sets into new features, which are often more robust and carry information useful for later processing, are referred to as feature fusion. The term data fusion is applied to the class of techniques used for combining decisions obtained from multiple feature sets to form global decisions. Feature and data fusion, terms that are sometimes used interchangeably, represent two important classes of techniques that have proved to be of practical importance in a wide range of medical imaging problems. Comment: Multisensor Data Fusion: From Algorithm and Architecture Design to Applications, CRC Press, 2015. arXiv admin note: substantial text overlap with arXiv:1401.016
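
    To make the two definitions concrete, here is a toy sketch contrasting feature fusion (concatenating feature vectors from two hypothetical modalities) with data fusion at the decision level (majority vote over per-feature-set decisions); the names f_mri and f_ct are illustrative, not taken from the review.

```python
import numpy as np

def feature_fusion(feature_sets):
    """Feature fusion: concatenate feature vectors from multiple
    sources into one new, richer feature vector."""
    return np.concatenate(feature_sets)

def decision_fusion(decisions):
    """Data (decision-level) fusion: combine per-feature-set class
    decisions into one global decision by majority vote."""
    values, counts = np.unique(decisions, return_counts=True)
    return values[np.argmax(counts)]

f_mri, f_ct = np.random.rand(16), np.random.rand(8)   # illustrative feature sets
print(feature_fusion([f_mri, f_ct]).shape)            # (24,)
print(decision_fusion(np.array([1, 1, 0])))           # 1
```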

    Microarrays denoising via smoothing of coefficients in wavelet domain

    We describe a novel method for removing noise of unknown variance from microarrays in the wavelet domain. The method is based on smoothing the coefficients of the highest subbands. Specifically, we decompose the noisy microarray into wavelet subbands, apply smoothing within each highest subband, and reconstruct the microarray from the modified wavelet coefficients. This process is applied a single time and only to the first level of decomposition; i.e., in most cases a full multiresolution analysis is not necessary. Denoising results compare favorably to most methods currently in use. Comment: 8 pages, 4 figures, 2 tables. arXiv admin note: text overlap with arXiv:1611.02302, arXiv:1607.03105; text overlap with arXiv:1212.0291 by other authors
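
    The pipeline described above is short enough to sketch directly. The following is a minimal illustration assuming NumPy, PyWavelets, and SciPy; the paper specifies only "smoothing" of the highest subbands, so the Gaussian kernel used here is an assumption.

```python
import numpy as np
import pywt
from scipy.ndimage import gaussian_filter

def denoise_one_level(img, wavelet="haar", sigma=1.0):
    """Single-pass denoising: 1-level DWT, smooth the highest (detail)
    subbands, then reconstruct. Gaussian smoothing is an illustrative
    choice; the paper only says the coefficients are smoothed."""
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(np.float64), wavelet)
    cH, cV, cD = (gaussian_filter(c, sigma) for c in (cH, cV, cD))
    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)

noisy = np.random.rand(128, 128)       # stand-in for a noisy microarray image
print(denoise_one_level(noisy).shape)  # (128, 128)
```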

    Hierarchical method for cataract grading based on retinal images using improved Haar wavelet

    Cataracts, which are lenticular opacities that may occur at different lens locations, are the leading cause of visual impairment worldwide. Accurate and timely diagnosis can improve the quality of life of cataract patients. In this paper, a feature extraction-based method for grading cataract severity using retinal images is proposed. To obtain features better suited to automatic grading, the Haar wavelet is improved according to the characteristics of retinal images. Retinal images of non-cataract, as well as mild, moderate, and severe cataracts, are automatically recognized using the improved Haar wavelet. A hierarchical strategy is used to transform the four-class classification problem into three adjacent two-class classification problems. Three sets of two-class classifiers based on a neural network are trained individually and integrated to establish a complete classification system. The accuracies of the two-class classification (cataract versus non-cataract) and the four-class classification are 94.83% and 85.98%, respectively. The performance analysis demonstrates that the improved Haar wavelet feature achieves higher accuracy than the original Haar wavelet feature, and that the fusion of three sets of two-class classifiers is superior to a single four-class classifier. The discussion indicates that the retinal image-based method offers significant potential for cataract detection. Comment: Under review by Information Fusion (Elsevier)
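
    A minimal sketch of the hierarchical strategy follows, assuming scikit-learn. The exact pairing of the three adjacent two-class problems and the use of MLPClassifier as the neural network are plausible readings of the abstract, not details taken from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Grades: 0 = non-cataract, 1 = mild, 2 = moderate, 3 = severe.
# Three adjacent binary splits (one plausible reading of the hierarchy):
# stage 0: {0} vs {1,2,3}; stage 1: {1} vs {2,3}; stage 2: {2} vs {3}.
def train_hierarchy(X, y):
    stages = []
    for split in (0, 1, 2):
        mask = y >= split  # only samples at or above this split
        stages.append(MLPClassifier(max_iter=500).fit(X[mask], y[mask] > split))
    return stages

def predict_grade(stages, x):
    x = x.reshape(1, -1)
    for split, clf in enumerate(stages):
        if not clf.predict(x)[0]:  # stop at the first "not more severe"
            return split
    return 3

X, y = np.random.rand(200, 32), np.random.randint(0, 4, 200)  # toy features
stages = train_hierarchy(X, y)
print(predict_grade(stages, X[0]))
```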

    Using Wavelets to Analyze Similarities in Image-Classification Datasets

    Deep learning image classifiers usually rely on huge training sets, and their training process can be described as learning the similarities and differences among training images. However, images in large training sets are rarely studied from this perspective, and fine-level similarities and differences among images are usually overlooked. This is due to the lack of fast and efficient computational methods to analyze the contents of these datasets. Some studies aim to identify influential and redundant training images, but such methods require a model that is already trained on the entire training set. Here, using image processing and numerical analysis tools, we develop a practical and fast method to analyze the similarities in image classification datasets. We show that such analysis can provide valuable insights about the datasets and the classification task at hand, prior to training a model. Our method uses wavelet decomposition of images and other numerical analysis tools, with no need for a pre-trained model. Interestingly, our results corroborate previous results in the literature that analyzed similarities using pre-trained CNNs. We show that similar images in standard datasets (such as CIFAR) can be identified in a few seconds, a significant speed-up compared to alternative methods in the literature. By removing the computational speed obstacle, it becomes practical to gain new insights about the contents of datasets and the models trained on them. We show that similarities between training and testing images may provide insights about the generalization of models. Finally, we investigate the similarities between images in relation to the decision boundaries of a trained model.
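
    As an illustration of the general idea, the sketch below builds compact signatures from coarse wavelet approximation coefficients and ranks image pairs by signature distance, assuming NumPy and PyWavelets; the paper's actual features and similarity measure may differ.

```python
import numpy as np
import pywt

def wavelet_signature(img, wavelet="haar", level=3):
    """Compact signature: the low-frequency approximation coefficients
    of a multi-level DWT, flattened to a vector."""
    coeffs = pywt.wavedec2(img.astype(np.float64), wavelet, level=level)
    return coeffs[0].ravel()

def most_similar_pairs(images, k=5):
    """Rank image pairs by Euclidean distance between their signatures."""
    sigs = np.stack([wavelet_signature(im) for im in images])
    d = np.linalg.norm(sigs[:, None] - sigs[None, :], axis=-1)
    i, j = np.triu_indices(len(images), k=1)
    order = np.argsort(d[i, j])[:k]
    return list(zip(i[order], j[order], d[i, j][order]))

imgs = np.random.rand(50, 32, 32)  # stand-in for a small dataset
print(most_similar_pairs(list(imgs))[:2])
```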

    An ANN-based Method for Detecting Vocal Fold Pathology

    There are different algorithms for vocal fold pathology diagnosis. These algorithms usually have three stages: feature extraction, feature reduction, and classification. While the third stage admits a variety of machine learning methods, the first and second stages play a critical role in the performance and accuracy of the classification system. In this paper we present an initial study of feature extraction and feature reduction for the task of vocal fold pathology diagnosis. A new type of feature vector, based on wavelet packet decomposition and Mel-Frequency Cepstral Coefficients (MFCCs), is proposed. Principal Component Analysis (PCA) is also used for feature reduction. An Artificial Neural Network is used as a classifier for evaluating the performance of the proposed method. Comment: 4 pages, 3 figures, published in the International Journal of Computer Applications (IJCA)
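
    A minimal sketch of such a feature vector follows, assuming PyWavelets, librosa, and scikit-learn; the wavelet, decomposition level, and MFCC settings are illustrative, not the paper's.

```python
import numpy as np
import pywt
import librosa
from sklearn.decomposition import PCA

def wp_mfcc_features(signal, sr, wp_level=3, n_mfcc=13):
    """Feature vector of the kind described: wavelet-packet subband
    energies concatenated with mean MFCCs (parameters illustrative)."""
    wp = pywt.WaveletPacket(signal, "db4", maxlevel=wp_level)
    energies = [np.sum(node.data ** 2) for node in wp.get_level(wp_level)]
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    return np.concatenate([energies, mfcc])

# Toy corpus: 40 one-second recordings at 16 kHz.
sr = 16000
X = np.stack([wp_mfcc_features(np.random.randn(sr), sr) for _ in range(40)])
X_reduced = PCA(n_components=8).fit_transform(X)  # feature reduction step
print(X.shape, X_reduced.shape)  # (40, 21) (40, 8)
```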

    dipIQ: Blind Image Quality Assessment by Learning-to-Rank Discriminable Image Pairs

    Objective assessment of image quality is fundamentally important in many image processing tasks. In this work, we focus on learning blind image quality assessment (BIQA) models, which predict the quality of a digital image with no access to its original pristine-quality counterpart as reference. One of the biggest challenges in learning BIQA models is the conflict between the gigantic image space (whose dimension equals the number of image pixels) and the extremely limited reliable ground-truth data for training. Such data are typically collected via subjective testing, which is cumbersome, slow, and expensive. Here we first show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost by exploiting large-scale databases with diverse image content. We then learn an opinion-unaware BIQA (OU-BIQA, meaning that no subjective opinions are used for training) model using RankNet, a pairwise learning-to-rank (L2R) algorithm, from millions of DIPs, each associated with a perceptual uncertainty level, leading to a DIP-inferred quality (dipIQ) index. Extensive experiments on four benchmark IQA databases demonstrate that dipIQ outperforms state-of-the-art OU-BIQA models. The robustness of dipIQ is also significantly improved, as confirmed by the group MAximum Differentiation (gMAD) competition methodology. Furthermore, we extend the proposed framework by learning models with ListNet (a listwise L2R algorithm) on quality-discriminable image lists (DILs). The resulting DIL-inferred quality (dilIQ) index achieves an additional performance gain.
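
    At the heart of this setup is the RankNet pairwise loss. A minimal sketch follows; treating the perceptual uncertainty level as a soft target probability p_target is an assumption about how it could enter the loss, not a detail confirmed by the abstract.

```python
import numpy as np

def ranknet_loss(s_i, s_j, p_target=1.0):
    """Pairwise RankNet loss for a DIP where image i is judged better
    than image j. p_target < 1 lets a pair's perceptual uncertainty
    soften the label; using it this way is an assumption."""
    p = 1.0 / (1.0 + np.exp(-(s_i - s_j)))  # P(i better than j)
    return -(p_target * np.log(p) + (1 - p_target) * np.log(1 - p))

# Model scores for the two images of one quality-discriminable pair:
print(ranknet_loss(2.3, 1.1, p_target=0.9))
```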

    Image Quality Assessment: Unifying Structure and Texture Similarity

    Objective measures of image quality generally operate by comparing pixels of a "degraded" image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing one patch of grass with another). Here, we develop the first full-reference image quality model with explicit tolerance to texture resampling. Using a convolutional neural network, we construct an injective and differentiable function that transforms images to multi-scale overcomplete representations. We demonstrate empirically that the spatial averages of the feature maps in this representation capture texture appearance, in that they provide a set of statistical constraints sufficient to synthesize a wide variety of texture patterns. We then describe an image quality method that combines correlations of these spatial averages ("texture similarity") with correlations of the feature maps ("structure similarity"). The parameters of the proposed measure are jointly optimized to match human ratings of image quality while minimizing the reported distances between subimages cropped from the same texture images. Experiments show that the optimized method explains human perceptual scores on conventional image quality databases as well as on texture databases. The measure also offers competitive performance on related tasks such as texture classification and retrieval. Finally, we show that our method is relatively insensitive to geometric transformations (e.g., translation and dilation) without the use of any specialized training or data augmentation. Code is available at https://github.com/dingkeyan93/DISTS
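
    The two ingredients can be illustrated on a single pair of feature maps. The sketch below computes SSIM-style terms from spatial means (texture) and cross-correlations (structure); the released DISTS code combines such terms across many feature maps of a VGG-based network with learned weights, which is not reproduced here.

```python
import numpy as np

def dists_like_terms(fx, fy, c1=1e-6, c2=1e-6):
    """SSIM-style texture and structure terms for one pair of feature
    maps (fx from the reference, fy from the distorted image). A
    sketch of the ingredients only, not the full measure."""
    mx, my = fx.mean(), fy.mean()
    vx, vy = fx.var(), fy.var()
    cov = ((fx - mx) * (fy - my)).mean()
    texture = (2 * mx * my + c1) / (mx**2 + my**2 + c1)  # spatial means
    structure = (2 * cov + c2) / (vx + vy + c2)          # correlations
    return texture, structure

fx, fy = np.random.rand(32, 32), np.random.rand(32, 32)  # stand-in feature maps
print(dists_like_terms(fx, fy))
```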

    Adaptive Transform Domain Image Super-resolution Via Orthogonally Regularized Deep Networks

    Deep learning methods, in particular trained Convolutional Neural Networks (CNNs), have recently been shown to produce compelling results for single-image Super-Resolution (SR). Invariably, a CNN is learned to map the Low Resolution (LR) image to its corresponding High Resolution (HR) version in the spatial domain. We propose a novel network structure for learning the SR mapping function in an image transform domain, specifically the Discrete Cosine Transform (DCT). As the first contribution, we show that the DCT can be integrated into the network structure as a Convolutional DCT (CDCT) layer. With the CDCT layer, we construct the DCT Deep SR (DCT-DSR) network. We further extend DCT-DSR to make the CDCT layer trainable (i.e., optimizable). Because this layer represents an image transform, we enforce pairwise orthogonality constraints and newly formulated complexity-order constraints on the individual basis functions/filters. This Orthogonally Regularized Deep SR network (ORDSR) simplifies the SR task by taking advantage of the image transform domain while adapting the transform basis to the training image set. Experimental results show that ORDSR achieves state-of-the-art SR image quality with fewer parameters than most deep CNN methods. A particular success of ORDSR is in overcoming the artifacts introduced by bicubic interpolation. A key burden of deep SR is the requirement of abundant LR and HR training image pairs; ORDSR degrades much more gracefully as the training set shrinks, with significant benefits in the regime of limited training data. Analysis of memory and computation requirements confirms that ORDSR allows for a more efficient network with faster inference.
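
    The sketch below constructs the orthonormal 2-D DCT basis that could initialize a CDCT layer, together with a pairwise-orthogonality penalty of the kind described, using only NumPy; the complexity-order constraint and the full network are omitted.

```python
import numpy as np

def dct_filters(n=8):
    """n*n orthonormal 2-D DCT-II basis filters, each n x n: a natural
    initialization for a trainable convolutional DCT layer."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2.0)  # DC row normalization
    return np.stack([np.outer(C[i], C[j]) for i in range(n) for j in range(n)])

def orthogonality_penalty(filters):
    """Pairwise-orthogonality regularizer: penalize deviation of the
    filter Gram matrix from identity (zero for the exact DCT basis)."""
    F = filters.reshape(len(filters), -1)
    G = F @ F.T
    return np.sum((G - np.eye(len(F))) ** 2)

B = dct_filters(8)
print(B.shape, orthogonality_penalty(B))  # (64, 8, 8), ~0.0
```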

    TRLF: An Effective Semi-fragile Watermarking Method for Tamper Detection and Recovery based on LWT and FNN

    This paper proposes a novel method for tamper detection and recovery using semi-fragile data hiding, based on the Lifting Wavelet Transform (LWT) and a Feed-Forward Neural Network (FNN). In TRLF, the host image is first decomposed to one level using the LWT, and the Discrete Cosine Transform (DCT) is applied to each 2×2 block of the diagonal details. Next, a random binary sequence is embedded in each block as the watermark by correlating the DCT coefficients. In the authentication stage, the watermarked image geometry is first reconstructed using the Speeded Up Robust Features (SURF) algorithm, and the watermark bits are extracted using the FNN. Afterward, a logical exclusive-or between the original and extracted watermarks detects the tampered regions. Eventually, in the recovery stage, tampered regions are recovered from an image digest generated by an inverse halftoning technique. The performance and efficiency of TRLF, and its robustness against various geometric, non-geometric, and hybrid attacks, are reported. The experimental results show that, compared to state-of-the-art fragile and semi-fragile watermarking methods, TRLF is superior in the robustness and quality of the digest and watermarked image, respectively. In addition, imperceptibility has been improved by using different correlation steps as the gain factor for flat (smooth) and textured (rough) blocks.
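
    The embedding path can be sketched as follows, assuming NumPy, PyWavelets, and SciPy. pywt's standard DWT stands in for the lifting wavelet transform, and the rule for pushing a bit into a DCT coefficient is illustrative rather than the paper's correlation procedure.

```python
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed_watermark(img, bits, gain=2.0, wavelet="haar"):
    """Embedding path sketch: 1-level decomposition (DWT standing in
    for the LWT), DCT on each 2x2 block of the diagonal detail, and
    one watermark bit pushed into a coefficient per block."""
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(np.float64), wavelet)
    h, w = cD.shape
    idx = 0
    for r in range(0, h - 1, 2):
        for c in range(0, w - 1, 2):
            if idx >= len(bits):
                break
            block = dctn(cD[r:r+2, c:c+2], norm="ortho")
            block[1, 1] += gain if bits[idx] else -gain  # illustrative rule
            cD[r:r+2, c:c+2] = idctn(block, norm="ortho")
            idx += 1
    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)

host = np.random.rand(64, 64) * 255
marked = embed_watermark(host, np.random.randint(0, 2, 256))
print(np.abs(marked - host).max())  # small embedding distortion
```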