33 research outputs found

    Adversarial Image Generation by Spatial Transformation in Perceptual Colorspaces

    Deep neural networks are known to be vulnerable to adversarial perturbations. These perturbations are generally quantified using $L_p$ metrics such as $L_0$, $L_2$ and $L_\infty$. However, even when the measured perturbations are small, they tend to be noticeable to human observers, since $L_p$ distance metrics are not representative of human perception. On the other hand, humans are less sensitive to changes in chrominance, and pixel shifts within a constrained neighborhood are hard to notice. Motivated by these observations, we propose a method that creates adversarial examples by applying spatial transformations to the chrominance channels of perceptual colorspaces such as $YC_bC_r$ and CIELAB, changing pixel locations independently in each channel instead of adding a perturbation or manipulating pixel values directly. In a targeted white-box attack setting, the proposed method obtains competitive fooling rates with very high confidence. The experimental evaluations show that the proposed method achieves favorable results in terms of approximate perceptual distance between benign and adversarially generated images. The source code is publicly available at https://github.com/ayberkydn/stadv-torc
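
    The linked repository contains the authors' implementation; the following PyTorch sketch is only a hypothetical illustration of the core idea described above: warping the chrominance channels of a YCbCr image with a learnable flow field while leaving luminance untouched. The attack loss, flow smoothness regularization and optimization loop are omitted, and names such as `flow` and `rgb` are placeholders.

```python
import torch
import torch.nn.functional as F

def rgb_to_ycbcr(rgb):
    """Convert an (N, 3, H, W) RGB tensor in [0, 1] to YCbCr (BT.601)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.stack([y, cb, cr], dim=1)

def warp_chrominance(ycbcr, flow):
    """Warp only the Cb/Cr channels with a per-pixel flow field.

    flow: (N, H, W, 2) offsets in normalized [-1, 1] grid coordinates,
    typically optimized by the attack to cause misclassification.
    """
    n, _, h, w = ycbcr.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    warped_cbcr = F.grid_sample(ycbcr[:, 1:], grid + flow,
                                align_corners=True, padding_mode="border")
    # Luminance (Y) is kept intact; only chrominance pixels are displaced.
    return torch.cat([ycbcr[:, :1], warped_cbcr], dim=1)

rgb = torch.rand(1, 3, 224, 224)                         # stand-in benign image
flow = torch.zeros(1, 224, 224, 2, requires_grad=True)   # attack variable
adv_ycbcr = warp_chrominance(rgb_to_ycbcr(rgb), flow)
```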

    Learning efficient image representations: Connections between statistics and neuroscience

    This thesis summarizes different works developed in the framework of analyzing the relations between image processing, statistics and neuroscience. These relations are analyzed from the point of view of the efficient coding hypothesis (H. Barlow [1961] and Attneave [1954]), which suggests that the human visual system has adapted over time to process visual information efficiently, i.e. by exploiting the statistical regularities of the visual world. Under this classical idea, works in several directions are developed. One direction analyzes the statistical properties of a revisited, extended and fitted classical model of the human visual system; no statistical information is used in the model. Results show that this model obtains a representation with good statistical properties, which is new evidence in favor of the efficient coding hypothesis. From the statistical point of view, different methods are proposed and optimized using natural images. The models obtained with these statistical methods behave similarly to the human visual system, in both the spatial and color dimensions, which provides further evidence for the efficient coding hypothesis. Applications in image processing are an important part of the thesis: statistics- and neuroscience-based methods are employed to develop a wide set of image processing algorithms, and their results in denoising, classification, synthesis and quality assessment are comparable to some of the most successful current methods.
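
    As a rough, hypothetical illustration of the redundancy-reduction reading of the efficient coding hypothesis discussed above (not code from the thesis), the sketch below whitens natural image patches so that their second-order statistical dependencies are removed; the sample image and patch size are arbitrary choices.

```python
import numpy as np
from skimage import data
from sklearn.feature_extraction.image import extract_patches_2d

image = data.camera().astype(np.float64) / 255.0           # 512x512 grayscale
patches = extract_patches_2d(image, (8, 8), max_patches=5000, random_state=0)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=0)                                         # center the patches

# Eigendecomposition of the patch covariance; ZCA whitening removes the
# second-order (pairwise) dependencies between pixels.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)
whitener = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + 1e-8)) @ eigvecs.T
X_white = X @ whitener

deviation = np.abs(X_white.T @ X_white / len(X_white) - np.eye(X.shape[1])).max()
print("max deviation of whitened covariance from identity:", deviation)
```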

    Learning Probabilistic Graphical Models for Image Segmentation

    Probabilistic graphical models provide a powerful framework for representing image structures, and many inference and learning algorithms have been developed for them. In the general case, however, both inference and learning are NP-hard combinatorial problems. As a consequence, relaxation methods were developed that approximate the original problems while being computationally efficient. In this work we consider the learning problem for binary graphical models and their relaxations. Two novel methods for determining the model parameters of discrete energy functions from training data are proposed; learning the parameters avoids having to set them heuristically. Motivated by common learning methods that minimize the training error measured by a loss function, we develop a new learning method similar in spirit to structured SVMs but computationally more efficient. We term it the linearized approach (LA), as it is restricted to linearly dependent potentials. The linearity of LA is crucial for obtaining a tight convex relaxation, which allows off-the-shelf inference solvers to be used for the subproblems that emerge from solving the overall problem. However, learning methods of this type almost never yield optimal solutions or perfect performance on the training data set. So what happens if the graphical model learned on the training data were to produce exact ground-truth segmentations? Would this give a benefit when predicting? Motivated by the idea of inverse optimization, we take advantage of inverse linear programming to develop a learning approach, referred to as the inverse linear programming approach (invLPA). It further refines the graphical models trained with the previously introduced methods and is able to predict the ground truth on the training data perfectly. The empirical results from implementing invLPA answer the questions posed above. LA is able to learn unary and pairwise potentials jointly, whereas with invLPA this is not possible due to the representation we use. On the other hand, invLPA does not rely on a particular form of the potentials and is thus flexible in the choice of the fitting method. Although the potentials corrected with invLPA always yield the ground-truth segmentation of the training data, invLPA is only able to find corrections on the foreground segments. Due to the relaxed problem formulation, this does not affect the final segmentation result. Moreover, as long as invLPA is initialized with the model parameters of a learning method that performs sufficiently well, this drawback does not significantly affect the final prediction. The performance of the proposed learning methods is evaluated on both synthetic and real-world datasets. We demonstrate that LA is competitive with other parameter learning methods that use loss functions based on the Maximum a Posteriori Marginal (MPM) and Maximum Likelihood Estimation (MLE). Moreover, we illustrate the benefits of learning with inverse linear programming. In a further experiment, we demonstrate the versatility of our learning methods by applying LA to learning motion segmentation in video sequences and comparing it to state-of-the-art segmentation algorithms.
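
    To make the object of learning concrete, the following hypothetical sketch (not the thesis's LA or invLPA code) defines the kind of binary discrete energy with unary and pairwise potentials discussed above and finds its MAP labeling by brute force on a tiny grid; real instances rely on graph cuts or the LP relaxation over the local polytope rather than enumeration, and all parameter values are arbitrary.

```python
import itertools
import numpy as np

H, W = 3, 3
rng = np.random.default_rng(0)
unary = rng.normal(size=(H, W, 2))     # theta_i(0), theta_i(1) for each pixel
pairwise_weight = 0.5                  # Potts penalty for disagreeing neighbours

def energy(labels):
    """Energy of an (H, W) labeling with values in {0, 1}."""
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    e = unary[rows, cols, labels].sum()
    e += pairwise_weight * np.count_nonzero(labels[:, 1:] != labels[:, :-1])
    e += pairwise_weight * np.count_nonzero(labels[1:, :] != labels[:-1, :])
    return e

# Brute-force MAP: feasible only because the grid has 2**9 labelings.
best = min((np.array(x).reshape(H, W)
            for x in itertools.product([0, 1], repeat=H * W)), key=energy)
print("MAP labeling:\n", best, "\nenergy:", energy(best))
```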

    Mathematical and Data-driven Pattern Representation with Applications in Image Processing, Computer Graphics, and Infinite Dimensional Dynamical Data Mining

    Patterns represent the spatial or temporal regularities intrinsic to various phenomena in nature, society, art, and science. From rigid patterns with well-defined generative rules to flexible ones implied by unstructured data, patterns can be placed on a spectrum. At one extreme, patterns are completely described by algebraic systems, where each individual pattern is obtained by repeatedly applying simple operations to primitive elements. At the other extreme, patterns are perceived as visual or frequency regularities without any prior knowledge of the underlying mechanisms. In this thesis, we aim to demonstrate mathematical techniques for representing patterns across this spectrum, which leads to qualitative analysis of the patterns' properties and quantitative prediction of the modeled behaviors from various perspectives. We investigate lattice patterns from materials science, shape patterns from computer graphics, submanifold patterns encountered in point cloud processing, color perception patterns applied in underwater image processing, dynamic patterns from spatio-temporal data, and low-rank patterns exploited in medical image reconstruction. For the different patterns, and depending on whether they arise from structured or unstructured data, we present suitable mathematical representations using techniques ranging from group theory to deep neural networks.
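
    As a small, hypothetical example of the "algebraic" end of this spectrum (not taken from the thesis), the sketch below generates a lattice pattern by repeatedly applying the translation group spanned by two lattice vectors to a primitive motif; the vectors and motif are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

a1 = np.array([1.0, 0.0])                     # lattice (translation) vectors
a2 = np.array([0.5, np.sqrt(3) / 2])
motif = np.array([[0.0, 0.0], [0.25, 0.15]])  # primitive elements in one cell

# Apply every translation m*a1 + n*a2 in an 8x8 block to each motif point.
points = np.array([p + m * a1 + n * a2
                   for m in range(8) for n in range(8) for p in motif])

plt.scatter(points[:, 0], points[:, 1], s=10)
plt.gca().set_aspect("equal")
plt.title("Lattice pattern generated from a primitive motif")
plt.show()
```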

    Segmentation d'images couleurs et multispectrales de la peau

    Accurate border delineation in pigmented skin lesion (PSL) images is a vital first step in the computer-aided diagnosis (CAD) of melanoma. This thesis presents a novel approach to automatic PSL border detection on color and multispectral skin images. We first introduce the concept of energy minimization by graph cuts in terms of maximum a posteriori estimation of a Markov random field (the MAP-MRF framework). After a brief state of the art of interactive graph-cut based segmentation methods, we study the influence of the algorithm's parameters on the segmentation of color images. Within this framework, we propose an energy function based on efficient classifiers (support vector machines and random forests) and a feature vector computed on a local neighborhood. For the segmentation of melanoma, we estimate concentration maps of the skin chromophores, which are discriminating indices of melanoma, from color or multispectral images, and integrate these features into the vector. Finally, we detail a global framework for the automatic segmentation of melanoma comprising two main stages: automatic selection of the "seeds" used by the graph cut and selection of the discriminating features. This tool compares favorably to classic graph-cut based segmentation methods in terms of accuracy and robustness.
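
    The following hypothetical sketch (not the thesis implementation) illustrates one ingredient of the energy function described above: converting per-pixel random forest probabilities, trained from seed pixels, into unary energies for a graph-cut segmentation. The feature extraction, chromophore concentration maps and the pairwise term/max-flow step are omitted, and all data and names are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Placeholder seed data: feature vectors of user-selected pixels,
# with labels 0 = healthy skin, 1 = lesion.
X_seed = rng.normal(size=(200, 6))            # e.g. local colour/chromophore features
y_seed = (X_seed[:, 0] + 0.2 * rng.normal(size=200) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_seed, y_seed)

# One feature vector per image pixel (random stand-ins here).
X_all = rng.normal(size=(64 * 64, 6))
proba = clf.predict_proba(X_all)              # shape (n_pixels, 2)

# Unary energy = negative log-likelihood of each label at each pixel; these
# costs would feed the data term of the graph-cut energy function.
unary = -np.log(np.clip(proba, 1e-6, None)).reshape(64, 64, 2)
print("unary energy volume:", unary.shape)
```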

    Video content analysis for intelligent forensics

    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real-time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis: 1) moving object detection and recognition; 2) correction of colours in video frames and recognition of the colours of moving objects; 3) make and model recognition of vehicles and identification of their type; and 4) detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex backgrounds. The object detection part of the framework relies on a background modelling technique and a novel post-processing step in which the contours of the foreground regions (i.e. moving objects) are refined by classifying edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background; it captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of the true colours of objects in videos is presented, with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects across multiple frames. The proposed framework is specifically designed to perform robustly on videos of poor quality due to surrounding illumination, camera sensor imperfections and artefacts caused by high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As part of this work, a novel feature representation technique for the distinctive representation of vehicle images has emerged. It uses dense feature description and a mid-level feature encoding scheme to capture the texture in the frontal view of vehicles, and is insensitive to minor in-plane rotation and skew within the image. The proposed framework can be extended to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive, up-to-date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image to identify text regions. Apart from detection, the colour information is also used to segment characters within words. The recognition of the identified characters is performed using shape features and supervised learning, and a lexicon-based alignment procedure finalizes the recognition of the strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of the proposed algorithms.
    The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all of the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique in various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild.
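
    As a minimal illustration of the background-modelling stage described above (not the thesis's detector, which additionally refines contours via edge-segment classification and classifies objects into humans and vehicles), the following OpenCV 4.x sketch detects moving objects with a MOG2 background subtractor; "surveillance.mp4" and the thresholds are placeholders.

```python
import cv2

cap = cv2.VideoCapture("surveillance.mp4")    # placeholder video path
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                               # foreground mask
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadow pixels
    mask = cv2.medianBlur(mask, 5)                               # suppress speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    movers = [c for c in contours if cv2.contourArea(c) > 500]   # ignore tiny blobs
    cv2.drawContours(frame, movers, -1, (0, 255, 0), 2)
    cv2.imshow("moving objects", frame)
    if cv2.waitKey(30) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```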

    Understanding perceived quality through visual representations

    The formatting of images can be considered an optimization problem whose cost function is a quality assessment algorithm; there is a trade-off between bit budget per pixel and quality. To maximize quality while minimizing the bit budget, we need to measure perceived quality. In this thesis, we focus on understanding perceived quality through visual representations that are based on visual system characteristics and color perception mechanisms. Specifically, we use the contrast sensitivity mechanisms of retinal ganglion cells and the suppression mechanisms of cortical neurons. We utilize color difference equations and color name distances to mimic pixel-wise color perception, and a bio-inspired model to formulate center-surround effects. Based on these formulations, we introduce two novel image quality estimators, PerSIM and CSV, and a new image quality-assistance method, BLeSS. We combine our findings from the visual system and color perception with data-driven methods to generate visual representations and measure their quality. The majority of existing data-driven methods require subjective scores or degraded images; in contrast, we follow an unsupervised approach that only utilizes generic images. We introduce a novel unsupervised image quality estimator, UNIQUE, and extend it with multiple models and layers to obtain MS-UNIQUE and DMS-UNIQUE. In addition to introducing quality estimators, we analyze the role of spatial pooling and boosting in image quality assessment.
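
    As a hypothetical illustration of the color-difference ingredient mentioned above (not PerSIM, CSV or UNIQUE from the thesis), the sketch below computes a pixel-wise CIEDE2000 difference map between a reference and a distorted image and pools it into a crude full-reference quality score; the file names are placeholders.

```python
import numpy as np
from skimage import io, color

# Placeholder file names for a pristine image and its compressed/distorted version.
reference = io.imread("reference.png")[..., :3] / 255.0
distorted = io.imread("distorted.png")[..., :3] / 255.0

# Work in CIELAB so that numerical differences better track perceived differences.
lab_ref = color.rgb2lab(reference)
lab_dst = color.rgb2lab(distorted)

# Per-pixel CIEDE2000 colour difference; spatially pooling the map (here, the
# mean) yields a crude full-reference quality score (lower = perceptually closer).
delta_e = color.deltaE_ciede2000(lab_ref, lab_dst)
print("mean Delta E_00:", float(delta_e.mean()))
```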

    Variational Tensor-Based Models for Image Diffusion in Non-Linear Domains


    Towards object-based image editing
