3,256 research outputs found
Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural
networks (AT-CNNs) recognize objects. We design systematic approaches to
interpret AT-CNNs in both qualitative and quantitative ways and compare them
with normally trained models. Surprisingly, we find that adversarial training
alleviates the texture bias of standard CNNs when trained on object recognition
tasks, and helps CNNs learn a more shape-biased representation. We validate our
hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and
standard CNNs on clean images and images under different transformations. The
comparison could visually show that the prediction of the two types of CNNs is
sensitive to dramatically different types of features. Second, to achieve
quantitative verification, we construct additional test datasets that destroy
either textures or shapes, such as style-transferred version of clean data,
saturated images and patch-shuffled ones, and then evaluate the classification
accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some
light on why AT-CNNs are more robust than those normally trained ones and
contribute to a better understanding of adversarial training over CNNs from an
interpretation perspective.Comment: To apper in ICML1
Perceptual-based textures for scene labeling: a bottom-up and a top-down approach
Due to the semantic gap, the automatic interpretation of digital images is a very challenging task. Both the segmentation and classification are intricate because of the high variation of the data. Therefore, the application of appropriate features is of utter importance. This paper presents biologically inspired texture features for material classification and interpreting outdoor scenery images. Experiments show that the presented texture features obtain the best classification results for material recognition compared to other well-known texture features, with an average classification rate of 93.0%. For scene analysis, both a bottom-up and top-down strategy are employed to bridge the semantic gap. At first, images are segmented into regions based on the perceptual texture and next, a semantic label is calculated for these regions. Since this emerging interpretation is still error prone, domain knowledge is ingested to achieve a more accurate description of the depicted scene. By applying both strategies, 91.9% of the pixels from outdoor scenery images obtained a correct label
Learning Convolutional Networks for Content-weighted Image Compression
Lossy image compression is generally formulated as a joint rate-distortion
optimization to learn encoder, quantizer, and decoder. However, the quantizer
is non-differentiable, and discrete entropy estimation usually is required for
rate control. These make it very challenging to develop a convolutional network
(CNN)-based image compression system. In this paper, motivated by that the
local information content is spatially variant in an image, we suggest that the
bit rate of the different parts of the image should be adapted to local
content. And the content aware bit rate is allocated under the guidance of a
content-weighted importance map. Thus, the sum of the importance map can serve
as a continuous alternative of discrete entropy estimation to control
compression rate. And binarizer is adopted to quantize the output of encoder
due to the binarization scheme is also directly defined by the importance map.
Furthermore, a proxy function is introduced for binary operation in backward
propagation to make it differentiable. Therefore, the encoder, decoder,
binarizer and importance map can be jointly optimized in an end-to-end manner
by using a subset of the ImageNet database. In low bit rate image compression,
experiments show that our system significantly outperforms JPEG and JPEG 2000
by structural similarity (SSIM) index, and can produce the much better visual
result with sharp edges, rich textures, and fewer artifacts
A Hybrid Deep Learning Approach for Texture Analysis
Texture classification is a problem that has various applications such as
remote sensing and forest species recognition. Solutions tend to be custom fit
to the dataset used but fails to generalize. The Convolutional Neural Network
(CNN) in combination with Support Vector Machine (SVM) form a robust selection
between powerful invariant feature extractor and accurate classifier. The
fusion of experts provides stability in classification rates among different
datasets
A Self-Organizing Neural System for Learning to Recognize Textured Scenes
A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees, and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, by utilizing multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that gradually change across space, as well a.s abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which classify the novel data. ARTEX is compared with psychophysical data, and is benchmarked on classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers.Defense Advanced Research Projects Agency; Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657
Preattentive texture discrimination with early vision mechanisms
We present a model of human preattentive texture perception. This model consists of three stages: (1) convolution of the image with a bank of even-symmetric linear filters followed by half-wave rectification to give a set of responses modeling outputs of V1 simple cells, (2) inhibition, localized in space, within and among the neural-response profiles that results in the suppression of weak responses when there are strong responses at the same or nearby locations, and (3) texture-boundary detection by using wide odd-symmetric mechanisms. Our model can predict the salience of texture boundaries in any arbitrary gray-scale image. A computer implementation of this model has been tested on many of the classic stimuli from psychophysical literature. Quantitative predictions of the degree of discriminability of different texture pairs match well with experimental measurements of discriminability in human observers
- …