Color Representation in Deep Neural Networks
Convolutional neural networks are top performers on image classification tasks. Understanding how they make use of color information in images may be useful for various tasks. In this paper we analyze the representation learned by a popular CNN to detect and characterize color-related features. We confirm the existence of some object- and color-specific units, as well as the effect of layer depth on color sensitivity and class invariance.
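The abstract leaves the unit-detection procedure unspecified; below is a minimal illustrative sketch, assuming a torchvision VGG-16 stands in for the "popular CNN" and using grayscale ablation as a simple, hypothetical color-sensitivity criterion (not the paper's exact method): units whose activations drop most when chroma is removed are flagged as color-related.

```python
import torch
import torchvision.models as models

# Pretrained VGG-16 feature extractor as a stand-in for the analyzed CNN.
cnn = models.vgg16(weights="IMAGENET1K_V1").features.eval()

def layer_activations(x, layer_idx):
    """Run x through the feature stack and return the chosen layer's output."""
    out = x
    for i, module in enumerate(cnn):
        out = module(out)
        if i == layer_idx:
            return out
    raise IndexError(f"layer {layer_idx} not found")

def color_sensitivity(x, layer_idx=10):
    """Per-unit activation drop when chromatic information is removed."""
    gray = x.mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)  # grayscale copy
    with torch.no_grad():
        a_color = layer_activations(x, layer_idx)
        a_gray = layer_activations(gray, layer_idx)
    # Average over batch and spatial dimensions: one score per unit (channel).
    return (a_color - a_gray).mean(dim=(0, 2, 3))

x = torch.rand(8, 3, 224, 224)               # stand-in for a real image batch
print(color_sensitivity(x).topk(5).indices)  # most color-sensitive units
```

Running the probe at several values of `layer_idx` would expose the depth effect the abstract mentions: deeper layers tend to mix color with object identity rather than encode it in isolation.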
Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos
When designing a video affective content analysis algorithm, one of the most important steps is the selection of discriminative features for the effective representation of video segments. The majority of existing affective content analysis methods either use low-level audio-visual features or generate handcrafted higher-level representations based on these low-level features. We propose in this work to use deep learning methods, in particular convolutional neural networks (CNNs), in order to automatically learn and extract mid-level representations from raw data. To this end, we exploit the audio and visual modalities of videos by employing Mel-Frequency Cepstral Coefficients (MFCC) and color values in the HSV color space. We also incorporate dense trajectory based motion features in order to further enhance the performance of the analysis. By means of multi-class support vector machines (SVMs) and fusion mechanisms, music video clips are classified into one of four affective categories representing the four quadrants of the Valence-Arousal (VA) space. Results obtained on a subset of the DEAP dataset show (1) that higher-level representations perform better than low-level features, and (2) that incorporating motion information leads to a notable performance gain, independently of the chosen representation.
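To make the classification stage concrete, here is a minimal score-level fusion sketch with scikit-learn: one SVM per modality, with per-class probabilities averaged across modalities. The feature matrices and their dimensions are random placeholders, not the paper's actual MFCC, CNN, or dense-trajectory features, and the fusion rule shown is a generic choice rather than the paper's specific mechanism.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=200)  # four Valence-Arousal quadrants
# Stand-in features per modality: audio, visual, and motion descriptors.
modalities = [rng.normal(size=(200, d)) for d in (39, 128, 256)]

# One SVM per modality; probability=True enables score-level fusion.
clfs = [SVC(kernel="rbf", probability=True).fit(f, y) for f in modalities]

def late_fusion_predict(test_feats):
    """Average per-class probabilities across modality-specific SVMs."""
    probs = np.mean(
        [c.predict_proba(f) for c, f in zip(clfs, test_feats)], axis=0
    )
    return probs.argmax(axis=1)

pred = late_fusion_predict([f[:10] for f in modalities])
```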
Domain-adversarial neural networks to address the appearance variability of histopathology images
Preparing and scanning histopathology slides consists of several steps, each with a multitude of parameters. The parameters can vary between pathology labs and within the same lab over time, resulting in significant variability of the tissue appearance that hampers the generalization of automatic image analysis methods. Typically, this is addressed with ad-hoc approaches such as staining normalization that aim to reduce the appearance variability. In this paper, we propose a systematic solution based on domain-adversarial neural networks. We hypothesize that removing the domain information from the model representation leads to better generalization. We tested our hypothesis for the problem of mitosis detection in breast cancer histopathology images and made a comparative analysis with two other approaches. We show that combining color augmentation with domain-adversarial training is a better alternative than standard approaches to improve the generalization of deep learning methods.
Comment: MICCAI 2017 Workshop on Deep Learning in Medical Image Analysis
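The core mechanism of domain-adversarial training is the gradient reversal layer. Below is a minimal PyTorch sketch of that standard construction (the paper's actual mitosis-detection network and training details are not reproduced here): a domain classifier is trained on features whose gradient is negated on the way back into the feature extractor, so the extractor learns to discard domain cues.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated, scaled gradient on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # no gradient w.r.t. lambd

class DomainHead(nn.Module):
    """Predicts the domain (e.g. the pathology lab) from reversed features."""
    def __init__(self, feat_dim, n_domains, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_domains)
        )

    def forward(self, features):
        # Reversal makes the feature extractor *hurt* domain prediction,
        # pushing it to discard lab-specific appearance cues.
        return self.net(GradReverse.apply(features, self.lambd))
```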
Classification of diffraction patterns in single particle imaging experiments performed at X-ray free-electron lasers using a convolutional neural network
Single particle imaging (SPI) is a promising method for native structure determination which has progressed rapidly with the development of X-ray Free-Electron Lasers. Large amounts of data are collected during SPI experiments, driving the need for automated data analysis. The necessary data analysis pipeline has a number of steps, including binary object classification (single versus multiple hits). Classification and object detection are areas where deep neural networks currently outperform other approaches. In this work, we use the fast object detector networks YOLOv2 and YOLOv3. By exploiting transfer learning, a moderate amount of data is sufficient for training the neural network. We demonstrate here that a convolutional neural network (CNN) can be successfully used to classify data from SPI experiments. We compare the classification results of the two networks, which differ in depth and architecture, by applying them to the same SPI data with different data representations. The best results are obtained with YOLOv2 on color images with a linear intensity scale, which achieves an accuracy of about 97%, with precision and recall of about 52% and 61%, respectively, measured against manual data classification.
Comment: 23 pages, 6 figures, 3 tables
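For readers unfamiliar with the transfer-learning step, here is a minimal sketch of fine-tuning a pretrained CNN for the single-versus-multiple-hit decision. A generic torchvision ResNet-18 is used purely as a placeholder backbone; the paper itself fine-tunes the YOLOv2/YOLOv3 detector networks, whose training pipelines are not reproduced here.

```python
import torch
from torch import nn
import torchvision.models as models

# Pretrained backbone; only the new 2-way head is trained at first.
model = models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False                      # freeze pretrained weights
model.fc = nn.Linear(model.fc.in_features, 2)    # single vs. multiple hit

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on rendered diffraction-pattern images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```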
Interpreting Deep Visual Representations via Network Dissection
The success of recent deep convolutional neural networks (CNNs) depends on learning hidden representations that can summarize the important factors of variation behind the data. However, CNNs are often criticized as being black boxes that lack interpretability, since they have millions of unexplained model parameters. In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations. The proposed method quantifies the interpretability of CNN representations by evaluating the alignment between individual hidden units and a set of visual semantic concepts. By identifying the best alignments, units are given human-interpretable labels across a range of objects, parts, scenes, textures, materials, and colors. The method reveals that deep representations are more transparent and interpretable than expected: we find that representations are significantly more interpretable than they would be under a random equivalently powerful basis. We apply the method to interpret and compare the latent representations of various network architectures trained to solve different supervised and self-supervised training tasks. We then examine factors affecting network interpretability, such as the number of training iterations, regularization, initialization, and network depth and width. Finally, we show that the interpreted units can be used to provide explicit explanations of a prediction given by a CNN for an image. Our results highlight that interpretability is an important property of deep neural networks that provides new insights into their hierarchical structure.
Comment: *B. Zhou and D. Bau contributed equally to this work. 15 pages, 27 figures
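Network Dissection scores a unit against a concept by the intersection-over-union of the unit's thresholded activation map with the concept's segmentation mask. A minimal numpy sketch of that scoring rule follows; the top-quantile threshold approximates the paper's per-unit setup, while the activation maps and masks here are random stand-ins for real dissection data.

```python
import numpy as np

def unit_concept_iou(activation_maps, concept_masks, quantile=0.995):
    """IoU between a unit's top-activation regions and a concept's masks.

    activation_maps: (N, H, W) upsampled activations of one unit.
    concept_masks:   (N, H, W) boolean ground-truth masks for one concept.
    """
    thresh = np.quantile(activation_maps, quantile)   # per-unit threshold
    active = activation_maps > thresh
    inter = np.logical_and(active, concept_masks).sum()
    union = np.logical_or(active, concept_masks).sum()
    return inter / union if union else 0.0

acts = np.random.rand(16, 56, 56)        # one unit's activation maps
masks = np.random.rand(16, 56, 56) > 0.9 # one concept's segmentation masks
print(unit_concept_iou(acts, masks))
```

A unit is then labeled with the concept that maximizes this IoU over the concept dataset, which is what yields the object, part, scene, texture, material, and color labels the abstract describes.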