Color Representation in Deep Neural Networks
Convolutional neural networks are top performers on image classification tasks. Understanding how they make use of color information in images may be useful for various tasks. In this paper we analyze the representation learned by a popular CNN to detect and characterize color-related features. We confirm the existence of some object- and color-specific units, as well as the effect of layer depth on color sensitivity and class invariance.
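The abstract leaves the unit-detection procedure unspecified; below is a minimal illustrative sketch, assuming a torchvision VGG-16 stands in for the "popular CNN" and using grayscale ablation as a simple, hypothetical color-sensitivity criterion (not the paper's exact method): units whose activations drop most when chroma is removed are flagged as color-related.

```python
import torch
import torchvision.models as models

# Pretrained VGG-16 feature extractor as a stand-in for the analyzed CNN.
cnn = models.vgg16(weights="IMAGENET1K_V1").features.eval()

def layer_activations(x, layer_idx):
    """Run x through the feature stack and return the chosen layer's output."""
    out = x
    for i, module in enumerate(cnn):
        out = module(out)
        if i == layer_idx:
            return out
    raise IndexError(f"layer {layer_idx} not found")

def color_sensitivity(x, layer_idx=10):
    """Per-unit activation drop when chromatic information is removed."""
    gray = x.mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)  # grayscale copy
    with torch.no_grad():
        a_color = layer_activations(x, layer_idx)
        a_gray = layer_activations(gray, layer_idx)
    # Average over batch and spatial dimensions: one score per unit (channel).
    return (a_color - a_gray).mean(dim=(0, 2, 3))

x = torch.rand(8, 3, 224, 224)               # stand-in for a real image batch
print(color_sensitivity(x).topk(5).indices)  # most color-sensitive units
```

Running the probe at several values of `layer_idx` would expose the depth effect the abstract mentions: deeper layers tend to mix color with object identity rather than encode it in isolation.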
Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos
When designing a video affective content analysis algorithm, one of the most important steps is the selection of discriminative features for the effective representation of video segments. The majority of existing affective content analysis methods either use low-level audio-visual features or generate handcrafted higher-level representations based on these low-level features. We propose in this work to use deep learning methods, in particular convolutional neural networks (CNNs), in order to automatically learn and extract mid-level representations from raw data. To this end, we exploit the audio and visual modalities of videos by employing Mel-Frequency Cepstral Coefficients (MFCC) and color values in the HSV color space. We also incorporate dense trajectory based motion features in order to further enhance the performance of the analysis. By means of multi-class support vector machines (SVMs) and fusion mechanisms, music video clips are classified into one of four affective categories representing the four quadrants of the Valence-Arousal (VA) space. Results obtained on a subset of the DEAP dataset show (1) that higher-level representations perform better than low-level features, and (2) that incorporating motion information leads to a notable performance gain, independently of the chosen representation.
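To make the classification stage concrete, here is a minimal score-level fusion sketch with scikit-learn: one SVM per modality, with per-class probabilities averaged across modalities. The feature matrices and their dimensions are random placeholders, not the paper's actual MFCC, CNN, or dense-trajectory features, and the fusion rule shown is a generic choice rather than the paper's specific mechanism.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=200)  # four Valence-Arousal quadrants
# Stand-in features per modality: audio, visual, and motion descriptors.
modalities = [rng.normal(size=(200, d)) for d in (39, 128, 256)]

# One SVM per modality; probability=True enables score-level fusion.
clfs = [SVC(kernel="rbf", probability=True).fit(f, y) for f in modalities]

def late_fusion_predict(test_feats):
    """Average per-class probabilities across modality-specific SVMs."""
    probs = np.mean(
        [c.predict_proba(f) for c, f in zip(clfs, test_feats)], axis=0
    )
    return probs.argmax(axis=1)

pred = late_fusion_predict([f[:10] for f in modalities])
```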
Domain-adversarial neural networks to address the appearance variability of histopathology images
Preparing and scanning histopathology slides consists of several steps, each with a multitude of parameters. The parameters can vary between pathology labs and within the same lab over time, resulting in significant variability of the tissue appearance that hampers the generalization of automatic image analysis methods. Typically, this is addressed with ad-hoc approaches such as staining normalization that aim to reduce the appearance variability. In this paper, we propose a systematic solution based on domain-adversarial neural networks. We hypothesize that removing the domain information from the model representation leads to better generalization. We tested our hypothesis for the problem of mitosis detection in breast cancer histopathology images and made a comparative analysis with two other approaches. We show that combining color augmentation with domain-adversarial training is a better alternative than standard approaches to improve the generalization of deep learning methods.
Comment: MICCAI 2017 Workshop on Deep Learning in Medical Image Analysis
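The core mechanism of domain-adversarial training is the gradient reversal layer. Below is a minimal PyTorch sketch of that standard construction (the paper's actual mitosis-detection network and training details are not reproduced here): a domain classifier is trained on features whose gradient is negated on the way back into the feature extractor, so the extractor learns to discard domain cues.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated, scaled gradient on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # no gradient w.r.t. lambd

class DomainHead(nn.Module):
    """Predicts the domain (e.g. the pathology lab) from reversed features."""
    def __init__(self, feat_dim, n_domains, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_domains)
        )

    def forward(self, features):
        # Reversal makes the feature extractor *hurt* domain prediction,
        # pushing it to discard lab-specific appearance cues.
        return self.net(GradReverse.apply(features, self.lambd))
```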
Classification of diffraction patterns in single particle imaging experiments performed at X-ray free-electron lasers using a convolutional neural network
Single particle imaging (SPI) is a promising method for native structure determination which has progressed rapidly with the development of X-ray Free-Electron Lasers. Large amounts of data are collected during SPI experiments, driving the need for automated data analysis. The necessary data analysis pipeline has a number of steps, including binary object classification (single versus multiple hits). Classification and object detection are areas where deep neural networks currently outperform other approaches. In this work, we use the fast object detector networks YOLOv2 and YOLOv3. By exploiting transfer learning, a moderate amount of data is sufficient for training the neural network. We demonstrate here that a convolutional neural network (CNN) can be successfully used to classify data from SPI experiments. We compare the classification results of the two networks, which differ in depth and architecture, by applying them to the same SPI data with different data representations. The best results are obtained with YOLOv2 on color images with a linear intensity scale, which achieves an accuracy of about 97%, with precision and recall of about 52% and 61%, respectively, measured against manual data classification.
Comment: 23 pages, 6 figures, 3 tables
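For readers unfamiliar with the transfer-learning step, here is a minimal sketch of fine-tuning a pretrained CNN for the single-versus-multiple-hit decision. A generic torchvision ResNet-18 is used purely as a placeholder backbone; the paper itself fine-tunes the YOLOv2/YOLOv3 detector networks, whose training pipelines are not reproduced here.

```python
import torch
from torch import nn
import torchvision.models as models

# Pretrained backbone; only the new 2-way head is trained at first.
model = models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False                      # freeze pretrained weights
model.fc = nn.Linear(model.fc.in_features, 2)    # single vs. multiple hit

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on rendered diffraction-pattern images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```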
Interpreting Deep Visual Representations via Network Dissection
The success of recent deep convolutional neural networks (CNNs) depends on learning hidden representations that can summarize the important factors of variation behind the data. However, CNNs are often criticized as being black boxes that lack interpretability, since they have millions of unexplained model parameters. In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations. The proposed method quantifies the interpretability of CNN representations by evaluating the alignment between individual hidden units and a set of visual semantic concepts. By identifying the best alignments, units are given human-interpretable labels across a range of objects, parts, scenes, textures, materials, and colors. The method reveals that deep representations are more transparent and interpretable than expected: we find that representations are significantly more interpretable than they would be under a random equivalently powerful basis. We apply the method to interpret and compare the latent representations of various network architectures trained to solve different supervised and self-supervised training tasks. We then examine factors affecting network interpretability, such as the number of training iterations, regularization, initialization, and network depth and width. Finally, we show that the interpreted units can be used to provide explicit explanations of a prediction given by a CNN for an image. Our results highlight that interpretability is an important property of deep neural networks that provides new insights into their hierarchical structure.
Comment: *B. Zhou and D. Bau contributed equally to this work. 15 pages, 27 figures
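Network Dissection scores a unit against a concept by the intersection-over-union of the unit's thresholded activation map with the concept's segmentation mask. A minimal numpy sketch of that scoring rule follows; the top-quantile threshold approximates the paper's per-unit setup, while the activation maps and masks here are random stand-ins for real dissection data.

```python
import numpy as np

def unit_concept_iou(activation_maps, concept_masks, quantile=0.995):
    """IoU between a unit's top-activation regions and a concept's masks.

    activation_maps: (N, H, W) upsampled activations of one unit.
    concept_masks:   (N, H, W) boolean ground-truth masks for one concept.
    """
    thresh = np.quantile(activation_maps, quantile)   # per-unit threshold
    active = activation_maps > thresh
    inter = np.logical_and(active, concept_masks).sum()
    union = np.logical_or(active, concept_masks).sum()
    return inter / union if union else 0.0

acts = np.random.rand(16, 56, 56)        # one unit's activation maps
masks = np.random.rand(16, 56, 56) > 0.9 # one concept's segmentation masks
print(unit_concept_iou(acts, masks))
```

A unit is then labeled with the concept that maximizes this IoU over the concept dataset, which is what yields the object, part, scene, texture, material, and color labels the abstract describes.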