Joint Visual Denoising and Classification using Deep Learning

Authors: Chen Gang; Li Yawei; Srihari Sargur N.
Publication date: 04/12/2016

Visual restoration and recognition are traditionally addressed in pipeline fashion, i.e. denoising followed by classification. Instead, observing that the two tasks are correlated (for example, a clearer image leads to better categorization and vice versa), we propose a joint framework for visual restoration and recognition of handwritten images, inspired by advances in deep autoencoders and multi-modality learning. Our model is a 3-pathway deep architecture with a hidden-layer representation shared by multiple inputs and outputs, and each branch can be composed of a multi-layer deep model. Thus, visual restoration and classification can be unified using a shared representation via non-linear mapping, and model parameters can be learnt via backpropagation. Using MNIST and USPS data corrupted with structured noise, the proposed framework performs at least 20% better in classification than separate pipelines, and also produces clearer recovered images. The noise model and the reproducible source code are available at https://github.com/ganggit/jointmodel.

Comment: 5 pages, 7 figures, ICIP 201
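The shared-representation idea in this abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation (their code is at the repository linked in the abstract). It is a simplified one-input, two-output variant in which a single hidden layer feeds both a restoration branch and a classification branch. The layer sizes, the sigmoid activations, the loss weighting `alpha`, and all function names (`make_params`, `joint_forward`, `joint_loss`) are assumptions made for illustration.

```python
import math
import random


def dense(W, b, x):
    """Affine map W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def make_params(in_dim, hidden_dim, n_classes, seed=0):
    """Hypothetical helper: small random weights for the three layers."""
    rng = random.Random(seed)

    def layer(out_dim, inp_dim):
        W = [[rng.uniform(-0.1, 0.1) for _ in range(inp_dim)]
             for _ in range(out_dim)]
        return W, [0.0] * out_dim

    return {
        "encoder": layer(hidden_dim, in_dim),      # noisy image -> shared code
        "decoder": layer(in_dim, hidden_dim),      # shared code -> restored image
        "classifier": layer(n_classes, hidden_dim),  # shared code -> class scores
    }


def joint_forward(noisy, params):
    """Both outputs are computed from the same hidden representation."""
    hidden = sigmoid(dense(*params["encoder"], noisy))
    restored = sigmoid(dense(*params["decoder"], hidden))
    probs = softmax(dense(*params["classifier"], hidden))
    return restored, probs


def joint_loss(restored, clean, probs, label, alpha=1.0):
    """Assumed objective: reconstruction MSE plus weighted cross-entropy."""
    mse = sum((r - c) ** 2 for r, c in zip(restored, clean)) / len(clean)
    cross_entropy = -math.log(probs[label])
    return mse + alpha * cross_entropy
```

Because both loss terms depend on the encoder weights, a gradient step on `joint_loss` would update the shared layer using restoration and classification signals together, which is the coupling the abstract contrasts with a denoise-then-classify pipeline.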

arXiv.org e-Print Archive

Crossref

Set Aggregation Network as a Trainable Pooling Layer

Authors: Maziarka Łukasz; Nowak Aleksandra; Spurek Przemysław; Struski Łukasz; Tabor Jacek; Śmieja Marek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Global pooling, such as max- or sum-pooling, is one of the key ingredients in deep neural networks used for processing images, texts, graphs and other types of structured data. Based on the recent DeepSets architecture proposed by Zaheer et al. (NIPS 2017), we introduce a Set Aggregation Network (SAN) as an alternative global pooling layer. In contrast to typical pooling operators, SAN can embed a given set of features into a vector representation of arbitrary size. We show that by adjusting the size of the embedding, SAN is capable of preserving all the information in the input. In experiments, we demonstrate that replacing the global pooling layer with SAN improves classification accuracy. Moreover, it is less prone to overfitting and can be used as a regularizer.

Comment: ICONIP 201
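A DeepSets-style aggregation of the kind this abstract builds on can be sketched as follows. This is a minimal illustration, not the paper's code. It assumes each set element passes through one shared affine map followed by ReLU, and that the results are summed over the set. The output size is chosen freely through the weight shape, which is the "arbitrary size" property mentioned above. The names `init_layer` and `set_aggregate` are hypothetical.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def affine(W, b, x):
    """Affine map W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_layer(in_dim, out_dim, seed=0):
    """Hypothetical helper: random weights; out_dim sets the embedding size."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(out_dim)]
    return W, b


def set_aggregate(features, W, b):
    """Sum of ReLU(W x + b) over all elements x of the input set.

    The result has len(b) entries regardless of how many elements the set
    contains, and summation makes it independent of element order.
    """
    pooled = [0.0] * len(b)
    for x in features:
        activated = relu(affine(W, b, x))
        pooled = [p + a for p, a in zip(pooled, activated)]
    return pooled
```

For example, calling `set_aggregate` on three 3-dimensional feature vectors with weights from `init_layer(3, 5)` returns a 5-dimensional vector, and reordering the three vectors leaves the result unchanged up to floating-point rounding.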

arXiv.org e-Print Archive

Jagiellonian University Repository