121 research outputs found

    Parametrizing filters of a CNN with a GAN

    It is commonly agreed that the use of relevant invariances as a good statistical bias is important in machine learning. However, most approaches that explicitly incorporate invariances into a model architecture only make use of very simple transformations, such as translations and rotations. Hence, there is a need for methods that model and extract richer transformations capturing much higher-level invariances. To that end, we introduce a tool for parametrizing the set of filters of a trained convolutional neural network with the latent space of a generative adversarial network. We then show that the method can capture highly non-linear invariances of the data by visualizing their effect in the data space.
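    A minimal sketch of the idea follows, assuming a simple MLP generator and illustrative sizes (none of these names or dimensions come from the paper): a latent vector is mapped to an entire filter bank, which is then used directly in a convolution.

```python
# Hedged sketch: parametrizing a conv layer's filters with a GAN latent space.
# FilterGenerator and all sizes are illustrative assumptions, not the paper's
# actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterGenerator(nn.Module):
    """Maps a latent vector z to a bank of convolutional filters."""
    def __init__(self, latent_dim=64, out_ch=32, in_ch=3, k=5):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, out_ch * in_ch * k * k),
        )

    def forward(self, z):
        return self.net(z).view(self.shape)

gen = FilterGenerator()
z = torch.randn(64)                     # a point in the latent space
filters = gen(z)                        # -> (32, 3, 5, 5) filter bank
x = torch.randn(1, 3, 32, 32)
y = F.conv2d(x, filters, padding=2)     # convolve with the generated filters
```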

    HyperNets and their application to learning spatial transformations

    In this paper we propose a conceptual framework for higher-order artificial neural networks. The idea of higher-order networks arises naturally when a model is required to learn some group of transformations, every element of which is well approximated by a traditional feedforward network. Thus the group as a whole can be represented as a hyper-network. One typical example of such a group is the group of spatial transformations. We show that the proposed framework, which we call HyperNets, is able to deal with at least two basic spatial transformations of images: rotation and affine transformation. We show that HyperNets are able not only to generalize rotations and affine transformations, but also to compensate for the rotation of images, bringing them into canonical form.
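    As a rough illustration of the hyper-network idea (layer choices and sizes are assumptions, not the paper's architecture), a small net can emit the weights of a target feedforward net, one weight set per group element such as a rotation angle:

```python
# Hedged sketch: a hyper-network that maps a group element (a rotation angle)
# to the parameters of a single-layer target network. Dimensions are assumed.
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    def __init__(self, dim=64, hidden=32):
        super().__init__()
        self.dim = dim
        # Map the group element (angle) to the target net's parameters.
        self.weight_gen = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(),
            nn.Linear(hidden, dim * dim + dim),
        )

    def forward(self, x, angle):
        params = self.weight_gen(angle)                    # weights for this element
        W = params[: self.dim * self.dim].view(self.dim, self.dim)
        b = params[self.dim * self.dim:]
        return x @ W.t() + b                               # the emitted feedforward net

net = HyperNet()
patches = torch.randn(4, 64)                               # batch of flattened 8x8 patches
out = net(patches, torch.tensor([0.5]))                    # apply the net for angle 0.5 rad
```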

    Learning Good Representation via Continuous Attention

    In this paper we present our finding that good representations can be learned via continuous attention during the interaction between Unsupervised Learning (UL) and Reinforcement Learning (RL) modules driven by intrinsic motivation. Specifically, we design intrinsic rewards, generated by the UL module, that drive the RL agent to focus on objects for a period of time and to learn good representations of objects for a later object recognition task. We evaluate the proposed algorithm both with and without extrinsic rewards. Experiments with end-to-end training in simulated environments, with applications to few-shot object recognition, demonstrate the effectiveness of the proposed algorithm.
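    One plausible reading of the reward design is sketched below, under the assumption that the UL module is an autoencoder and the intrinsic reward is its drop in reconstruction error while the agent attends to an object; the paper's exact formulation may differ.

```python
# Hedged sketch: an intrinsic reward derived from an unsupervised module.
# The autoencoder, its sizes, and the reward definition are all assumptions.
import torch
import torch.nn as nn

autoencoder = nn.Sequential(nn.Linear(256, 32), nn.ReLU(), nn.Linear(32, 256))
mse = nn.MSELoss()

def intrinsic_reward(obs_prev, obs_curr):
    """Reward the RL agent when the UL module models the current view better."""
    with torch.no_grad():
        err_prev = mse(autoencoder(obs_prev), obs_prev)
        err_curr = mse(autoencoder(obs_curr), obs_curr)
    return (err_prev - err_curr).item()   # positive if the representation improved

r = intrinsic_reward(torch.randn(256), torch.randn(256))
```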

    Unsupervised Pretraining Encourages Moderate-Sparseness

    It is well known that direct training of deep neural networks will generally lead to poor results. A major advance in recent years is the invention of various pretraining methods for initializing network parameters, and it has been shown that such methods lead to good prediction performance. However, the reason for the success of pretraining has not been fully understood, although it has been argued that regularization and better optimization play certain roles. This paper provides another explanation for the effectiveness of pretraining: we show that pretraining leads to sparseness of the hidden unit activations in the resulting neural networks. The main reason is that the pretraining models can be interpreted as adaptive sparse coding. Our experimental results on MNIST and Birdsong, compared against deep neural networks with sigmoid activations, further support this sparseness observation. Comment: 6 pages, 2 figures, (to appear) ICML Workshop on Unsupervised Learning from Bioacoustic Big Data (uLearnBio) 201
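    A simple way to quantify such sparseness, sketched below with an illustrative threshold and synthetic activations (the paper's exact metric may differ), is the mean fraction of hidden units whose activation is near zero:

```python
# Hedged sketch: measuring activation sparseness of sigmoid hidden units.
# The threshold eps and the synthetic activation data are assumptions.
import numpy as np

def activation_sparseness(h, eps=0.05):
    """h: (n_samples, n_hidden) sigmoid activations in [0, 1]."""
    return float(np.mean(h < eps))        # fraction of (near-)inactive units

rng = np.random.default_rng(0)
h_pretrained = rng.beta(0.3, 2.0, size=(1000, 500))   # skewed toward 0
h_random     = rng.uniform(0, 1, size=(1000, 500))
print(activation_sparseness(h_pretrained), activation_sparseness(h_random))
```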

    Auto-pooling: Learning to Improve Invariance of Image Features from Image Sequences

    Learning invariant representations from images is one of the hardest challenges facing computer vision. Spatial pooling is widely used to create invariance to spatial shifting, but it is restricted to convolutional models. In this paper, we propose a novel pooling method that can learn a soft clustering of features from image sequences. It is trained to improve the temporal coherence of features while keeping the information loss to a minimum. Our method does not use spatial information, so it can also be used with non-convolutional models. Experiments on images extracted from natural videos show that our method can cluster similar features together. When trained on convolutional features, auto-pooling outperformed traditional spatial pooling on an image classification task, even though it does not use the spatial topology of features. Comment: 9 pages, 10 figures. Submission for ICLR 201
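    A hedged sketch of such a learned soft pooling follows; the objective shown (temporal coherence plus a crude information-preservation term) is an assumption reconstructed from the abstract, not the paper's exact loss.

```python
# Hedged sketch: a learnable soft pooling matrix trained for temporal
# coherence on consecutive frames. Sizes and the loss weights are assumed.
import torch
import torch.nn as nn

n_feat, n_pool = 128, 32
P = nn.Parameter(torch.randn(n_pool, n_feat) * 0.01)   # soft cluster assignments
opt = torch.optim.Adam([P], lr=1e-3)

def auto_pool_loss(f_t, f_t1, alpha=0.1):
    A = torch.softmax(P, dim=1)            # each row: soft cluster over features
    p_t, p_t1 = f_t @ A.t(), f_t1 @ A.t()  # pooled codes for consecutive frames
    coherence = ((p_t - p_t1) ** 2).mean() # pooled codes should change slowly
    recon = ((p_t @ A - f_t) ** 2).mean()  # crude information-preservation term
    return coherence + alpha * recon

loss = auto_pool_loss(torch.randn(8, n_feat), torch.randn(8, n_feat))
loss.backward(); opt.step()
```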

    Unsupervised Learning Layers for Video Analysis

    This paper presents two unsupervised learning layers (UL layers) for label-free video analysis: one for fully connected layers and one for convolutional layers. The proposed UL layers can play two roles: they can serve as the cost function layer, providing a global training signal, and they can be added to any regular neural network layer to provide local training signals, which are combined with the training signals backpropagated from upper layers to extract both slowly and quickly changing features at layers of different depths. The UL layers can therefore be used in either purely unsupervised or semi-supervised settings. Both a closed-form solution and an online learning algorithm for the two UL layers are provided. Experiments with unlabeled synthetic and real-world videos demonstrate that neural networks equipped with UL layers and trained with the proposed online learning algorithm can extract shape and motion information from video sequences of moving objects. The experiments demonstrate potential applications of the UL layers and online learning algorithm to head orientation estimation and moving object localization.
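    The sketch below illustrates one form such a local training signal could take, assuming a slowness term plus a decorrelation term to avoid collapse; the paper's closed-form solution and online variant are not reproduced here.

```python
# Hedged sketch: a local, label-free training signal attached to a layer,
# favoring slowly changing, decorrelated features. The losses are assumptions.
import torch

def ul_layer_loss(h_t, h_t1, beta=0.1):
    slow = ((h_t1 - h_t) ** 2).mean()                 # slowness across frames
    hc = h_t - h_t.mean(0, keepdim=True)
    cov = (hc.t() @ hc) / h_t.shape[0]
    off_diag = cov - torch.diag(torch.diagonal(cov))
    return slow + beta * (off_diag ** 2).mean()       # keep features decorrelated

local = ul_layer_loss(torch.randn(16, 64), torch.randn(16, 64))
# total signal at this depth = local UL loss + gradients backpropagated from above
```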

    Analyzing noise in autoencoders and deep networks

    Autoencoders have emerged as a useful framework for unsupervised learning of internal representations, and a wide variety of apparently conceptually disparate regularization techniques have been proposed to generate useful features. Here we extend existing denoising autoencoders by additionally injecting noise before the nonlinearity and at the hidden unit activations. We show that a wide variety of previous methods, including denoising, contractive, and sparse autoencoders, as well as dropout, can be interpreted within this framework. This noise injection framework reaps practical benefits by providing a unified strategy for developing new internal representations by designing the nature of the injected noise. We show that noisy autoencoders outperform denoising autoencoders at the very task of denoising, and are competitive with other single-layer techniques on MNIST and CIFAR-10. We also show that types of noise other than dropout improve performance in a deep network by sparsifying, decorrelating, and spreading information across representations.
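    The three injection sites are easy to make concrete. The following sketch (layer sizes and noise scales are illustrative) adds Gaussian noise at the input, before the nonlinearity, and at the hidden activations:

```python
# Hedged sketch: a noisy autoencoder with the three injection sites the
# abstract describes. Gaussian noise and the specific scales are assumptions.
import torch
import torch.nn as nn

class NoisyAutoencoder(nn.Module):
    def __init__(self, d_in=784, d_h=256):
        super().__init__()
        self.enc = nn.Linear(d_in, d_h)
        self.dec = nn.Linear(d_h, d_in)

    def forward(self, x, s_in=0.3, s_pre=0.1, s_h=0.1):
        x_n = x + s_in * torch.randn_like(x)       # input noise (denoising AE)
        pre = self.enc(x_n)
        pre = pre + s_pre * torch.randn_like(pre)  # noise before the nonlinearity
        h = torch.sigmoid(pre)
        h = h + s_h * torch.randn_like(h)          # noise at hidden activations
        return self.dec(h)

ae = NoisyAutoencoder()
x = torch.rand(32, 784)
loss = ((ae(x) - x) ** 2).mean()                   # reconstruct the clean input
```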

    Application of Deep Learning on Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations

    We explore how Deep Learning (DL) can be utilized to predict the prognosis of acute myeloid leukemia (AML). From the TCGA (The Cancer Genome Atlas) database, 94 AML cases are used in this study. Input data include age, 10 common cytogenetic results, and the 23 most common mutation results; the output is the prognosis (diagnosis to death, DTD). In our DL network, autoencoders are stacked to form a hierarchical DL model from which raw data are compressed and organized and high-level features are extracted. The network is written in R and is designed to predict the prognosis of AML for a given case (DTD of more than or less than 730 days). The DL network achieves an accuracy of 83% in predicting prognosis. As a proof-of-concept study, our preliminary results demonstrate a practical application of DL to future prognostic prediction using next-generation sequencing (NGS) data. Comment: 11 pages, 1 table, 1 figure. arXiv admin note: substantial text overlap with arXiv:1801.0101
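    A hedged sketch of the described pipeline follows, written in PyTorch rather than the paper's R code and with assumed layer sizes: a stacked encoder compresses the 34 inputs (age, 10 cytogenetic results, 23 mutations) and a small head predicts whether DTD exceeds 730 days.

```python
# Hedged sketch of the described pipeline; layer sizes, PyTorch, and the
# placeholder data are assumptions. Layer-wise pretraining is omitted.
import torch
import torch.nn as nn

encoder = nn.Sequential(                 # stacked autoencoder layers
    nn.Linear(34, 16), nn.Sigmoid(),
    nn.Linear(16, 8), nn.Sigmoid(),
)
head = nn.Linear(8, 1)                   # logit for DTD > 730 days

x = torch.rand(94, 34)                   # 94 TCGA AML cases (placeholder data)
p_long_survival = torch.sigmoid(head(encoder(x)))
```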

    Theta-RBM: Unfactored Gated Restricted Boltzmann Machine for Rotation-Invariant Representations

    Learning invariant representations is a critical task in computer vision. In this paper, we propose the Theta-Restricted Boltzmann Machine (θ-RBM for short), which builds upon the original RBM formulation and injects the notion of rotation invariance into the learning procedure. In contrast to previous approaches, we do not transform the training set with all possible rotations. Instead, we rotate the gradient filters when they are computed during the Contrastive Divergence algorithm. We formulate our model as an unfactored gated Boltzmann machine, in which an additional input layer modulates the visible input layer to drive the optimisation procedure. Among our contributions is a mathematical proof that θ-RBM is able to learn rotation-invariant features according to a recently proposed invariance measure. Our method reaches an invariance score of ~90% on the mnist-rot dataset, the highest result among the baseline methods and the current state of the art in transformation-invariant feature learning with RBMs. Using an SVM classifier, we also show that our network learns discriminative features, obtaining a testing error of ~10%. Comment: 9 pages, 2 figures, 3 tables
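    The key step, rotating gradient filters rather than training images, can be sketched as follows; the surrounding Contrastive Divergence machinery is omitted, and the rotation set and interpolation settings are assumptions:

```python
# Hedged sketch: rotating CD gradient filters instead of augmenting the
# training set. Filter sizes, angles, and the averaging are assumptions.
import numpy as np
from scipy.ndimage import rotate

def rotate_filter_gradients(grads, angle_deg):
    """grads: (n_filters, k, k) gradient filters from one CD step."""
    return np.stack([
        rotate(g, angle_deg, reshape=False, order=1, mode="nearest")
        for g in grads
    ])

grads = np.random.randn(16, 7, 7)        # placeholder CD gradients
avg = np.mean([rotate_filter_gradients(grads, a)      # pool gradients over
               for a in range(0, 360, 45)], axis=0)   # a set of rotations
```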

    Deep Convolutional Inverse Graphics Network

    This paper presents the Deep Convolutional Inverse Graphics Network (DC-IGN), a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative results on the model's efficacy at learning a 3D rendering engine. Comment: First two authors contributed equally
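    The training procedure can be sketched as a clamping trick, stated here as an assumption consistent with the abstract: in a mini-batch where only one factor (e.g. pose) varies, all other graphics-code units are clamped to their batch mean so the designated unit absorbs the variation.

```python
# Hedged sketch: clamping the graphics code during training. The batch
# construction and latent size are illustrative assumptions.
import torch

def clamp_graphics_code(z, active_idx):
    """z: (batch, latent_dim) codes for a mini-batch varying in ONE factor."""
    z_mean = z.mean(dim=0, keepdim=True).expand_as(z)
    z_clamped = z_mean.clone()
    z_clamped[:, active_idx] = z[:, active_idx]   # only the active unit varies
    return z_clamped

z = torch.randn(10, 8)          # batch of 10 images differing only in pose
z_pose = clamp_graphics_code(z, active_idx=0)
```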