36 research outputs found

    Learning Generative Models with Visual Attention

    Attention has long been proposed by psychologists as important for effectively dealing with the enormous amount of sensory stimulus available in the neocortex. Inspired by visual attention models in computational neuroscience and by the need for object-centric data in generative modeling, we describe a generative learning framework that uses attentional mechanisms. Attentional mechanisms can propagate signals from a region of interest in a scene to an aligned canonical representation, where generative modeling takes place. By ignoring background clutter, generative models can concentrate their resources on the object of interest. Our model is a proper graphical model in which the 2D similarity transformation is part of the top-down process. A ConvNet provides good initializations for posterior inference, which is based on Hamiltonian Monte Carlo. After learning on images of faces, our model can robustly attend to the face regions of novel test subjects. More importantly, it can learn generative models of new faces from a novel dataset of large images in which the face locations are not known. Comment: In the proceedings of Neural Information Processing Systems, 2014
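
    To make the attentional read-out concrete, below is a minimal sketch of how a 2D similarity transform (translation, rotation, and isotropic scale) can map a region of interest onto an aligned canonical frame. The function name, the nearest-neighbour sampling, and the fixed window size are illustrative assumptions; in the paper the transform parameters are latent variables inferred with Hamiltonian Monte Carlo from a ConvNet initialization.

        import numpy as np

        def similarity_crop(image, center, scale, theta, out_size=24):
            # Hypothetical helper: read an aligned canonical window out of
            # `image` through a 2D similarity transform. Nearest-neighbour
            # sampling keeps the sketch short; a real model would use
            # differentiable bilinear sampling.
            h, w = image.shape
            c, s = np.cos(theta), np.sin(theta)
            A = scale * np.array([[c, -s], [s, c]])  # rotation + isotropic scale
            # Canonical-frame pixel grid, centred at the origin.
            ys, xs = np.mgrid[0:out_size, 0:out_size] - (out_size - 1) / 2.0
            # Map canonical coordinates into image coordinates.
            xy = A @ np.stack([xs.ravel(), ys.ravel()]) + np.asarray(center)[:, None]
            cols = np.clip(np.round(xy[0]).astype(int), 0, w - 1)
            rows = np.clip(np.round(xy[1]).astype(int), 0, h - 1)
            return image[rows, cols].reshape(out_size, out_size)

        # Attend to a hypothesized face location in a larger image.
        img = np.random.rand(100, 100)
        window = similarity_crop(img, center=(55.0, 40.0), scale=1.5, theta=0.1)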

    Robust Visual Recognition Using Multilayer Generative Neural Networks

    Deep generative neural networks such as Deep Belief Networks (DBNs) and Deep Boltzmann Machines have been used successfully to model high-dimensional visual data. However, they are not robust to common variations such as occlusion and random noise. In this thesis, we explore two strategies for improving the robustness of DBNs. First, we show that a DBN with sparse connections in the first layer is more robust to variations that are not in the training set. Second, we develop a probabilistic denoising algorithm that determines a subset of hidden-layer nodes to unclamp; this algorithm can be applied to any feedforward network classifier with localized first-layer connections. By using the already available generative model for denoising prior to recognition, we obtain significantly better performance than standard DBN implementations for various sources of noise on the standard MNIST and Variations MNIST databases.
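
    As a rough illustration of denoising with an already available generative model before recognition, the sketch below runs mean-field up-down passes through a generic RBM so that the top-down weights can fill in corrupted pixels. This is a standard RBM reconstruction, not the thesis's unclamping algorithm, and all names and parameters here are assumptions.

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def rbm_denoise(v, W, b_vis, b_hid, n_steps=5):
            # Mean-field up-down passes: infer hidden activations from the
            # (possibly corrupted) input, then reconstruct the input using
            # the generative, top-down weights.
            for _ in range(n_steps):
                h = sigmoid(v @ W + b_hid)    # bottom-up inference
                v = sigmoid(h @ W.T + b_vis)  # top-down reconstruction
            return v

        # Toy usage with random, untrained parameters.
        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.01, size=(784, 500))
        v_denoised = rbm_denoise(rng.random(784), W, np.zeros(784), np.zeros(500))

    The denoised vector would then be passed to the feedforward classifier in place of the raw input.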

    Challenges in Representation Learning: A report on three machine learning contests

    The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kind of knowledge can be gained from machine learning competitions. Comment: 8 pages, 2 figures

    Learning Generative Models Using Structured Latent Variables

    Recent machine learning advances in computer vision and speech recognition have been largely driven by the application of supervised neural networks to large labeled datasets, leveraging effective regularization techniques and architectural design. With more data and computational resources, performance is likely to continue improving. Despite these strengths, supervised neural networks are sometimes criticized because their internal representations are opaque and lack the kind of interpretability that seems evident in human perception. For example, detecting a dog hidden in the bushes by looking at its exposed tail is a task not yet solved by discriminative neural networks. Another class of challenging tasks is one-shot learning, where only one training example of a new concept is available to the model. It is widely believed that learned prior knowledge must be utilized to tackle this problem. My dissertation addresses some of these concerns by introducing domain-specific knowledge into standard deep learning models. This knowledge is used to specify a meaningful, structured latent representation, which forces the model to generalize better in certain scenarios. For example, a generative model with latent gating variables that "switch off" noisy pixels should perform better when encountering noise at test time, and a generative model with latent variables representing 3D surface normal vectors should do better at modeling illumination variations. These are the kinds of relatively simple domain-specific modifications explored in this thesis. Of course, we should not rely too heavily on manual engineering and should learn as much as possible; we take this principle seriously and strike a balance between laborious engineering on the one hand and learning everything from scratch on the other. Adding structure to deep generative models is not only helpful for computer vision applications but is also very effective for unsupervised density learning tasks. Ph.D. dissertation.
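
    As a toy sketch of the first modification, the code below computes per-pixel posteriors for binary gating variables that decide whether each pixel is explained by the model's reconstruction or by noise; gates near zero effectively switch the pixel off. The mixture densities, prior, and scale are illustrative assumptions rather than the dissertation's exact model.

        import numpy as np

        def gate_posterior(x, recon, clean_scale=0.05, prior=0.9):
            # Toy mixture: a "clean" pixel follows N(recon_i, clean_scale^2);
            # a "noisy" pixel follows Uniform(0, 1), which has density 1.
            lik_clean = np.exp(-0.5 * ((x - recon) / clean_scale) ** 2) \
                        / (clean_scale * np.sqrt(2.0 * np.pi))
            p_on = prior * lik_clean          # gate on: model explains the pixel
            p_off = (1.0 - prior) * 1.0       # gate off: pixel treated as noise
            return p_on / (p_on + p_off)

        # A pixel far from its reconstruction receives a near-zero gate.
        print(gate_posterior(np.array([0.51, 0.95]), np.array([0.50, 0.20])))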