Search CORE

27,103 research outputs found

Deep Elastic Networks with Model Selection for Multi-Task Learning

Author: Ahn Chanho
Kim Eunwoo
Oh Songhwai
Publication venue
Publication date: 01/02/2019
Field of study

In this work, we consider the problem of instance-wise dynamic network model selection for multi-task learning. To this end, we propose an efficient approach to exploit a compact but accurate model in a backbone architecture for each instance of all tasks. The proposed method consists of an estimator and a selector. The estimator is based on a backbone architecture and structured hierarchically. It can produce multiple different network models of different configurations in a hierarchical structure. The selector chooses a model dynamically from a pool of candidate models given an input instance. The selector is a relatively small-size network consisting of a few layers, which estimates a probability distribution over the candidate models when an input instance of a task is given. Both estimator and selector are jointly trained in a unified learning framework in conjunction with a sampling-based learning strategy, without additional computation steps. We demonstrate the proposed approach for several image classification tasks compared to existing approaches performing model selection or learning multiple tasks. Experimental results show that our approach gives not only outstanding performance compared to other competitors but also the versatility to perform instance-wise model selection for multiple tasks.Comment: ICCV 201

arXiv.org e-Print Archive

Crossref

SNU Open Repository and Archive

Deep Self-Taught Learning for Handwritten Character Recognition

Author: Bastien Frédéric
Bengio Yoshua
Bergeron Arnaud
Boulanger-Lewandowski Nicolas
Breuel Thomas
Chherawala Youssouf
Cisse Moustapha
Côté Myriam
Erhan Dumitru
Eustache Jeremy
Glorot Xavier
Lebeuf Sylvain Pannetier
Muller Xavier
Pascanu Razvan
Rifai Salah
Savard Francois
Sicard Guillaume
Publication venue
Publication date: 01/01/2010
Field of study

Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures, i.e., function classes obtained by composing multiple non-linear transformations. Self-taught learning (exploiting unlabeled examples or examples from other distributions) has already been applied to deep learners, but mostly to show the advantage of unlabeled examples. Here we explore the advantage brought by {\em out-of-distribution examples}. For this purpose we developed a powerful generator of stochastic variations and noise processes for character images, including not only affine transformations but also slant, local elastic deformations, changes in thickness, background images, grey level changes, contrast, occlusion, and various types of noise. The out-of-distribution examples are obtained from these highly distorted images or by including examples of object classes different from those in the target test set. We show that {\em deep learners benefit more from out-of-distribution examples than a corresponding shallow learner}, at least in the area of handwritten character recognition. In fact, we show that they beat previously published results and reach human-level performance on both handwritten digit classification and 62-class handwritten character recognition

arXiv.org e-Print Archive

CiteSeerX

A scalable saliency-based Feature selection method with instance level information

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Cancela Brais
Gama João
Publication venue: 'Elsevier BV'
Publication date: 30/04/2019
Field of study

Classic feature selection techniques remove those features that are either irrelevant or redundant, achieving a subset of relevant features that help to provide a better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they are unable to provide the user with successful information when only instance information is needed. In short, given any example, classic feature selection algorithms do not give any information about which the most relevant information is, regarding this sample. This work aims to overcome this handicap by developing a novel feature selection method, called Saliency-based Feature Selection (SFS), based in deep-learning saliency techniques. Our experimental results will prove that this algorithm can be successfully used not only in Neural Networks, but also under any given architecture trained by using Gradient Descent techniques

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña