Deep Elastic Networks with Model Selection for Multi-Task Learning
In this work, we consider the problem of instance-wise dynamic network model
selection for multi-task learning. To this end, we propose an efficient
approach to exploit a compact but accurate model in a backbone architecture for
each instance of all tasks. The proposed method consists of an estimator and a
selector. The estimator is based on a backbone architecture and structured
hierarchically. It can produce multiple different network models of different
configurations in a hierarchical structure. The selector chooses a model
dynamically from a pool of candidate models given an input instance. The
selector is a relatively small-size network consisting of a few layers, which
estimates a probability distribution over the candidate models when an input
instance of a task is given. Both estimator and selector are jointly trained in
a unified learning framework in conjunction with a sampling-based learning
strategy, without additional computation steps. We evaluate the proposed
approach on several image classification tasks against existing approaches
that perform model selection or learn multiple tasks. Experimental results
show that our approach achieves not only strong performance compared to
competing methods but also the versatility to perform instance-wise model
selection for multiple tasks.
Comment: ICCV 201
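The selector step described in this abstract (a probability distribution over candidate models, combined with sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example scores, and the choice of sampling during training versus taking the most probable candidate at inference are all assumptions made for illustration.

```python
import math
import random

def softmax(scores):
    """Turn raw selector scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_model(scores, rng=None):
    """Pick a candidate model index for one input instance.

    With an rng, sample from the distribution (assumed training behaviour);
    without one, take the most probable candidate (assumed inference behaviour).
    """
    probs = softmax(scores)
    if rng is None:
        return max(range(len(probs)), key=probs.__getitem__), probs
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return idx, probs

# Hypothetical selector outputs for three candidate models.
scores = [0.2, 1.5, -0.3]
greedy_idx, probs = select_model(scores)
sampled_idx, _ = select_model(scores, rng=random.Random(0))
```

In a real system the scores would come from the small selector network and each index would map to one configuration of the backbone; both are outside the scope of this sketch.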
Deep Self-Taught Learning for Handwritten Character Recognition
Recent theoretical and empirical work in statistical machine learning has
demonstrated the importance of learning algorithms for deep architectures,
i.e., function classes obtained by composing multiple non-linear
transformations. Self-taught learning (exploiting unlabeled examples or
examples from other distributions) has already been applied to deep learners,
but mostly to show the advantage of unlabeled examples. Here we explore the
advantage brought by out-of-distribution examples. For this purpose we
developed a powerful generator of stochastic variations and noise processes for
character images, including not only affine transformations but also slant,
local elastic deformations, changes in thickness, background images, grey level
changes, contrast, occlusion, and various types of noise. The
out-of-distribution examples are obtained from these highly distorted images or
by including examples of object classes different from those in the target test
set. We show that deep learners benefit more from out-of-distribution
examples than a corresponding shallow learner, at least in the area of
handwritten character recognition. In fact, we show that they beat previously
published results and reach human-level performance on both handwritten digit
classification and 62-class handwritten character recognition.
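A generator of stochastic variations of the kind this abstract describes can be sketched by chaining random distortions. The example below covers only two of the listed effects (slant and noise), uses nearest-neighbour resampling on a list-of-lists image, and picks its parameter ranges arbitrarily. All names and defaults are assumptions, not the authors' generator.

```python
import random

def slant(image, shear):
    """Shear each row horizontally (nearest-neighbour), filling gaps with 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        shift = round(shear * (y - h / 2))  # rows further from the centre move more
        for x in range(w):
            src = x - shift
            if 0 <= src < w:
                out[y][x] = image[y][src]
    return out

def add_noise(image, prob, rng):
    """Salt-and-pepper noise: each pixel becomes 0 or 1 with probability prob."""
    return [[rng.choice((0.0, 1.0)) if rng.random() < prob else p for p in row]
            for row in image]

def distort(image, rng, max_shear=0.5, noise_prob=0.05):
    """Apply a random slant followed by random noise (hypothetical defaults)."""
    shear = rng.uniform(-max_shear, max_shear)
    return add_noise(slant(image, shear), noise_prob, rng)

# A 4x4 binary image containing a vertical stroke.
stroke = [[0.0, 1.0, 0.0, 0.0] for _ in range(4)]
distorted = distort(stroke, random.Random(0))
```

Each call with a different random state yields a different variant of the same character, which is the property a data generator of this kind relies on.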
A scalable saliency-based Feature selection method with instance level information
Classic feature selection techniques remove features that are either
irrelevant or redundant, yielding a subset of relevant features that supports
better knowledge extraction. This allows the creation of compact
models that are easier to interpret. Most of these techniques work over the
whole dataset, but they cannot provide useful information when only
instance-level information is needed. In short, given any
example, classic feature selection algorithms give no information about
which features are most relevant for that particular sample. This work aims
to overcome this handicap by developing a novel feature selection method,
called Saliency-based Feature Selection (SFS), based on deep-learning saliency
techniques. Our experimental results show that this algorithm can be
successfully used not only with neural networks, but also with any
architecture trained using gradient descent techniques.
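The idea of instance-level relevance can be illustrated with a generic gradient-saliency sketch: rank the features of a single sample by how strongly the model output reacts to each of them. The abstract does not specify the SFS algorithm, so this is not SFS itself. The toy model, its fixed weights, and the use of finite differences in place of backpropagated gradients are all assumptions.

```python
import math

def model(x):
    """Hypothetical differentiable model: a tiny fixed two-unit tanh network."""
    h0 = math.tanh(2.0 * x[0] - 1.0 * x[1])
    h1 = math.tanh(0.5 * x[1] + 1.5 * x[2])
    return 1.0 / (1.0 + math.exp(-(h0 + h1)))

def saliency(f, x, eps=1e-5):
    """Absolute central-difference gradient of f with respect to each feature."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append(abs(f(hi) - f(lo)) / (2 * eps))
    return grads

def top_features(f, x, k):
    """Indices of the k most salient features for this particular instance."""
    s = saliency(f, x)
    return sorted(range(len(x)), key=lambda i: s[i], reverse=True)[:k]

# The ranking depends on the instance: feature 0 matters at the origin,
# but its hidden unit saturates once x[0] is large.
near_origin = top_features(model, [0.0, 0.0, 0.0], 2)
saturated = top_features(model, [3.0, 0.0, 0.0], 2)
```

The two calls return different rankings for the same model, which is the instance-level behaviour that dataset-wide feature selection cannot express.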