Meta-Learning in Neural Networks: A Survey
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in
interest in recent years. Contrary to conventional approaches to AI where tasks
are solved from scratch using a fixed learning algorithm, meta-learning aims to
improve the learning algorithm itself, given the experience of multiple
learning episodes. This paradigm provides an opportunity to tackle many
conventional challenges of deep learning, including data and computation
bottlenecks, as well as generalization. This survey describes the contemporary
meta-learning landscape. We first discuss definitions of meta-learning and
position it with respect to related fields, such as transfer learning and
hyperparameter optimization. We then propose a new taxonomy that provides a
more comprehensive breakdown of the space of meta-learning methods today. We
survey promising applications and successes of meta-learning, such as few-shot
learning and reinforcement learning. Finally, we discuss outstanding challenges
and promising areas for future research.
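As a rough illustration of the learning-to-learn idea described above, the sketch below runs a first-order, MAML-style bi-level loop on toy 1-D regression episodes. The task distribution, single-parameter model, and learning rates are assumptions made for this example, not anything prescribed by the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Each 'learning episode' is a 1-D linear regression task y = w*x with a random slope."""
    w = rng.uniform(0.5, 2.5)
    x = rng.uniform(-1.0, 1.0, size=10)
    return x, w * x + 0.05 * rng.standard_normal(10)

def loss_grad(theta, x, y):
    """Gradient of the mean squared error for the scalar model y_hat = theta * x."""
    return 2.0 * np.mean((theta * x - y) * x)

theta = 0.0                      # meta-parameter: a shared initialisation for all tasks
inner_lr, outer_lr = 0.1, 0.01

for step in range(2000):
    x, y = sample_task()
    # Inner loop: adapt to the current task starting from the shared initialisation.
    adapted = theta - inner_lr * loss_grad(theta, x, y)
    # Outer loop: update the initialisation so that a single adaptation step works well
    # (first-order approximation: the derivative through the inner step is ignored).
    theta -= outer_lr * loss_grad(adapted, x, y)

# With slopes drawn around 1.5, the learned initialisation settles near the centre of
# the task distribution, so one gradient step adapts quickly to any new task.
print("meta-learned initialisation:", round(theta, 3))
```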
Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels.
The problem of estimating subjective visual properties from images and videos
has attracted increasing interest. A subjective visual property is useful
either on its own (e.g. image and video interestingness) or as an intermediate
representation for visual recognition (e.g. a relative attribute). Due to its
ambiguous nature, annotating the value of a subjective visual property for
learning a prediction model is challenging. To make the annotation more
reliable, recent studies employ crowdsourcing tools to collect pairwise
comparison labels because human annotators are much better at ranking two
images/videos (e.g. which one is more interesting) than giving an absolute
value to each of them separately. However, using crowdsourced data also
introduces outliers. Existing methods rely on majority voting to prune the
annotation outliers/errors. They thus require a large number of pairwise labels
to be collected. More importantly, as a local outlier detection method, majority
voting is ineffective in identifying outliers that can cause global ranking
inconsistencies. In this paper, we propose a more principled way to identify
annotation outliers by formulating the subjective visual property prediction
task as a unified robust learning to rank problem, tackling both the outlier
detection and learning to rank jointly. Differing from existing methods, the
proposed method integrates local pairwise comparison labels together to
minimise a cost that corresponds to global inconsistency of ranking order. This
not only leads to better detection of annotation outliers but also enables
learning with extremely sparse annotations. Extensive experiments on various
benchmark datasets demonstrate that our new approach significantly outperforms
state-of-the-art alternatives.
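To make the joint formulation concrete, here is a toy sketch of robust learning to rank with explicit per-label outlier variables: scores and outliers are estimated by alternating a least-squares ranking step with a soft-thresholding step, so that only labels causing global ranking inconsistency are absorbed by the outlier term. The objective, thresholding rule, and data below are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

# Pairwise labels (i, j) meaning "item i was judged to rank above item j".
# The last label deliberately contradicts the others and plays the role of an outlier.
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 1), (2, 3), (1, 0)]
n, m = 6, len(pairs)

A = np.zeros((m, n))
y = np.ones(m)                   # each comparison asks for a margin of roughly 1
for k, (i, j) in enumerate(pairs):
    A[k, i], A[k, j] = 1.0, -1.0

lam = 0.8                        # sparsity weight on the per-label outlier term
e = np.zeros(m)                  # outlier variable for each pairwise label
for _ in range(50):
    # Ranking step: least-squares fit of item scores s to the "cleaned" labels y - e.
    s, *_ = np.linalg.lstsq(A, y - e, rcond=None)
    # Outlier step: soft-threshold the residuals, so only labels that are globally
    # inconsistent with the fitted ranking receive a non-zero outlier explanation.
    r = y - A @ s
    e = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

print("estimated ranking (best to worst):", np.argsort(-s))
print("labels flagged as outliers:", [pairs[k] for k in np.flatnonzero(np.abs(e) > 0.5)])
```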
Structure Inference for Bayesian Multisensory Perception and Tracking
We investigate a solution to the problem of multi-sensor perception and tracking by formulating it in the framework of Bayesian model selection. Humans robustly associate multi-sensory data as appropriate, but previous theoretical work has focused largely on purely integrative cases, leaving segregation unaccounted for and unexploited by machine perception systems. We illustrate a unifying, Bayesian solution to multi-sensor perception and tracking which accounts for both integration and segregation by explicit probabilistic reasoning about data association in a temporal context. Unsupervised learning of such a model with EM is illustrated for a real-world audio-visual application.
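A minimal numerical sketch of the integration-versus-segregation idea: compare the evidence for a common-cause structure against an independent-causes structure for one auditory and one visual location measurement, then read off the percept each structure implies. The Gaussian noise levels, prior, and measurements are illustrative assumptions, not the model or data from the paper.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Assumed sensory noise, spatial prior, and measurements (illustrative numbers only).
sigma_a, sigma_v, sigma_p = 4.0, 1.0, 10.0
x_a, x_v = 6.0, 1.0              # one auditory and one visual location estimate
p_common = 0.5                   # prior probability that both cues share a cause

s = np.linspace(-40, 40, 4001)   # grid over candidate source locations
ds = s[1] - s[0]
prior = gauss(s, 0.0, sigma_p)

# Structure 1: a single common source generated both measurements.
lik_common = np.sum(gauss(x_a, s, sigma_a) * gauss(x_v, s, sigma_v) * prior) * ds
# Structure 2: two independent sources, one per modality.
lik_a = np.sum(gauss(x_a, s, sigma_a) * prior) * ds
lik_v = np.sum(gauss(x_v, s, sigma_v) * prior) * ds
lik_separate = lik_a * lik_v

post_common = (lik_common * p_common /
               (lik_common * p_common + lik_separate * (1 - p_common)))
print(f"P(common cause | data) = {post_common:.3f}")

# Percept under each structure: reliability-weighted fusion if the cues share a cause
# (ignoring the spatial prior for simplicity), separate estimates otherwise.
fused = (x_a / sigma_a**2 + x_v / sigma_v**2) / (1 / sigma_a**2 + 1 / sigma_v**2)
print(f"fused estimate = {fused:.2f}, segregated estimates = ({x_a}, {x_v})")
```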
Multisensory Oddity Detection as Bayesian Inference
A key goal for the perceptual system is to optimally combine information from all the senses that may be available in order to develop the most accurate and unified picture possible of the outside world. The contemporary theoretical framework of ideal observer maximum likelihood integration (MLI) has been highly successful in modelling how the human brain combines information from a variety of different sensory modalities. However, in various recent experiments involving multisensory stimuli of uncertain correspondence, MLI breaks down as a successful model of sensory combination. Within the paradigm of direct stimulus estimation, perceptual models which use Bayesian inference to resolve correspondence have recently been shown to generalize successfully to these cases where MLI fails. This approach has been known variously as model inference, causal inference or structure inference. In this paper, we examine causal uncertainty in another important class of multisensory perception paradigm, that of oddity detection, and demonstrate how a Bayesian ideal observer also treats oddity detection as a structure inference problem. We validate this approach by showing that it provides an intuitive and quantitative explanation of an important pair of multisensory oddity detection experiments, involving cues across and within modalities, for which MLI previously failed dramatically, allowing a novel unifying treatment of within- and cross-modal multisensory perception. Our successful application of structure inference models to the new ‘oddity detection’ paradigm, and the resultant unified explanation of across- and within-modality cases, provide further evidence to suggest that structure inference may be a commonly evolved principle for combining perceptual information in the brain.
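In the same spirit, the toy example below treats a three-stimulus oddity judgement as structure inference: for each candidate odd item the observer scores the hypothesis that the remaining two measurements share one latent source while the candidate has its own, and normalises these scores into a posterior. All distributions and numbers are assumed for illustration and do not reproduce the experimental conditions studied in the paper.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

sigma, sigma_p = 1.0, 5.0            # assumed sensory noise and prior width
x = np.array([0.2, -0.1, 2.3])       # three noisy measurements; the third looks odd

grid = np.linspace(-25, 25, 2001)
dg = grid[1] - grid[0]
prior = gauss(grid, 0.0, sigma_p)

def evidence(values):
    """Marginal likelihood that all given measurements share one latent source value."""
    lik = np.ones_like(grid)
    for v in values:
        lik = lik * gauss(v, grid, sigma)
    return np.sum(lik * prior) * dg

# Hypothesis k: item k has its own latent source, the other two items share one.
scores = np.array([evidence(np.delete(x, k)) * evidence(x[[k]]) for k in range(3)])
posterior = scores / scores.sum()    # uniform prior over which item is the odd one
print("P(item k is the odd one | data) =", np.round(posterior, 3))
```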
A Comprehensive Model of Audiovisual Perception: Both Percept and Temporal Dynamics
The sparse information captured by the sensory systems is used by the brain to apprehend the environment, for example, to spatially locate the source of audiovisual stimuli. This is an ill-posed inverse problem whose inherent uncertainty can be resolved by jointly processing the information and by introducing constraints on the way this multisensory information is handled. This process and its result, the percept, depend on the contextual conditions in which perception takes place. To date, perception has been investigated and modeled on the basis of only one of its two dimensions: the percept or the temporal dynamics of the process. Here, we extend our previously proposed audiovisual perception model to predict both of these dimensions and thus capture the phenomenon as a whole. Starting from a behavioral analysis, we use a data-driven approach to elicit a Bayesian network which infers the different percepts and the dynamics of the process. Context-specific independence analyses enable us to use the model's structure to directly explore how different contexts affect the way subjects handle the same available information. Hence, we establish that, while the percepts yielded by a unisensory stimulus or by the non-fusion of multisensory stimuli may be similar, they result from different processes, as shown by their differing temporal dynamics. Moreover, our model predicts the impact of bottom-up (stimulus-driven) factors as well as of top-down factors (induced by instruction manipulation) on both the perception process and the percept itself.
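To make "context-specific independence" concrete, the following toy simulation (with made-up probabilities, unrelated to the paper's data) generates a response Y that depends on a cue X in one context but not in the other, which is exactly the kind of structural statement such analyses extract from a Bayesian network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Context C selects which mechanism generates the response Y.
# In context 0, Y depends on the cue X; in context 1 it does not.
# That is a context-specific independence: Y is independent of X given C = 1.
def sample(n):
    C = rng.integers(0, 2, n)
    X = rng.integers(0, 2, n)
    Y = np.where(C == 0, X ^ (rng.random(n) < 0.1), rng.random(n) < 0.5)
    return C, X, Y.astype(int)

C, X, Y = sample(100_000)
for c in (0, 1):
    p = [Y[(C == c) & (X == x)].mean() for x in (0, 1)]
    print(f"context {c}: P(Y=1|X=0)={p[0]:.2f}  P(Y=1|X=1)={p[1]:.2f}")
# In context 0 the two conditionals differ; in context 1 they match,
# so the link from X to Y is effectively absent in that context.
```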
Free-hand sketch synthesis with deformable stroke models
We present a generative model which can automatically summarize the stroke
composition of free-hand sketches of a given category. When our model is fit to
a collection of sketches with similar poses, it discovers and learns the
structure and appearance of a set of coherent parts, with each part represented
by a group of strokes. It represents both the consistent (topology) and the
diverse (structure and appearance variation) aspects of each sketch category.
Key to the success of our model are important insights learned from a
comprehensive study performed on human stroke data. By fitting this model to
images, we are able to synthesize visually similar and pleasing free-hand
sketches.
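The part-based, consistent-topology / variable-appearance idea can be caricatured with a tiny generative model: each part is a mean polyline plus a random offset and point-wise deformation, so every sample shares the same stroke topology while varying in structure and appearance. The parts, category, and noise levels below are purely illustrative assumptions, not the learned model from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "deformable stroke model": each part is a mean stroke (a polyline).
# The mean strokes loosely describe a face-like category (illustrative only).
t = np.linspace(0, 2 * np.pi, 40)
mean_parts = {
    "head":  np.column_stack([np.cos(t), np.sin(t)]),
    "eye_l": np.array([[-0.4, 0.3], [-0.2, 0.3]]),
    "eye_r": np.array([[0.2, 0.3], [0.4, 0.3]]),
    "mouth": np.array([[-0.3, -0.4], [0.0, -0.5], [0.3, -0.4]]),
}

def sample_sketch(offset_std=0.05, deform_std=0.02):
    """Draw one synthetic sketch: every part keeps its stroke topology but varies
    in position and shape, mimicking the consistent-vs-diverse aspects of a category."""
    sketch = {}
    for name, pts in mean_parts.items():
        offset = rng.normal(0.0, offset_std, size=2)      # per-part rigid shift
        deform = rng.normal(0.0, deform_std, size=pts.shape)  # point-wise deformation
        sketch[name] = pts + offset + deform
    return sketch

for i in range(3):
    s = sample_sketch()
    print(f"sketch {i}: parts = {list(s)}, total points = {sum(len(p) for p in s.values())}")
```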