41 research outputs found

    Identifying Rare and Subtle Behaviors: A Weakly Supervised Joint Topic Model

    Get PDF

    Meta-Learning in Neural Networks: A Survey

    Get PDF
    The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research

    Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels.

    Get PDF
    The problem of estimating subjective visual properties from image and video has attracted increasing interest. A subjective visual property is useful either on its own (e.g. image and video interestingness) or as an intermediate representation for visual recognition (e.g. a relative attribute). Due to its ambiguous nature, annotating the value of a subjective visual property for learning a prediction model is challenging. To make the annotation more reliable, recent studies employ crowdsourcing tools to collect pairwise comparison labels because human annotators are much better at ranking two images/videos (e.g. which one is more interesting) than giving an absolute value to each of them separately. However, using crowdsourced data also introduces outliers. Existing methods rely on majority voting to prune the annotation outliers/errors. They thus require large amount of pairwise labels to be collected. More importantly as a local outlier detection method, majority voting is ineffective in identifying outliers that can cause global ranking inconsistencies. In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning to rank problem, tackling both the outlier detection and learning to rank jointly. Differing from existing methods, the proposed method integrates local pairwise comparison labels together to minimise a cost that corresponds to global inconsistency of ranking order. This not only leads to better detection of annotation outliers but also enables learning with extremely sparse annotations. Extensive experiments on various benchmark datasets demonstrate that our new approach significantly outperforms state-of-the-arts alternatives.Comment: 14 pages, accepted by IEEE TPAM

    Structure Inference for Bayesian Multisensory Perception and Tracking

    Get PDF
    We investigate a solution to the problem of multisensor perception and tracking by formulating it in the framework of Bayesian model selection. Humans robustly associate multi-sensory data as appropriate, but previous theoretical work has focused largely on purely integrative cases, leaving segregation unaccounted for and unexploited by machine perception systems. We illustrate a unifying, Bayesian solution to multi-sensor perception and tracking which accounts for both integration and segregation by explicit probabilistic reasoning about data association in a temporal context. Unsupervised learning of such a model with EM is illustrated for a real world audio-visual application.

    Multisensory Oddity Detection as Bayesian Inference

    Get PDF
    A key goal for the perceptual system is to optimally combine information from all the senses that may be available in order to develop the most accurate and unified picture possible of the outside world. The contemporary theoretical framework of ideal observer maximum likelihood integration (MLI) has been highly successful in modelling how the human brain combines information from a variety of different sensory modalities. However, in various recent experiments involving multisensory stimuli of uncertain correspondence, MLI breaks down as a successful model of sensory combination. Within the paradigm of direct stimulus estimation, perceptual models which use Bayesian inference to resolve correspondence have recently been shown to generalize successfully to these cases where MLI fails. This approach has been known variously as model inference, causal inference or structure inference. In this paper, we examine causal uncertainty in another important class of multi-sensory perception paradigm – that of oddity detection and demonstrate how a Bayesian ideal observer also treats oddity detection as a structure inference problem. We validate this approach by showing that it provides an intuitive and quantitative explanation of an important pair of multi-sensory oddity detection experiments – involving cues across and within modalities – for which MLI previously failed dramatically, allowing a novel unifying treatment of within and cross modal multisensory perception. Our successful application of structure inference models to the new ‘oddity detection’ paradigm, and the resultant unified explanation of across and within modality cases provide further evidence to suggest that structure inference may be a commonly evolved principle for combining perceptual information in the brain

    A Comprehensive Model of Audiovisual Perception: Both Percept and Temporal Dynamics

    Get PDF
    The sparse information captured by the sensory systems is used by the brain to apprehend the environment, for example, to spatially locate the source of audiovisual stimuli. This is an ill-posed inverse problem whose inherent uncertainty can be solved by jointly processing the information, as well as introducing constraints during this process, on the way this multisensory information is handled. This process and its result - the percept - depend on the contextual conditions perception takes place in. To date, perception has been investigated and modeled on the basis of either one of two of its dimensions: the percept or the temporal dynamics of the process. Here, we extend our previously proposed audiovisual perception model to predict both these dimensions to capture the phenomenon as a whole. Starting from a behavioral analysis, we use a data-driven approach to elicit a Bayesian network which infers the different percepts and dynamics of the process. Context-specific independence analyses enable us to use the model's structure to directly explore how different contexts affect the way subjects handle the same available information. Hence, we establish that, while the percepts yielded by a unisensory stimulus or by the non-fusion of multisensory stimuli may be similar, they result from different processes, as shown by their differing temporal dynamics. Moreover, our model predicts the impact of bottom-up (stimulus driven) factors as well as of top-down factors (induced by instruction manipulation) on both the perception process and the percept itself

    Free-hand sketch synthesis with deformable stroke models

    Get PDF
    We present a generative model which can automatically summarize the stroke composition of free-hand sketches of a given category. When our model is fit to a collection of sketches with similar poses, it discovers and learns the structure and appearance of a set of coherent parts, with each part represented by a group of strokes. It represents both consistent (topology) as well as diverse aspects (structure and appearance variations) of each sketch category. Key to the success of our model are important insights learned from a comprehensive study performed on human stroke data. By fitting this model to images, we are able to synthesize visually similar and pleasant free-hand sketches
    corecore