20,839 research outputs found
Deep Recurrent Learning for Efficient Image Recognition Using Small Data
Recognition is fundamental yet open and challenging problem in computer vision. Recognition involves the detection and interpretation of complex shapes of objects or persons from previous encounters or knowledge. Biological systems are considered as the most powerful, robust and generalized recognition models. The recent success of learning based mathematical models known as artificial neural networks, especially deep neural networks, have propelled researchers to utilize such architectures for developing bio-inspired computational recognition models. However, the computational complexity of these models increases proportionally to the challenges posed by the recognition problem, and more importantly, these models require a large amount of data for successful learning. Additionally, the feedforward-based hierarchical models do not exploit another important biological learning paradigm, known as recurrency, which ubiquitously exists in the biological visual system and has been shown to be quite crucial for recognition.
Consequently, this work aims to develop novel biologically relevant deep recurrent learning models for robust recognition using limited training data. First, we design an efficient deep simultaneous recurrent network (DSRN) architecture for solving several challenging image recognition tasks. The use of simultaneous recurrency in the proposed model improves the recognition performance and offers reduced computational complexity compared to the existing hierarchical deep learning models. Moreover, the DSRN architecture inherently learns meaningful representations of data during the training process which is essential to achieve superior recognition performance. However, probabilistic models such as deep generative models are particularly adept at learning representations directly from unlabeled input data. Accordingly, we show the generalization of the proposed deep simultaneous recurrency concept by developing a probabilistic deep simultaneous recurrent belief network (DSRBN) architecture which is more efficient in learning the underlying representation of the data compared to the state-of-the-art generative models. Finally, we propose a deep recurrent learning framework for solving the image recognition task using small data. We incorporate Bayesian statistics to the DSRBN generative model to propose a deep recurrent generative Bayesian model that addresses the challenge of learning from a small amount of data. Our findings suggest that the proposed deep recurrent Bayesian framework demonstrates better image recognition performance compared to the state-of-the-art models in a small data learning scenario. In conclusion, this dissertation proposes novel deep recurrent learning pipelines, which utilize not only limited training data to achieve improved image recognition performance but also require significantly reduced training parameters
Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in office environment.Comment: In proceedings of 34th AAAI conference on Artificial Intelligence,
202
Deeply Learning the Messages in Message Passing Inference
Deep structured output learning shows great promise in tasks like semantic
image segmentation. We proffer a new, efficient deep structured model learning
scheme, in which we show how deep Convolutional Neural Networks (CNNs) can be
used to estimate the messages in message passing inference for structured
prediction with Conditional Random Fields (CRFs). With such CNN message
estimators, we obviate the need to learn or evaluate potential functions for
message calculation. This confers significant efficiency for learning, since
otherwise when performing structured learning for a CRF with CNN potentials it
is necessary to undertake expensive inference for every stochastic gradient
iteration. The network output dimension for message estimation is the same as
the number of classes, in contrast to the network output for general CNN
potential functions in CRFs, which is exponential in the order of the
potentials. Hence CNN message learning has fewer network parameters and is more
scalable for cases that a large number of classes are involved. We apply our
method to semantic image segmentation on the PASCAL VOC 2012 dataset. We
achieve an intersection-over-union score of 73.4 on its test set, which is the
best reported result for methods using the VOC training images alone. This
impressive performance demonstrates the effectiveness and usefulness of our CNN
message learning method.Comment: 11 pages. Appearing in Proc. The Twenty-ninth Annual Conference on
Neural Information Processing Systems (NIPS), 2015, Montreal, Canad
Recommended from our members
Neurons and symbols: a manifesto
We discuss the purpose of neural-symbolic integration including its principles, mechanisms and applications. We outline a cognitive computational model for neural-symbolic integration, position the model in the broader context of multi-agent systems, machine learning and automated reasoning, and list some of the challenges for the area of
neural-symbolic computation to achieve the promise of effective integration of robust learning and expressive reasoning under uncertainty
Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective - What musical content is to be generated? Examples are: melody,
polyphony, accompaniment or counterpoint. - For what destination and for what
use? To be performed by a human(s) (in the case of a musical score), or by a
machine (in the case of an audio file).
Representation - What are the concepts to be manipulated? Examples are:
waveform, spectrogram, note, chord, meter and beat. - What format is to be
used? Examples are: MIDI, piano roll or text. - How will the representation be
encoded? Examples are: scalar, one-hot or many-hot.
Architecture - What type(s) of deep neural network is (are) to be used?
Examples are: feedforward network, recurrent network, autoencoder or generative
adversarial networks.
Challenge - What are the limitations and open challenges? Examples are:
variability, interactivity and creativity.
Strategy - How do we model and control the process of generation? Examples
are: single-step feedforward, iterative feedforward, sampling or input
manipulation.
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose some tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
- …