429 research outputs found
Invariances of random fields paths, with applications in Gaussian Process Regression
We study pathwise invariances of centred random fields that can be controlled
through the covariance. A result involving composition operators is obtained in
second-order settings, and we show that various path properties including
additivity boil down to invariances of the covariance kernel. These results are
extended to a broader class of operators in the Gaussian case, via the Lo\`eve
isometry. Several covariance-driven pathwise invariances are illustrated,
including fields with symmetric paths, centred paths, harmonic paths, or sparse
paths. The proposed approach delivers a number of promising results and
perspectives in Gaussian process regression
Learning to Extract Motion from Videos in Convolutional Neural Networks
This paper shows how to extract dense optical flow from videos with a
convolutional neural network (CNN). The proposed model constitutes a potential
building block for deeper architectures to allow using motion without resorting
to an external algorithm, \eg for recognition in videos. We derive our network
architecture from signal processing principles to provide desired invariances
to image contrast, phase and texture. We constrain weights within the network
to enforce strict rotation invariance and substantially reduce the number of
parameters to learn. We demonstrate end-to-end training on only 8 sequences of
the Middlebury dataset, orders of magnitude less than competing CNN-based
motion estimation methods, and obtain comparable performance to classical
methods on the Middlebury benchmark. Importantly, our method outputs a
distributed representation of motion that allows representing multiple,
transparent motions, and dynamic textures. Our contributions on network design
and rotation invariance offer insights nonspecific to motion estimation
Accelerating decision making under partial observability using learned action priors
Thesis (M.Sc.)--University of the Witwatersrand, Faculty of Science, School of Computer Science and Applied Mathematics, 2017.Partially Observable Markov Decision Processes (POMDPs) provide a principled mathematical
framework allowing a robot to reason about the consequences of actions and
observations with respect to the agent's limited perception of its environment. They
allow an agent to plan and act optimally in uncertain environments. Although they
have been successfully applied to various robotic tasks, they are infamous for their high
computational cost. This thesis demonstrates the use of knowledge transfer, learned
from previous experiences, to accelerate the learning of POMDP tasks. We propose
that in order for an agent to learn to solve these tasks quicker, it must be able to generalise
from past behaviours and transfer knowledge, learned from solving multiple tasks,
between di erent circumstances. We present a method for accelerating this learning
process by learning the statistics of action choices over the lifetime of an agent, known
as action priors. Action priors specify the usefulness of actions in situations and allow
us to bias exploration, which in turn improves the performance of the learning process.
Using navigation domains, we study the degree to which transferring knowledge
between tasks in this way results in a considerable speed up in solution times.
This thesis therefore makes the following contributions. We provide an algorithm
for learning action priors from a set of approximately optimal value functions and two
approaches with which a prior knowledge over actions can be used in a POMDP context.
As such, we show that considerable gains in speed can be achieved in learning subsequent
tasks using prior knowledge rather than learning from scratch. Learning with
action priors can particularly be useful in reducing the cost of exploration in the early
stages of the learning process as the priors can act as mechanism that allows the agent
to select more useful actions given particular circumstances. Thus, we demonstrate how
the initial losses associated with unguided exploration can be alleviated through the
use of action priors which allow for safer exploration. Additionally, we illustrate that
action priors can also improve the computation speeds of learning feasible policies in a
shorter period of time.MT201
Understanding structure of concurrent actions
Whereas most work in reinforcement learning (RL) ignores the structure or relationships between actions, in this paper we show that exploiting structure in the action space can improve sample efficiency during exploration. To show this we focus on concurrent action spaces where the RL agent selects multiple actions per timestep. Concurrent action spaces are challenging to learn in especially if the number of actions is large as this can lead to a combinatorial explosion of the action space.
This paper proposes two methods: a first approach uses implicit structure to perform high-level action elimination using task-invariant actions; a second approach looks for more explicit structure in the form of action clusters. Both methods are context-free, focusing only on an analysis of the action space and show a significant improvement in policy convergence times
Time-Contrastive Networks: Self-Supervised Learning from Video
We propose a self-supervised approach for learning representations and
robotic behaviors entirely from unlabeled videos recorded from multiple
viewpoints, and study how this representation can be used in two robotic
imitation settings: imitating object interactions from videos of humans, and
imitating human poses. Imitation of human behavior requires a
viewpoint-invariant representation that captures the relationships between
end-effectors (hands or robot grippers) and the environment, object attributes,
and body pose. We train our representations using a metric learning loss, where
multiple simultaneous viewpoints of the same observation are attracted in the
embedding space, while being repelled from temporal neighbors which are often
visually similar but functionally different. In other words, the model
simultaneously learns to recognize what is common between different-looking
images, and what is different between similar-looking images. This signal
causes our model to discover attributes that do not change across viewpoint,
but do change across time, while ignoring nuisance variables such as
occlusions, motion blur, lighting and background. We demonstrate that this
representation can be used by a robot to directly mimic human poses without an
explicit correspondence, and that it can be used as a reward function within a
reinforcement learning algorithm. While representations are learned from an
unlabeled collection of task-related videos, robot behaviors such as pouring
are learned by watching a single 3rd-person demonstration by a human. Reward
functions obtained by following the human demonstrations under the learned
representation enable efficient reinforcement learning that is practical for
real-world robotic systems. Video results, open-source code and dataset are
available at https://sermanet.github.io/imitat
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
- …