Transfer Value Iteration Networks
Value iteration networks (VINs) have been demonstrated to have a good
generalization ability for reinforcement learning tasks across similar domains.
However, based on our experiments, a policy learned by VINs still fails to
generalize well on the domain whose action space and feature space are not
identical to those in the domain where it is trained. In this paper, we propose
a transfer learning approach on top of VINs, termed Transfer VINs (TVINs), such
that a learned policy from a source domain can be generalized to a target
domain with only limited training data, even if the source domain and the
target domain have domain-specific actions and features. We empirically verify
that our proposed TVINs outperform VINs when the source and the target domains
have similar but not identical action and feature spaces. Furthermore, we show
that the performance improvement is consistent across different environments,
maze sizes, and dataset sizes, as well as different values of hyperparameters
such as the number of iterations and the kernel size.
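For orientation, the classical value iteration procedure that VINs approximate with convolution and max operations can be sketched as follows. This is a minimal tabular illustration, not the TVIN method from the abstract: the grid layout, the four deterministic moves, the reward-on-entry convention, and the names `value_iteration`, `reward`, and `gamma` are all assumptions made for the example.

```python
import numpy as np

def value_iteration(reward, gamma=0.9, iterations=50):
    """Tabular value iteration on a 2D grid with four deterministic moves.

    Assumption for this sketch: the agent receives reward[r, c] when it
    enters cell (r, c), and moves that leave the grid are not allowed.
    """
    height, width = reward.shape
    values = np.zeros((height, width), dtype=float)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    for _ in range(iterations):
        # q[a, r, c] is the value of taking move a in cell (r, c).
        q = np.full((len(moves), height, width), -np.inf)
        for a, (dr, dc) in enumerate(moves):
            for r in range(height):
                for c in range(width):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < height and 0 <= nc < width:
                        q[a, r, c] = reward[nr, nc] + gamma * values[nr, nc]
        # The max over actions plays the role of the max-pooling step in a VIN.
        values = q.max(axis=0)
    return values

# Hypothetical 3x3 maze with a single rewarding cell in the bottom-right corner.
reward = np.zeros((3, 3))
reward[2, 2] = 1.0
values = value_iteration(reward)
```

Cells closer to the rewarding cell end up with higher values, which is the signal a greedy policy follows.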
Improving Deep Reinforcement Learning Using Graph Convolution and Visual Domain Transfer
Recent developments in Deep Reinforcement Learning (DRL) have shown tremendous progress in robotics control, Atari games, board games such as Go, etc. However, model-free DRL still has limited use cases due to its poor sampling efficiency and poor generalization across tasks. In this thesis, two particular drawbacks of DRL are investigated: 1) the poor generalization abilities of model-free DRL, more specifically, how to generalize an agent's policy to unseen environments and to different data representations (e.g. image based or graph based); and 2) the reality gap issue in DRL, that is, how to effectively transfer a policy learned in a simulator to the real world. This thesis makes several novel contributions to the field of DRL, which are outlined sequentially in the following. Among these contributions is the generalized value iteration network (GVIN) algorithm, an end-to-end neural network planning module that extends the work on Value Iteration Networks (VIN). GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. Additionally, this thesis proposes three novel, differentiable kernels as graph convolution operators and shows that the embedding-based kernel achieves the best performance. Furthermore, an improvement upon traditional n-step Q-learning that stabilizes training for VIN and GVIN is demonstrated. Additionally, the equivalence between GVIN and graph neural networks is outlined, and it is shown that GVIN can be further extended to address both control and inference problems. The final subject in the graph domain that is studied in this thesis is graph embeddings. Specifically, this work studies a general graph embedding framework, GEM-F, that unifies most of the previous graph embedding algorithms.
Based on the contributions made during the analysis of GEM-F, a novel algorithm called WarpMap is proposed, which outperforms DeepWalk and node2vec in unsupervised learning settings. The aforementioned reality gap in DRL prevents a significant portion of research from reaching real-world settings. The latter part of this work studies and analyzes domain transfer techniques in an effort to bridge this gap. Typically, domain transfer in RL consists of representation transfer and policy transfer. In this work, the focus is on representation transfer for vision-based applications, more specifically, on aligning the feature representation from the source domain to the target domain in an unsupervised fashion. In this approach, a linear mapping function is considered to fuse modules that are trained in different domains. Two improved adversarial learning methods are proposed to enhance the training quality of the mapping function. Finally, the thesis demonstrates the effectiveness of domain alignment among different weather conditions in the CARLA autonomous driving simulator.
Synchronisation effects on the behavioural performance and information dynamics of a simulated minimally cognitive robotic agent
Oscillatory activity is ubiquitous in nervous systems, with solid evidence that synchronisation mechanisms underpin cognitive processes. Nevertheless, its informational content and relationship with behaviour are still to be fully understood. In addition, cognitive systems cannot be properly appreciated without taking into account brain–body–environment interactions. In this paper, we developed a model based on the Kuramoto Model of coupled phase oscillators to explore the role of neural synchronisation in the performance of a simulated robotic agent in two different minimally cognitive tasks. We show that there is a statistically significant difference in performance and evolvability depending on the synchronisation regime of the network. In both tasks, a combination of information flow and dynamical analyses shows that networks with a definite, but not too strong, propensity for synchronisation are more able to reconfigure, to organise themselves functionally and to adapt to different behavioural conditions. The results highlight the asymmetry of information flow and its behavioural correspondence. Importantly, they also show that neural synchronisation dynamics, when suitably flexible and reconfigurable, can generate minimally cognitive embodied behaviour.
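As a rough illustration of the Kuramoto model of coupled phase oscillators mentioned above, the standard all-to-all form can be simulated in a few lines. This is only a sketch of the textbook model, not the agent controller from the paper: the Euler integration, the step size, and the names `simulate_kuramoto` and `order_parameter` are assumptions made for the example.

```python
import numpy as np

def simulate_kuramoto(natural_freqs, coupling, steps=2000, dt=0.01, seed=0):
    """Euler-integrate d(theta_i)/dt = w_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    rng = np.random.default_rng(seed)
    natural_freqs = np.asarray(natural_freqs, dtype=float)
    n = len(natural_freqs)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n)  # random initial phases
    for _ in range(steps):
        # diffs[i, j] = theta_j - theta_i
        diffs = phases[None, :] - phases[:, None]
        interaction = (coupling / n) * np.sin(diffs).sum(axis=1)
        phases = phases + dt * (natural_freqs + interaction)
    return phases

def order_parameter(phases):
    """Magnitude of the mean phase vector: near 0 is incoherent, 1 is full sync."""
    return float(np.abs(np.exp(1j * np.asarray(phases)).mean()))

# Ten identical oscillators with a positive coupling strength (hypothetical values).
final_phases = simulate_kuramoto(np.ones(10), coupling=2.0)
sync_level = order_parameter(final_phases)
```

Varying `coupling` moves the network between weakly and strongly synchronised regimes, which is the kind of parameter the abstract refers to as the propensity for synchronisation.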
Grounding Language for Transfer in Deep Reinforcement Learning
In this paper, we explore the utilization of natural language to drive
transfer for reinforcement learning (RL). Despite the widespread application
of deep RL techniques, learning generalized policy representations that work
across domains remains a challenging problem. We demonstrate that textual
descriptions of environments provide a compact intermediate channel to
facilitate effective policy transfer. Specifically, by learning to ground the
meaning of text to the dynamics of the environment such as transitions and
rewards, an autonomous agent can effectively bootstrap policy learning on a new
domain given its description. We employ a model-based RL approach consisting of
a differentiable planning module, a model-free component and a factorized state
representation to effectively use entity descriptions. Our model outperforms
prior work on both transfer and multi-task scenarios in a variety of different
environments. For instance, we achieve up to 14% and 11.5% absolute improvement
over previously existing models in terms of average and initial rewards,
respectively. Comment: JAIR 201
A Gossip Algorithm based Clock Synchronization Scheme for Smart Grid Applications
The rising interest in multi-agent based networked systems, and the large
number of applications in the distributed control of the smart grid, lead us to
address the problem of time synchronization in the smart grid. Utility
companies look for new packet based time synchronization solutions with Global
Positioning System (GPS) level accuracies beyond traditional packet methods
such as Network Time Protocol (NTP). However, GPS based solutions have poor
reception in indoor environments and dense urban canyons, and GPS antenna
installation might be costly. Some smart grid nodes, such as Phasor Measurement
Units (PMUs), fault detection, Wide Area Measurement Systems (WAMS) etc.,
require synchronization accuracy as tight as 1 ms. On the other hand, 1 s
accuracy is acceptable in the management information domain. Acknowledging
this, in this study we introduce a gossip algorithm based clock synchronization
method among network entities from the decision control and communication point
of view. Our method synchronizes clocks within a dense network in a
bandwidth-limited environment. Our technique has been tested on different kinds
of network topologies (complete, star and random geometric networks) and
demonstrated satisfactory performance.
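The general idea behind gossip-based synchronization can be sketched with randomized pairwise averaging, in which two neighbouring nodes repeatedly replace their clock offsets with their mean. The abstract does not specify the paper's actual protocol, so this is a generic illustration under assumptions: the function name `gossip_average`, the edge-list representation, and the example offsets are all hypothetical.

```python
import random

def gossip_average(offsets, edges, rounds=2000, seed=0):
    """Randomized pairwise gossip: each round, one random edge averages its endpoints.

    The network-wide mean offset is preserved, and on a connected graph
    all nodes drift toward that common value.
    """
    rng = random.Random(seed)
    values = list(offsets)
    for _ in range(rounds):
        i, j = rng.choice(edges)
        mean = (values[i] + values[j]) / 2.0
        values[i] = values[j] = mean
    return values

# Hypothetical clock offsets in milliseconds on a complete graph of five nodes.
offsets_ms = [0.0, 4.0, -2.0, 10.0, 3.0]
complete_edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
synced = gossip_average(offsets_ms, complete_edges)
```

Each exchange involves only two nodes and one value per node, which is why this style of algorithm suits bandwidth-limited networks; star or random geometric topologies can be tried by changing the edge list.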