Task Transfer by Preference-Based Cost Learning
The goal of task transfer in reinforcement learning is to migrate an agent's action policy from a source task to a target task. Despite their success in robotic action planning, current methods mostly rely on two requirements: expert demonstrations exactly relevant to the target task, or an explicitly coded cost function on the target task, both of which are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework in which expert preference serves as guidance. In particular, we alternate between the following two steps: first, experts apply pre-defined preference rules to select demonstrations relevant to the target task; second, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL, and generate more trajectories from the learned target distribution for the next round of preference selection. Theoretical analyses of the distribution learning and of the convergence of the proposed algorithm are provided. Extensive simulations on several benchmarks further verify the effectiveness of the proposed method.
Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed equally to this work.
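To make the alternation concrete, here is a minimal sketch of the loop described in this abstract. All names (preference_rule, learn_irl, sample_trajectories) are illustrative placeholders, not the authors' API; the IRL learner and sampler are passed in as callables, since the paper's enhanced Adversarial MaxEnt IRL is not reproduced here.

```python
# Hedged sketch of the alternating preference-selection / IRL loop.
def preference_based_task_transfer(source_demos, preference_rule,
                                   learn_irl, sample_trajectories,
                                   n_rounds=10, n_samples=100):
    trajectories = list(source_demos)       # start from source-task demos
    cost_fn, policy = None, None
    for _ in range(n_rounds):
        # Step 1: experts apply pre-defined preference rules to keep
        # trajectories that look relevant to the target task.
        selected = [t for t in trajectories if preference_rule(t)]
        # Step 2: jointly learn the target cost function and trajectory
        # distribution (Adversarial MaxEnt IRL in the paper).
        cost_fn, policy = learn_irl(selected)
        # Sample fresh trajectories from the learned distribution for
        # the next round of preference selection.
        trajectories = selected + sample_trajectories(policy, n_samples)
    return cost_fn, policy
```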
Subequivariant Graph Reinforcement Learning in 3D Environments
Learning a shared policy that guides the locomotion of different agents is of core interest in Reinforcement Learning (RL) and has led to the study of morphology-agnostic RL. However, existing benchmarks are highly restrictive in the choice of starting point and target point, constraining the movement of the agents to 2D space. In this work, we propose a novel setup for morphology-agnostic RL, dubbed Subequivariant Graph RL in 3D environments (3D-SGRL). Specifically, we first introduce a new set of more practical yet challenging benchmarks in 3D space that allow the agent full degrees of freedom to explore in arbitrary directions from arbitrary starting configurations. Moreover, to optimize the policy over the enlarged state-action space, we propose to inject geometric symmetry, i.e., subequivariance, into the modeling of the policy and Q-function so that the policy generalizes to all directions, improving exploration efficiency. This goal is achieved by a novel SubEquivariant Transformer (SET) that permits expressive message exchange. Finally, we evaluate our method on the proposed benchmarks, where it consistently and significantly outperforms existing approaches in single-task, multi-task, and zero-shot generalization scenarios. Extensive ablations are also conducted to verify our design. Code and videos are available on our project page: https://alpc91.github.io/SGRL/.
Comment: ICML 2023 Oral.
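The key constraint here, subequivariance, is equivariance restricted to the subgroup of rotations about the gravity (z) axis. The toy policy below is only a numerical illustration of that property, not the SET architecture: rotating the input about z and then applying the policy gives the same result as applying the policy and then rotating its output.

```python
import numpy as np

def rot_z(theta):
    # Rotation about the gravity (z) axis: the subgroup under which a
    # subequivariant policy must be equivariant.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def toy_policy(x):
    # A trivially subequivariant map: scale the horizontal component by a
    # rotation-invariant gain (a function of the horizontal norm) and
    # leave the vertical component untouched.
    gain = 1.0 / (1.0 + np.linalg.norm(x[:2]))
    return np.array([gain * x[0], gain * x[1], x[2]])

theta, x = 0.7, np.array([1.0, 2.0, 3.0])
lhs = toy_policy(rot_z(theta) @ x)   # rotate input, then evaluate policy
rhs = rot_z(theta) @ toy_policy(x)   # evaluate policy, then rotate output
assert np.allclose(lhs, rhs)         # pi(R x) == R pi(x) for rotations about z
```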
A Hilbert-Type Integral Inequality with Multiparameters and a Nonhomogeneous Kernel
We first introduce the Γ-function and the Riemann ζ-function to jointly characterize the constant factor. A Hilbert-type integral inequality with multiple parameters and a nonhomogeneous kernel is established by means of the weight-function method and techniques of real analysis. The equivalent form is considered, and its constant factors are proved to be the best possible. Some meaningful results are obtained by assigning special values to the parameters.
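The abstract does not state the kernel, so as a representative member of this family (not necessarily the paper's exact result), take the nonhomogeneous kernel 1/(e^{xy} − 1): since ∫₀^∞ u^{λ−1}/(e^u − 1) du = Γ(λ)ζ(λ) for λ > 1, the weight-function method yields, for p > 1 with 1/p + 1/q = 1,

```latex
\[
\int_0^{\infty}\!\!\int_0^{\infty} \frac{f(x)\,g(y)}{e^{xy}-1}\,dx\,dy
< \Gamma(\lambda)\,\zeta(\lambda)
\left(\int_0^{\infty} x^{p(1-\lambda)-1} f^{p}(x)\,dx\right)^{1/p}
\left(\int_0^{\infty} y^{q(1-\lambda)-1} g^{q}(y)\,dy\right)^{1/q},
\]
```

where the constant factor Γ(λ)ζ(λ) is the best possible; the paper's multiparameter inequality is of this general shape.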
Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction
Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems remain brittle to the same challenge: it is difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Moreover, although the two problems are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to both multimodal fusion and multitask learning. At its core, CEN dynamically exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For the application of dense image prediction, the validity of CEN is tested in four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of our CEN compared to current state-of-the-art methods. Detailed ablation studies have also been carried out, confirming the advantage of each component we propose.
Comment: 18 pages. arXiv admin note: substantial text overlap with arXiv:2011.0500
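A rough sketch of the exchanging rule described above, assuming two modalities with aligned channel layouts: channels whose BN scaling factor falls below a threshold are treated as unimportant and replaced by the other modality's channels at the same positions. Names and the threshold value are illustrative, not the authors' code.

```python
import torch

def exchange_channels(x1, x2, gamma1, gamma2, threshold=1e-2):
    # x1, x2: features (N, C, H, W) from two modality subnetworks.
    # gamma1, gamma2: BN scaling factors (C,) measuring channel importance.
    weak1 = (gamma1.abs() < threshold).view(1, -1, 1, 1)
    weak2 = (gamma2.abs() < threshold).view(1, -1, 1, 1)
    out1 = torch.where(weak1, x2, x1)   # fill x1's weak channels from x2
    out2 = torch.where(weak2, x1, x2)   # and vice versa
    return out1, out2
```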
Tackling Over-Smoothing for General Graph Convolutional Networks
Increasing the depth of a GCN, which is expected to permit more expressivity, has been shown to incur performance degradation, especially on node classification. The main cause is over-smoothing: it drives the output of a GCN towards a space that contains limited distinguishing information among nodes, leading to poor expressivity. Several refinements of deep GCN architectures have been proposed, but it remains theoretically unknown whether these refinements can relieve over-smoothing. In this paper, we first theoretically analyze how general GCNs behave as depth increases, covering generic GCN, GCN with bias, ResGCN, and APPNP. We find that all these models are characterized by a universal process: all nodes converge to a cuboid. Building on this theorem, we propose DropEdge, which alleviates over-smoothing by randomly removing a certain number of edges at each training epoch. Theoretically, DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by dimension collapse. Experimental evaluations on a simulated dataset visualize the differences in over-smoothing among different GCNs. Moreover, extensive experiments on several real benchmarks show that DropEdge consistently improves performance for a variety of both shallow and deep GCNs.
Comment: Submitted to TPAMI, 15 pages.
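DropEdge itself is easy to state in code. A minimal sketch, assuming a COO edge_index of shape (2, E) as in PyTorch Geometric conventions; this is an illustration, not the authors' released implementation:

```python
import torch

def drop_edge(edge_index, drop_rate=0.2, training=True):
    # Randomly remove a fraction of edges; at evaluation time the
    # full graph is used.
    if not training or drop_rate <= 0.0:
        return edge_index
    keep = torch.rand(edge_index.size(1)) >= drop_rate  # per-edge Bernoulli mask
    return edge_index[:, keep]
```

Re-sampling the mask at every training epoch is the point of the method: the GCN sees a differently sparsified graph each epoch, which is what slows the convergence of over-smoothing.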