27 research outputs found
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics has
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.Comment: Accepted to the Thirthy-Fourth AAAI Conference On Artificial
Intelligence (AAAI), 202
Towards Robust Curve Text Detection with Conditional Spatial Expansion
It is challenging to detect curve texts due to their irregular shapes and
varying sizes. In this paper, we first investigate the deficiency of the
existing curve detection methods and then propose a novel Conditional Spatial
Expansion (CSE) mechanism to improve the performance of curve text detection.
Instead of regarding the curve text detection as a polygon regression or a
segmentation problem, we treat it as a region expansion process. Our CSE starts
with a seed arbitrarily initialized within a text region and progressively
merges neighborhood regions based on the extracted local features by a CNN and
contextual information of merged regions. The CSE is highly parameterized and
can be seamlessly integrated into existing object detection frameworks.
Enhanced by the data-dependent CSE mechanism, our curve text detection system
provides robust instance-level text region extraction with minimal
post-processing. The analysis experiment shows that our CSE can handle texts
with various shapes, sizes, and orientations, and can effectively suppress the
false-positives coming from text-like textures or unexpected texts included in
the same RoI. Compared with the existing curve text detection algorithms, our
method is more robust and enjoys a simpler processing flow. It also creates a
new state-of-art performance on curve text benchmarks with F-score of up to
78.4.Comment: This paper has been accepted by IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR 2019
Pretraining in Deep Reinforcement Learning: A Survey
The past few years have seen rapid progress in combining reinforcement
learning (RL) with deep learning. Various breakthroughs ranging from games to
robotics have spurred the interest in designing sophisticated RL algorithms and
systems. However, the prevailing workflow in RL is to learn tabula rasa, which
may incur computational inefficiency. This precludes continuous deployment of
RL algorithms and potentially excludes researchers without large-scale
computing resources. In many other areas of machine learning, the pretraining
paradigm has shown to be effective in acquiring transferable knowledge, which
can be utilized for a variety of downstream tasks. Recently, we saw a surge of
interest in Pretraining for Deep RL with promising results. However, much of
the research has been based on different experimental settings. Due to the
nature of RL, pretraining in this field is faced with unique challenges and
hence requires new design principles. In this survey, we seek to systematically
review existing works in pretraining for deep reinforcement learning, provide a
taxonomy of these methods, discuss each sub-field, and bring attention to open
problems and future directions
Revisiting Discrete Soft Actor-Critic
We study the adaption of soft actor-critic (SAC) from continuous action space
to discrete action space. We revisit vanilla SAC and provide an in-depth
understanding of its Q value underestimation and performance instability issues
when applied to discrete settings. We thereby propose entropy-penalty and
double average Q-learning with Q-clip to address these issues. Extensive
experiments on typical benchmarks with discrete action space, including Atari
games and a large-scale MOBA game, show the efficacy of our proposed method.
Our code is at:https://github.com/coldsummerday/Revisiting-Discrete-SAC