Holistic, Instance-Level Human Parsing
Object parsing -- the task of decomposing an object into its semantic parts
-- has traditionally been formulated as a category-level segmentation problem.
Consequently, when there are multiple objects in an image, current methods
cannot count the number of objects in the scene, nor can they determine which
part belongs to which object. We address this problem by segmenting the parts
of objects at an instance-level, such that each pixel in the image is assigned
a part label, as well as the identity of the object it belongs to. Moreover, we
show how this approach benefits us in obtaining segmentations at coarser
granularities as well. Our proposed network is trained end-to-end given
detections, and begins with a category-level segmentation module. Thereafter, a
differentiable Conditional Random Field, defined over a variable number of
instances for every input image, reasons about the identity of each part by
associating it with a human detection. In contrast to other approaches, our
method can handle the varying number of people in each image and our holistic
network produces state-of-the-art results in instance-level part and human
segmentation, together with competitive results in category-level part
segmentation, all achieved by a single forward-pass through our neural network.
Comment: Poster at BMVC 201
Learn to Interpret Atari Agents
Deep Reinforcement Learning (DeepRL) agents surpass human-level performance
in a multitude of tasks. However, the direct mapping from states to actions
makes it hard to interpret the rationale behind the decision making of agents.
In contrast to previous a-posteriori methods of visualizing DeepRL policies, we
propose an end-to-end trainable framework based on Rainbow, a representative
Deep Q-Network (DQN) agent. Our method automatically learns important regions
in the input domain, which enables characterizations of the decision making and
interpretations for non-intuitive behaviors. Hence we name it Region Sensitive
Rainbow (RS-Rainbow). RS-Rainbow utilizes a simple yet effective mechanism to
incorporate visualization ability into the learning model, not only improving
model interpretability, but leading to improved performance. Extensive
experiments on the challenging platform of Atari 2600 demonstrate the
superiority of RS-Rainbow. In particular, our agent achieves state-of-the-art
performance using just 25% of the training frames. Demonstrations and code are available at
https://github.com/yz93/Learn-to-Interpret-Atari-Agents
Unifying Training and Inference for Panoptic Segmentation
We present an end-to-end network to bridge the gap between the training and
inference pipelines for panoptic segmentation, a task that seeks to partition an
image into semantic regions for "stuff" and object instances for "things". In
contrast to recent works, our network exploits a parametrised, yet lightweight
panoptic segmentation submodule, powered by an end-to-end learnt dense instance
affinity, to capture the probability that any pair of pixels belongs to the same
instance. This panoptic submodule gives rise to a novel propagation mechanism
for panoptic logits and enables the network to output a coherent panoptic
segmentation map for both "stuff" and "thing" classes, without any
post-processing. Reaping the benefits of end-to-end training, our full system
sets new records on the popular street scene dataset, Cityscapes, achieving
61.4 PQ with a ResNet-50 backbone using only the fine annotations. On the
challenging COCO dataset, our ResNet-50-based network also delivers
state-of-the-art accuracy of 43.4 PQ. Moreover, our network flexibly works with
and without object mask cues, performing competitively under both settings,
which is of interest for applications with computation budgets.
Comment: CVPR 202
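The propagation idea in this abstract (a dense pairwise affinity used to spread panoptic logits between pixels) can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' submodule: the function name `propagate_logits`, the row normalisation, and the hand-written affinity matrix are all hypothetical, whereas the paper learns the affinity end-to-end.

```python
import numpy as np

def propagate_logits(logits, affinity):
    """Spread per-pixel logits along a pairwise affinity (hypothetical helper).

    logits:   (N, C) array, one row of class/instance scores per pixel.
    affinity: (N, N) non-negative array; entry (i, j) scores how likely
              pixels i and j are to belong to the same instance.
    """
    # Row-normalise so each pixel receives a weighted average of all logits.
    weights = affinity / affinity.sum(axis=1, keepdims=True)
    return weights @ logits

# Toy input: 4 pixels, 2 instance channels. Only pixels 0 and 2 carry evidence.
logits = np.array([[4.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 4.0],
                   [0.0, 0.0]])

# Assumed affinity: pixels {0, 1} belong together, as do pixels {2, 3}.
affinity = np.array([[1.0, 1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0, 1.0]])

propagated = propagate_logits(logits, affinity)
labels = propagated.argmax(axis=1)
print(labels)  # pixels 1 and 3 inherit the label of their high-affinity partner
```

In this toy setting the uninformative pixels take on the label of the pixel they have high affinity with, which is the sense in which an affinity can yield a coherent map without a separate post-processing step.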
Local and non-local measures of acceleration in cosmology
Current cosmological observations, when interpreted within the framework of a
homogeneous and isotropic Friedmann-Lemaitre-Robertson-Walker (FLRW) model,
strongly suggest that the Universe is entering a period of accelerating
expansion. This is often taken to mean that the expansion of space itself is
accelerating. In a general spacetime, however, this is not necessarily true. We
attempt to clarify this point by considering a handful of local and non-local
measures of acceleration in a variety of inhomogeneous cosmological models.
Each of the chosen measures corresponds to a theoretical or observational
procedure that has previously been used to study acceleration in cosmology, and
all measures reduce to the same quantity in the limit of exact spatial
homogeneity and isotropy. In statistically homogeneous and isotropic
spacetimes, we find that the acceleration inferred from observations of the
distance-redshift relation is closely related to the acceleration of the
spatially averaged universe, but does not necessarily bear any resemblance to
the average of the local acceleration of spacetime itself. For inhomogeneous
spacetimes that do not display statistical homogeneity and isotropy, however,
we find little correlation between acceleration inferred from observations and
the acceleration of the averaged spacetime. This shows that observations made
in an inhomogeneous universe can imply acceleration without the existence of
dark energy.
Comment: 19 pages, 10 figures. Several references added or amended, some minor
clarifications made in the tex
Dynamic Graph Message Passing Networks
Modelling long-range dependencies is critical for complex scene understanding
tasks such as semantic segmentation and object detection. Although CNNs have
excelled in many computer vision tasks, they are still limited in capturing
long-range structured relationships as they typically consist of layers of
local kernels. A fully-connected graph is beneficial for such modelling;
however, its computational overhead is prohibitive. We propose a dynamic graph
message passing network, based on the message passing neural network framework,
that significantly reduces the computational complexity compared to related
works modelling a fully-connected graph. This is achieved by adaptively
sampling nodes in the graph, conditioned on the input, for message passing.
Based on the sampled nodes, we then dynamically predict node-dependent filter
weights and the affinity matrix for propagating information between them. Using
this model, we show significant improvements with respect to strong,
state-of-the-art baselines on three different tasks and backbone architectures.
Our approach also outperforms fully-connected graphs while using substantially
fewer floating point operations and parameters.
Comment: CVPR 2020 Ora
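The cost argument in this abstract (pass messages over a few sampled nodes per node rather than over a fully-connected graph) can be sketched as follows. This is an illustrative sketch only, with loud assumptions: uniform random sampling stands in for the paper's input-conditioned sampler, and a softmax over dot products stands in for the dynamically predicted affinity and filter weights. The function name `sampled_message_passing` and its parameters are hypothetical.

```python
import numpy as np

def sampled_message_passing(features, k, seed=0):
    """One message-passing step over k sampled neighbours per node.

    features: (N, D) node features.
    Returns updated features (N, D), sampled indices (N, k), weights (N, k).
    """
    n, _ = features.shape
    rng = np.random.default_rng(seed)

    # Assumption: uniform random sampling replaces the learned sampler.
    neighbours = rng.integers(0, n, size=(n, k))

    # Affinity is computed only for sampled pairs: O(N * k) instead of O(N^2).
    gathered = features[neighbours]                        # (N, k, D)
    scores = np.einsum("nd,nkd->nk", features, gathered)   # (N, k)

    # Numerically stable softmax over each node's sampled neighbours.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    # Aggregate weighted neighbour features and add them as a residual update.
    messages = (weights[:, :, None] * gathered).sum(axis=1)  # (N, D)
    return features + messages, neighbours, weights

features = np.random.default_rng(1).normal(size=(6, 4))
updated, neighbours, weights = sampled_message_passing(features, k=3)
print(updated.shape, neighbours.shape)
```

With N nodes and k sampled neighbours, the affinity here has N × k entries rather than N × N, which is the source of the reduced computation the abstract describes.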