Supplementing Missing Visions via Dialog for Scene Graph Generations
Most current AI systems rely on the premise that the input visual data are
sufficient to achieve competitive performance in various computer vision tasks.
However, the classic task setup rarely considers the challenging, yet common
practical situations where the complete visual data may be inaccessible due to
various reasons (e.g., restricted view range and occlusions). To this end, we
investigate a computer vision task setting with incomplete visual input data.
Specifically, we exploit the Scene Graph Generation (SGG) task with various
levels of visual data missingness as input. While insufficient visual input
intuitively leads to a performance drop, we propose to supplement the missing
visions via natural language dialog interactions to better accomplish the
task objective. We design a model-agnostic Supplementary Interactive Dialog
(SI-Dial) framework that can be jointly learned with most existing models,
endowing the current AI systems with the ability of question-answer
interactions in natural language. We demonstrate the feasibility of such a task
setting with missing visual input and the effectiveness of our proposed dialog
module as the supplementary information source through extensive experiments
and analysis, by achieving promising performance improvement over multiple
baselines.
Comment: ICASSP 202
Towards Robust Video Instance Segmentation with Temporal-Aware Transformer
Most existing transformer-based video instance segmentation methods extract
per-frame features independently, which makes it challenging to handle
appearance deformation. In this paper, we observe that temporal
information is important as well, and we propose TAFormer to aggregate
spatio-temporal features in both the transformer encoder and decoder. Specifically,
in the transformer encoder, we propose a novel spatio-temporal joint multi-scale
deformable attention module which dynamically integrates the spatial and
temporal information to obtain enriched spatio-temporal features. In
the transformer decoder, we introduce a temporal self-attention module to enhance
the frame-level box queries with the temporal relation. Moreover, TAFormer
adopts an instance-level contrastive loss to increase the discriminability of
instance query embeddings, thereby reducing the tracking error caused by visually
similar instances. Experimental results show that TAFormer
effectively leverages the spatial and temporal information to obtain
context-aware feature representation and outperforms state-of-the-art methods.
Optimization-Based Motion Planning for Autonomous Agricultural Vehicles Turning in Constrained Headlands
Headland maneuvering is a crucial aspect of unmanned field operations for
autonomous agricultural vehicles (AAVs). While motion planning for headland
turning in open fields has been extensively studied and integrated into
commercial auto-guidance systems, the existing methods primarily address
scenarios with ample headland space and thus may not work in more constrained
headland geometries. Commercial orchards often contain narrow and irregularly
shaped headlands, which may include static obstacles, rendering the task of
planning a smooth and collision-free turning trajectory difficult. To address
this challenge, we propose an optimization-based motion planning algorithm for
headland turning under geometrical constraints imposed by field geometry and
obstacles
A Study of the Merger History of the Galaxy Group HCG 62 Based on X-Ray Observations and SPH Simulations
We choose the bright compact group HCG 62, which was found to exhibit both
excess X-ray emission and high Fe abundance to the southwest of its core, as an
example to study the impact of mergers on chemical enrichment in the intragroup
medium. We first reanalyze the high-quality Chandra and XMM-Newton archive data
to search for evidence of additional SN II yields, which are expected as a
direct result of the possible merger-induced starburst. We reveal that, similar
to the Fe abundance, the Mg abundance also shows a high value in both the
innermost region and the southwest substructure, forming a high-abundance
plateau; meanwhile, all the SN Ia and SN II yields show rather flat
distributions, in favor of an early enrichment. Then we carry
out a series of idealized numerical simulations to model the collision of two
initially isolated galaxy groups by using the TreePM-SPH GADGET-3 code. We find
that the observed X-ray emission and metal distributions, as well as the
relative positions of the two bright central galaxies with reference to the
X-ray peak, can be well reproduced in a major merger with a mass ratio of 3
when the merger-induced starburst is assumed. The 'best-match' snapshot is
pinpointed after the third pericentric passage when the southwest substructure
is formed due to gas sloshing. By following the evolution of the simulated
merging system, we conclude that the effects of such a major merger on chemical
enrichment are mostly restricted within the core region when the final relaxed
state is reached.
Comment: Accepted for publication in the Astrophysical Journa