145 research outputs found

    Supplementing Missing Visions via Dialog for Scene Graph Generations

    Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks. However, the classic task setup rarely considers the challenging, yet common, practical situations where the complete visual data may be inaccessible for various reasons (e.g., restricted view range and occlusions). To this end, we investigate a computer vision task setting with incomplete visual input data. Specifically, we exploit the Scene Graph Generation (SGG) task with various levels of visual data missingness as input. While insufficient visual input intuitively leads to a performance drop, we propose to supplement the missing visions via natural language dialog interactions to better accomplish the task objective. We design a model-agnostic Supplementary Interactive Dialog (SI-Dial) framework that can be jointly learned with most existing models, endowing current AI systems with the ability to engage in question-answer interactions in natural language. Through extensive experiments and analysis, we demonstrate the feasibility of such a task setting with missing visual input and the effectiveness of our proposed dialog module as a supplementary information source, achieving promising performance improvements over multiple baselines. Comment: ICASSP 202

    Towards Robust Video Instance Segmentation with Temporal-Aware Transformer

    Most existing transformer-based video instance segmentation methods extract per-frame features independently, which makes it challenging to solve the appearance deformation problem. In this paper, we observe that temporal information is important as well, and we propose TAFormer to aggregate spatio-temporal features in both the transformer encoder and decoder. Specifically, in the transformer encoder, we propose a novel spatio-temporal joint multi-scale deformable attention module which dynamically integrates the spatial and temporal information to obtain enriched spatio-temporal features. In the transformer decoder, we introduce a temporal self-attention module to enhance the frame-level box queries with the temporal relation. Moreover, TAFormer adopts an instance-level contrastive loss to increase the discriminability of instance query embeddings, thereby reducing the tracking error caused by visually similar instances. Experimental results show that TAFormer effectively leverages the spatial and temporal information to obtain context-aware feature representations and outperforms state-of-the-art methods.

    Optimization-Based Motion Planning for Autonomous Agricultural Vehicles Turning in Constrained Headlands

    Headland maneuvering is a crucial aspect of unmanned field operations for autonomous agricultural vehicles (AAVs). While motion planning for headland turning in open fields has been extensively studied and integrated into commercial auto-guidance systems, the existing methods primarily address scenarios with ample headland space and thus may not work in more constrained headland geometries. Commercial orchards often contain narrow and irregularly shaped headlands, which may include static obstacles, rendering the task of planning a smooth and collision-free turning trajectory difficult. To address this challenge, we propose an optimization-based motion planning algorithm for headland turning under geometrical constraints imposed by field geometry and obstacles.

    A Study of the Merger History of the Galaxy Group HCG 62 Based on X-Ray Observations and SPH Simulations

    We choose the bright compact group HCG 62, which was found to exhibit both excess X-ray emission and high Fe abundance to the southwest of its core, as an example to study the impact of mergers on chemical enrichment in the intragroup medium. We first reanalyze the high-quality Chandra and XMM-Newton archive data to search for evidence of additional SN II yields, which are expected as a direct result of the possible merger-induced starburst. We reveal that, similar to the Fe abundance, the Mg abundance also shows a high value in both the innermost region and the southwest substructure, forming a high-abundance plateau; meanwhile, all the SN Ia and SN II yields show rather flat distributions at $>0.1r_{200}$, in favor of an early enrichment. Then we carry out a series of idealized numerical simulations to model the collision of two initially isolated galaxy groups by using the TreePM-SPH GADGET-3 code. We find that the observed X-ray emission and metal distributions, as well as the relative positions of the two bright central galaxies with reference to the X-ray peak, can be well reproduced in a major merger with a mass ratio of 3 when the merger-induced starburst is assumed. The 'best-match' snapshot is pinpointed after the third pericentric passage, when the southwest substructure is formed due to gas sloshing. By following the evolution of the simulated merging system, we conclude that the effects of such a major merger on chemical enrichment are mostly restricted to the core region when the final relaxed state is reached. Comment: Accepted for publication in the Astrophysical Journal