Progressively Dual Prior Guided Few-shot Semantic Segmentation
Few-shot semantic segmentation aims to segment query images given only a few
annotated support samples. Current few-shot segmentation methods mainly
leverage foreground information without fully utilizing the rich background
information, which can cause false activation of foreground-like background
regions and poor adaptation to dramatic scene changes between support and
query images. Meanwhile, the lack of a detail mining mechanism can produce
coarse parsing results that miss some semantic components or edge areas, since
prototypes have limited ability to cope with large variance in object
appearance. To tackle these problems, we propose a
progressively dual prior guided few-shot semantic segmentation network.
Specifically, a dual prior mask generation (DPMG) module is first designed to
suppress false activation through foreground-background comparison, treating
the background as auxiliary refinement information. With dual prior masks
refining the location of the foreground area, we further propose a progressive
semantic detail enrichment (PSDE) module, which forces the parsing model to
capture hidden semantic details by iteratively erasing the high-confidence
foreground region and activating details in the remaining region with a
hierarchical structure. Together, DPMG and PSDE form a novel few-shot
segmentation network that can be learned in an end-to-end manner. Comprehensive
experiments on PASCAL-5i and MS COCO demonstrate that our proposed algorithm
achieves great performance.
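The iterative erasing idea described in this abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' PSDE module: `progressive_erase`, `toy_score_fn`, the threshold, and the renormalising scorer are all hypothetical stand-ins for a learned parsing model.

```python
def progressive_erase(score_fn, height, width, steps=2, threshold=0.5):
    """Toy loop: repeatedly erase confident foreground and re-score the rest.

    score_fn is a hypothetical stand-in for a parsing model; it takes the
    current erased mask and returns a height x width grid of scores.
    """
    erased = [[False] * width for _ in range(height)]
    foreground = [[False] * width for _ in range(height)]
    for _ in range(steps):
        scores = score_fn(erased)
        for y in range(height):
            for x in range(width):
                if not erased[y][x] and scores[y][x] >= threshold:
                    foreground[y][x] = True  # accept as foreground
                    erased[y][x] = True      # hide it in later passes
    return foreground


# Assumed toy scorer: fixed base scores, renormalised by the largest score
# among pixels that have not been erased, so weaker details surface later.
BASE = [[0.9, 0.4], [0.3, 0.05]]


def toy_score_fn(erased):
    remaining = [BASE[y][x] for y in range(2) for x in range(2)
                 if not erased[y][x]]
    top = max(remaining) if remaining else 0.0
    if top == 0.0:
        return [[0.0, 0.0], [0.0, 0.0]]
    return [[0.0 if erased[y][x] else BASE[y][x] / top for x in range(2)]
            for y in range(2)]


mask = progressive_erase(toy_score_fn, height=2, width=2, steps=2)
```

In this toy setting the first pass accepts only the strongest pixel; once it is erased, the two mid-strength pixels pass the threshold in the second pass, while the weakest stays background.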
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics have
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.
Comment: Accepted to the Thirty-Fourth AAAI Conference on Artificial
Intelligence (AAAI), 202
Attribute-Graph: A Graph based approach to Image Ranking
We propose a novel image representation, termed Attribute-Graph, to rank
images by their semantic similarity to a given query image. An Attribute-Graph
is an undirected fully connected graph, incorporating both local and global
image characteristics. The graph nodes characterise objects as well as the
overall scene context using mid-level semantic attributes, while the edges
capture the object topology. We demonstrate the effectiveness of
Attribute-Graphs by applying them to the problem of image ranking. We benchmark
the performance of our algorithm on the 'rPascal' and 'rImageNet' datasets,
which we have created in order to evaluate the ranking performance on complex
queries containing multiple objects. Our experimental evaluation shows that
modelling images as Attribute-Graphs results in improved ranking performance
over existing techniques.
Comment: In IEEE International Conference on Computer Vision (ICCV) 201
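As a rough illustration of the structure this abstract describes, an undirected, fully connected graph whose nodes carry attribute scores can be built in a few lines. This is a sketch under assumptions, not the authors' implementation: the function name, the single `"scene"` node placed at the image centre, and the use of Euclidean distance between centres as the edge value are all hypothetical choices.

```python
import math
from itertools import combinations


def build_attribute_graph(objects, scene_attributes, scene_centre=(0.5, 0.5)):
    """Build a toy undirected, fully connected attribute graph.

    objects maps a name to (attribute_scores, (x, y) centre). The extra
    "scene" node and the distance-based edge values are assumptions.
    """
    nodes = {name: attrs for name, (attrs, _) in objects.items()}
    centres = {name: centre for name, (_, centre) in objects.items()}
    nodes["scene"] = scene_attributes
    centres["scene"] = scene_centre
    # frozenset keys make each edge undirected: {a, b} == {b, a}
    edges = {
        frozenset((a, b)): math.dist(centres[a], centres[b])
        for a, b in combinations(nodes, 2)
    }
    return nodes, edges


objects = {
    "dog": ([0.9, 0.1], (0.2, 0.5)),
    "ball": ([0.2, 0.8], (0.8, 0.5)),
}
nodes, edges = build_attribute_graph(objects, scene_attributes=[0.5, 0.5])
```

With n nodes the graph has n(n-1)/2 edges, so the two objects plus the scene node give three edges here.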