Interpretable End-to-End Driving Model for Implicit Scene Understanding
Driving scene understanding aims to obtain comprehensive scene information
from sensor data and to provide a basis for downstream tasks, which is
indispensable for the safety of self-driving vehicles. Specific perception
tasks, such as object detection and scene graph generation, are commonly used.
However, the results of these tasks amount only to samples drawn from the
high-dimensional scene features and are not sufficient to
represent the scenario. In addition, the goal of perception tasks is
inconsistent with human driving, which focuses only on what may affect the
ego-trajectory. Therefore, we propose an end-to-end Interpretable Implicit
Driving Scene Understanding (II-DSU) model to extract implicit high-dimensional
scene features as scene understanding results guided by a planning module and
to validate the plausibility of scene understanding using auxiliary perception
tasks for visualization. Experimental results on CARLA benchmarks show that our
approach achieves a new state of the art and is able to obtain scene features
that embody richer scene information relevant to driving, enabling superior
performance of the downstream planning.
Comment: Accepted by the 26th IEEE International Conference on Intelligent
Transportation Systems (ITSC 2023)
Semantic Scene Graph Generation Based on an Edge Dual Scene Graph and Message Passing Neural Network
Along with generative AI, interest in scene graph generation (SGG), which
comprehensively captures the relationships and interactions between objects in
an image and creates a structured graph-based representation, has significantly
increased in recent years. However, relying on object-centric and dichotomous
relationships, existing SGG methods have a limited ability to accurately
predict detailed relationships. To solve these problems, a new approach to
modeling multiobject relationships, called edge dual scene graph generation
(EdgeSGG), is proposed herein. EdgeSGG is based on an edge dual scene graph and a
Dual Message Passing Neural Network (DualMPNN), which can capture rich
contextual interactions between unconstrained objects. To facilitate the
learning of edge dual scene graphs with a symmetric graph structure, the
proposed DualMPNN learns both object- and relation-centric features for more
accurately predicting relation-aware contexts and allows fine-grained
relational updates between objects. A comparative experiment with
state-of-the-art (SoTA) methods was conducted using two public datasets for SGG
operations and six metrics for three subtasks. Compared with SoTA approaches,
the proposed model exhibited substantial performance improvements across all
SGG subtasks. Furthermore, experiments on long-tail distributions revealed that
incorporating the relationships between objects effectively mitigates existing
long-tail problems.
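As a rough illustration of the edge dual idea in the abstract above, the sketch below turns each relation of a scene graph into a node and links two such nodes when their relations share an object. It assumes the edge dual is the standard line-graph construction; the abstract does not spell out the construction, and the function name `edge_dual` and the toy scene are hypothetical.

```python
from itertools import combinations


def edge_dual(relations):
    """Build an undirected line graph over relation triples.

    Each (subject, predicate, object) triple becomes one dual node,
    identified by its index. Two dual nodes are linked when their
    triples share at least one object. This is an assumed, textbook
    construction, not the paper's implementation.
    """
    dual_edges = set()
    for a, b in combinations(range(len(relations)), 2):
        subj_a, _, obj_a = relations[a]
        subj_b, _, obj_b = relations[b]
        if {subj_a, obj_a} & {subj_b, obj_b}:
            dual_edges.add((a, b))
    return dual_edges


# Toy scene graph (hypothetical data).
scene = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
    ("horse", "on", "grass"),
]
dual = edge_dual(scene)
```

Here relation 0 is linked to relations 1 and 2 (through "man" and "horse"), while relations 1 and 2 share no object and stay unlinked. Message passing over such a graph would update relation features directly, which is one way to read "relation-centric" in the abstract.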
Knowledge Graph Transfer Network for Few-Shot Recognition
Few-shot learning aims to learn novel categories from very few samples given
some base categories with sufficient training samples. The main challenge of
this task is that the novel categories are prone to being dominated by the
color, texture, or shape of the object or by the background context (namely
specificity), which are distinct for the given few training samples but not
common for the corresponding categories (see Figure 1). Fortunately, we find
that transferring information of the correlated base categories can help learn
the novel concepts and thus avoid the novel concept being dominated by the
specificity.
Besides, incorporating semantic correlations among different categories can
effectively regularize this information transfer. In this work, we represent
the semantic correlations in the form of a structured knowledge graph and
integrate this graph into deep neural networks to promote few-shot learning by
a novel Knowledge Graph Transfer Network (KGTN). Specifically, by initializing
each node with the classifier weight of the corresponding category, a
propagation mechanism is learned to adaptively propagate node messages through
the graph to explore node interaction and transfer classifier information of
the base categories to those of the novel ones. Extensive experiments on the
ImageNet dataset show significant performance improvement compared with current
leading competitors. Furthermore, we construct an ImageNet-6K dataset that
covers larger-scale categories, i.e., 6,000 categories, and experiments on this
dataset further demonstrate the effectiveness of our proposed model.
Comment: Accepted by AAAI 2020 as an oral paper
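The propagation idea in the KGTN abstract can be sketched as follows: each graph node starts from a category's classifier weight vector, and repeated message passing lets novel categories borrow information from correlated base categories. This is only a sketch under stated assumptions: a fixed weighted-averaging rule stands in for the learned propagation mechanism, and `propagate`, `alpha`, `steps`, and the toy graph are hypothetical names and values, not taken from the paper.

```python
def propagate(weights, adjacency, steps=1, alpha=0.5):
    """Mix each node's vector with the weighted mean of its neighbors.

    weights:   one classifier weight vector per category (graph node).
    adjacency: adjacency[i][j] is the semantic correlation used when
               node i receives a message from node j.
    alpha:     assumed fixed mixing rate; the paper learns its update.
    """
    n = len(weights)
    state = [list(w) for w in weights]
    for _ in range(steps):
        new_state = []
        for i in range(n):
            total = sum(adjacency[i])
            if total == 0:
                # No incoming correlations: keep the node unchanged.
                new_state.append(list(state[i]))
                continue
            dim = len(state[i])
            message = [
                sum(adjacency[i][j] * state[j][d] for j in range(n)) / total
                for d in range(dim)
            ]
            new_state.append(
                [(1 - alpha) * s + alpha * m for s, m in zip(state[i], message)]
            )
        state = new_state
    return state


# Two base categories with trained weights, one novel category at zero.
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
# The novel category (node 2) is correlated only with base category 0.
adjacency = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]
result = propagate(weights, adjacency)
```

After one step the novel category's weights move halfway toward those of its correlated base category, while the base categories are left untouched because they receive no messages in this toy graph.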