
1,663 research outputs found

Auto-Encoding Scene Graphs for Image Captioning

Authors: Cai Jianfei, Tang Kaihua, Yang Xu, Zhang Hanwang
Publication date: 10/12/2018

We propose the Scene Graph Auto-Encoder (SGAE), which incorporates the language inductive bias into the encoder-decoder image captioning framework for more human-like captions. Intuitively, we humans use the inductive bias to compose collocations and make contextual inferences in discourse. For example, when we see the relation 'person on bike', it is natural to replace 'on' with 'ride' and infer 'person riding bike on a road' even if the 'road' is not evident. Therefore, exploiting such bias as a language prior is expected to make conventional encoder-decoder models less likely to overfit to dataset bias and more focused on reasoning. Specifically, we use the scene graph, a directed graph ($\mathcal{G}$) where an object node is connected by adjective nodes and relationship nodes, to represent the complex structural layout of both image ($\mathcal{I}$) and sentence ($\mathcal{S}$). In the textual domain, we use SGAE to learn a dictionary ($\mathcal{D}$) that helps to reconstruct sentences in the $\mathcal{S} \rightarrow \mathcal{G} \rightarrow \mathcal{D} \rightarrow \mathcal{S}$ pipeline, where $\mathcal{D}$ encodes the desired language prior; in the vision-language domain, we use the shared $\mathcal{D}$ to guide the encoder-decoder in the $\mathcal{I} \rightarrow \mathcal{G} \rightarrow \mathcal{D} \rightarrow \mathcal{S}$ pipeline. Thanks to the scene graph representation and shared dictionary, the inductive bias is transferred across domains in principle. We validate the effectiveness of SGAE on the challenging MS-COCO image captioning benchmark; for example, our SGAE-based single model achieves a new state-of-the-art 127.8 CIDEr-D on the Karpathy split, and a competitive 125.5 CIDEr-D (c40) on the official server even compared to other ensemble models.

arXiv.org e-Print Archive
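
The scene graph described in the SGAE abstract above (a directed graph whose object nodes connect to adjective and relationship nodes) can be illustrated with a small data structure. This is only a minimal sketch, not the authors' implementation: the class name `SceneGraph`, its methods, and the edge directions (object to attribute, subject to relationship to object) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical directed scene graph with three node kinds."""
    nodes: dict = field(default_factory=dict)  # node id -> (kind, label)
    edges: list = field(default_factory=list)  # (source id, target id)

    def add_node(self, kind, label):
        node_id = len(self.nodes)
        self.nodes[node_id] = (kind, label)
        return node_id

    def add_attribute(self, obj_id, label):
        # Assumed direction: object -> attribute (adjective) node.
        attr_id = self.add_node("attribute", label)
        self.edges.append((obj_id, attr_id))
        return attr_id

    def add_relationship(self, subj_id, label, obj_id):
        # Assumed direction: subject -> relationship -> object.
        rel_id = self.add_node("relationship", label)
        self.edges.append((subj_id, rel_id))
        self.edges.append((rel_id, obj_id))
        return rel_id

    def triples(self):
        """Return (subject, relationship, object) label triples."""
        result = []
        for rel_id, (kind, label) in self.nodes.items():
            if kind != "relationship":
                continue
            subj = next(s for s, t in self.edges if t == rel_id)
            obj = next(t for s, t in self.edges if s == rel_id)
            result.append((self.nodes[subj][1], label, self.nodes[obj][1]))
        return result


# Build the 'person on bike' example from the abstract.
g = SceneGraph()
person = g.add_node("object", "person")
bike = g.add_node("object", "bike")
g.add_attribute(bike, "red")  # illustrative attribute, not from the abstract
g.add_relationship(person, "on", bike)
print(g.triples())
```

The same structure could hold a graph parsed from a sentence or one detected in an image, which is what lets a shared dictionary operate on both domains in the abstract's description.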

Domain-adaptive Message Passing Graph Neural Network

Authors: Choi Kup-Sze, Pan Shirui, Shen Xiao, Zhou Xi
Publication date: 31/08/2023

Cross-network node classification (CNNC), which aims to classify nodes in a label-deficient target network by transferring knowledge from a source network with abundant labels, has drawn increasing attention recently. To address CNNC, we propose a domain-adaptive message passing graph neural network (DM-GNN), which integrates a graph neural network (GNN) with conditional adversarial domain adaptation. DM-GNN is capable of learning informative representations for node classification that are also transferable across networks. Firstly, a GNN encoder is constructed from dual feature extractors that separate ego-embedding learning from neighbor-embedding learning, so as to jointly capture commonality and discrimination between connected nodes. Secondly, a label propagation node classifier is proposed to refine each node's label prediction by combining its own prediction and its neighbors' predictions. In addition, a label-aware propagation scheme is devised for the labeled source network to promote intra-class propagation while avoiding inter-class propagation, thus yielding label-discriminative source embeddings. Thirdly, conditional adversarial domain adaptation is performed to take the neighborhood-refined class-label information into account during adversarial domain adaptation, so that the class-conditional distributions across networks can be better matched. Comparisons with eleven state-of-the-art methods demonstrate the effectiveness of the proposed DM-GNN.

arXiv.org e-Print Archive
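
The label propagation step in the DM-GNN abstract above (refining each node's prediction by combining its own prediction with its neighbors' predictions) can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the function name `refine_predictions`, the mixing weight `alpha`, and the use of a plain mean over neighbors are all hypothetical choices.

```python
def refine_predictions(probs, neighbors, alpha=0.5):
    """Mix each node's class probabilities with the mean of its neighbors'.

    probs:     list of per-node class-probability lists
    neighbors: dict mapping node index -> list of neighbor indices
    alpha:     assumed weight on the node's own prediction
    """
    refined = []
    for i, own in enumerate(probs):
        nbrs = neighbors.get(i, [])
        if not nbrs:
            # Isolated node: keep its own prediction unchanged.
            refined.append(list(own))
            continue
        n_classes = len(own)
        nbr_mean = [
            sum(probs[j][c] for j in nbrs) / len(nbrs)
            for c in range(n_classes)
        ]
        refined.append(
            [alpha * own[c] + (1 - alpha) * nbr_mean[c] for c in range(n_classes)]
        )
    return refined


# Three mutually connected nodes; node 1 disagrees with its neighbors.
probs = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
refined = refine_predictions(probs, neighbors)
print(refined[1])
```

In this toy graph, node 1 initially leans toward class 1, but after mixing with its two neighbors it leans toward class 0. The label-aware scheme the abstract mentions for the source network would, under the same assumptions, restrict `neighbors` to same-class nodes.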