Integrated Node Encoder for Labelled Textual Networks
Numerous works have explored content-enhanced network embedding models, but
little attention has been paid to the labelled information of nodes. Although
TriDNR leverages node labels by treating them as node attributes, it fails to
enrich unlabelled node vectors with the labelled information, which leads to
weaker classification results on the test set than those of existing
unsupervised textual network embedding models. In this study, we
design an integrated node encoder (INE) for textual networks which is jointly
trained on the structure-based and label-based objectives. As a result, the
node encoder preserves the integrated knowledge of not only the network text
and structure, but also the labelled information. Furthermore, INE allows the
creation of label-enhanced vectors for unlabelled nodes by entering their node
contents. Our node embedding achieves state-of-the-art performance in the
classification task on two public citation networks, namely Cora and DBLP,
pushing benchmarks up by 10.0% and 12.1%, respectively, at the 70% training
ratio. Additionally, we propose a feasible solution that generalizes our model
from textual networks to a broader range of networks.
Comment: 7 pages
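The joint training described above can be sketched in a few lines: a single shared encoder is optimized against a weighted combination of a structure-based loss and a label-based loss, so that even unlabelled nodes, encoded from their text alone, inherit label information. This is a minimal illustrative sketch, not the paper's actual architecture; the bag-of-words encoder, the function names, and the weighting scheme are all assumptions.

```python
from collections import Counter

def encode(node_text, vocab):
    """Hypothetical shared encoder: bag-of-words vector over a fixed vocab.
    After joint training, this same encoder maps an unlabelled node's text
    to a label-enhanced vector (sketch only; INE uses a neural encoder)."""
    counts = Counter(node_text.lower().split())
    return [float(counts[w]) for w in vocab]

def joint_loss(structure_loss, label_loss, lam=0.5):
    """Combine the structure-based and label-based objectives.
    `lam` is an assumed trade-off hyperparameter."""
    return structure_loss + lam * label_loss
```

A training step would backpropagate `joint_loss` through the shared encoder, so both signals shape one and the same node representation.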
Dialogue history integration into end-to-end signal-to-concept spoken language understanding systems
This work investigates the embeddings for representing dialog history in
spoken language understanding (SLU) systems. We focus on the scenario when the
semantic information is extracted directly from the speech signal by means of a
single end-to-end neural network model. We propose to integrate dialogue
history into an end-to-end signal-to-concept SLU system. The dialog history is
represented in the form of dialog history embedding vectors (so-called
h-vectors) and is provided as additional information to end-to-end SLU models
in order to improve system performance. The following three types of h-vectors
are proposed and experimentally evaluated in this paper: (1)
supervised-all embeddings predicting bag-of-concepts expected in the answer of
the user from the last dialog system response; (2) supervised-freq embeddings
focusing on predicting only a selected set of semantic concepts (corresponding
to the most frequent errors in our experiments); and (3) unsupervised
embeddings. Experiments on the MEDIA corpus for the semantic slot filling task
demonstrate that the proposed h-vectors improve the model performance.
Comment: Accepted for ICASSP 2020 (Submitted: October 21, 2019)
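The supervised h-vector construction can be sketched as a multi-hot bag-of-concepts vector that is appended to the model's input features. The function names and the simple frame-wise concatenation are illustrative assumptions; the paper's actual integration into the end-to-end SLU model may differ.

```python
def bag_of_concepts(expected_concepts, concept_inventory):
    """Supervised h-vector sketch: multi-hot vector marking which semantic
    concepts are expected in the user's answer to the last system response."""
    expected = set(expected_concepts)
    return [1.0 if c in expected else 0.0 for c in concept_inventory]

def augment_frames(frames, h_vector):
    """Provide the dialog-history embedding as additional information by
    appending it to every input feature frame (one assumed integration)."""
    return [frame + h_vector for frame in frames]
```

For the supervised-freq variant, `concept_inventory` would be restricted to the selected set of frequently erroneous concepts.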
Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes
High-quality dialogue-summary paired data is expensive to produce and
domain-sensitive, making abstractive dialogue summarization a challenging task.
In this work, we propose the first unsupervised abstractive dialogue
summarization model for tete-a-tetes (SuTaT). Unlike standard text
summarization, a dialogue summarization method should consider the
multi-speaker scenario where the speakers have different roles, goals, and
language styles. In a tete-a-tete, such as a customer-agent conversation, SuTaT
aims to summarize for each speaker by modeling the customer utterances and the
agent utterances separately while retaining their correlations. SuTaT consists
of a conditional generative module and two unsupervised summarization modules.
The conditional generative module contains two encoders and two decoders in a
variational autoencoder framework where the dependencies between two latent
spaces are captured. With the same encoders and decoders, two unsupervised
summarization modules equipped with sentence-level self-attention mechanisms
generate summaries without using any annotations. Experimental results show
that SuTaT is superior on unsupervised dialogue summarization in both
automatic and human evaluations, and is capable of dialogue classification and
single-turn conversation generation.
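The dependency between the two latent spaces in the conditional generative module can be caricatured as sampling the customer latent first and conditioning the agent latent on it. Everything here, including the linear dependency and the parameter names, is an illustrative assumption rather than the paper's parameterization.

```python
import random

def sample_latents(mu_c, sigma_c, dep_weight, sigma_a=1.0):
    """Toy sketch: draw the customer latent z_c, then condition the agent
    latent z_a on z_c via an assumed linear dependency (dep_weight * z_c)."""
    z_c = random.gauss(mu_c, sigma_c)               # customer latent
    z_a = random.gauss(dep_weight * z_c, sigma_a)   # agent latent, tied to z_c
    return z_c, z_a
```

In the full model, each latent would feed its own decoder, so the agent-side summary remains correlated with, yet separate from, the customer-side one.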