An Annotation Scheme of A Large-scale Multi-party Dialogues Dataset for Discourse Parsing and Machine Comprehension
In this paper, we propose a scheme for annotating large-scale multi-party
chat dialogues for discourse parsing and machine comprehension. The main goal
of this project is to help understand multi-party dialogues. Our dataset is
based on the Ubuntu Chat Corpus. For each multi-party dialogue, we annotate
the discourse structure and question-answer pairs. To the best of our
knowledge, this is the first large-scale corpus for discourse parsing of
multi-party dialogues, and we are the first to propose the task of machine
reading comprehension for multi-party dialogues.
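As an illustration only, one annotated dialogue might be represented in
Python as follows; the abstract does not specify a storage format, so every
field name and value below is a hypothetical assumption, not the dataset's
actual schema:

    # Hypothetical record for one annotated multi-party dialogue; the layout
    # and field names are illustrative assumptions, not the paper's format.
    annotated_dialogue = {
        "utterances": [
            {"id": 0, "speaker": "userA", "text": "How do I mount a USB drive?"},
            {"id": 1, "speaker": "userB", "text": "Plug it in, then check dmesg."},
            {"id": 2, "speaker": "userC", "text": "Or use the file manager."},
        ],
        # Discourse structure: (head utterance, dependent utterance, relation)
        "discourse_relations": [
            (0, 1, "Question-Answer-Pair"),
            (0, 2, "Question-Answer-Pair"),
        ],
        # Machine-reading-comprehension annotation: a question whose answer
        # is grounded in a specific utterance of the dialogue.
        "qa_pairs": [
            {"question": "Which command shows the detected drive?",
             "answer": "dmesg", "answer_utterance": 1},
        ],
    }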
Improving Online Forums Summarization via Unifying Hierarchical Attention Networks with Convolutional Neural Networks
Online discussion forums are prevalent and easily accessible, allowing
people to share ideas and opinions by posting messages in discussion
threads. Forum threads that grow significantly in length can become difficult
for participants, both newcomers and existing members, to grasp the main
ideas. This study
aims to create an automatic text summarizer for online forums to mitigate this
problem. We present a framework based on hierarchical attention networks,
unifying Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional
Neural Network (CNN) to build sentence and thread representations for the forum
summarization. In this scheme, the Bi-LSTM derives a representation that
incorporates information from the whole sentence and the whole thread,
whereas the CNN recognizes high-level patterns of dominant units with respect
to the sentence and thread context. The attention mechanism is applied on top
of the CNN to further highlight the high-level representations that capture
the important units contributing to a desirable summary.
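As a rough sketch of such an encoder, the following PyTorch module chains a
Bi-LSTM, a CNN, and an attention layer into a sentence representation; all
hyperparameters, names, and dimensions are illustrative assumptions, not the
paper's actual implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SentenceEncoder(nn.Module):
        """Bi-LSTM -> CNN -> attention sentence encoder: a minimal sketch of
        the hierarchical scheme described above. Kernel size and dimensions
        are illustrative assumptions, not the paper's values."""

        def __init__(self, vocab_size, emb_dim=100, hidden_dim=100,
                     n_filters=100, kernel_size=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Bi-LSTM gives each word position whole-sentence context.
            self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                  bidirectional=True)
            # CNN over the Bi-LSTM states detects high-level local patterns.
            self.conv = nn.Conv1d(2 * hidden_dim, n_filters, kernel_size,
                                  padding=kernel_size // 2)
            # Attention scores each convolutional feature vector.
            self.attn = nn.Linear(n_filters, 1)

        def forward(self, tokens):                    # (batch, seq_len)
            x = self.embed(tokens)                    # (batch, seq_len, emb)
            h, _ = self.bilstm(x)                     # (batch, seq_len, 2*hid)
            c = F.relu(self.conv(h.transpose(1, 2)))  # (batch, filt, seq_len)
            c = c.transpose(1, 2)                     # (batch, seq_len, filt)
            weights = torch.softmax(self.attn(c).squeeze(-1), dim=1)
            # Attention-weighted sum of CNN features yields the sentence
            # vector; a thread-level encoder would repeat this same pattern
            # over the resulting sentence vectors.
            return torch.bmm(weights.unsqueeze(1), c).squeeze(1)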
Extensive performance evaluation based on three datasets, two of which are
real-life online forums and one of which is a news dataset, reveals that the
proposed model outperforms several competitive baselines.
Comment: 27 pages, 7 figures