Leveraging Graph to Improve Abstractive Multi-Document Summarization
Graphs that capture relations between textual units have great benefits for
detecting salient information from multiple documents and generating overall
coherent summaries. In this paper, we develop a neural abstractive
multi-document summarization (MDS) model which can leverage well-known graph
representations of documents, such as similarity graphs and discourse graphs, to
more effectively process multiple input documents and produce abstractive
summaries. Our model utilizes graphs to encode documents in order to capture
cross-document relations, which is crucial to summarizing long documents. Our
model can also take advantage of graphs to guide the summary generation
process, which is beneficial for generating coherent and concise summaries.
Furthermore, pre-trained language models can be easily combined with our model,
which further improves summarization performance significantly. Empirical
results on the WikiSum and MultiNews datasets show that the proposed
architecture brings substantial improvements over several strong baselines.
Comment: Accepted by ACL202
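The graph-based encoding described above can be pictured as ordinary self-attention whose scores are biased by the edge weights of a document graph. The sketch below is an illustrative simplification under that assumption, not the paper's implementation; all names are invented:

```python
import math

def graph_biased_attention(vecs, graph):
    """Self-attention over paragraph vectors, with scores biased by a
    paragraph similarity graph (illustrative sketch, not the paper's code).

    vecs:  list of n paragraph vectors (each a list of d floats).
    graph: n x n matrix of edge weights in (0, 1]; a larger weight means
           the two paragraphs are more strongly related.
    Returns n graph-aware context vectors.
    """
    n, d = len(vecs), len(vecs[0])
    out = []
    for i in range(n):
        # standard scaled dot-product scores, plus the graph edge as a log-prior
        scores = [
            sum(a * b for a, b in zip(vecs[i], vecs[j])) / math.sqrt(d)
            + math.log(graph[i][j] + 1e-9)
            for j in range(n)
        ]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]  # softmax over graph neighbours
        out.append([sum(w[j] * vecs[j][k] for j in range(n))
                    for k in range(d)])
    return out
```

With a near-diagonal graph, each paragraph attends almost entirely to itself; denser graphs let related paragraphs share information.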
Scientific document summarization via citation contextualization and scientific discourse
The rapid growth of scientific literature has made it difficult for
researchers to quickly learn about the developments in their respective fields.
Scientific document summarization addresses this challenge by providing
summaries of the important contributions of scientific papers. We present a
framework for scientific summarization which takes advantage of the citations
and the scientific discourse structure. Citation texts often lack the evidence
and context to support the content of the cited paper and are sometimes even
inaccurate. We first address the problem of inaccurate citation texts by
finding the relevant context from the cited paper. We propose three approaches
for contextualizing citations which are based on query reformulation, word
embeddings, and supervised learning. We then train a model to identify the
discourse facets for each citation. We finally propose a method for summarizing
scientific papers by leveraging the faceted citations and their corresponding
contexts. We evaluate our proposed method on two scientific summarization
datasets in the biomedical and computational linguistics domains. Extensive
evaluation results show that our methods can improve over the state of the art
by large margins.
Comment: Preprint. The final publication is available at Springer via
http://dx.doi.org/10.1007/s00799-017-0216-8, International Journal on Digital
Libraries (IJDL) 201
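The embedding-based contextualization approach mentioned above can be sketched as ranking the cited paper's sentences by similarity to the citation text. This is one simple, assumed realisation of the idea; the function name, inputs, and scoring are illustrative:

```python
import math

def contextualize_citation(citation_vec, paper_sents, top_k=2):
    """Rank the cited paper's sentences by cosine similarity to the
    citation text, one embedding-based realisation of citation
    contextualization (names and scoring are illustrative assumptions).

    citation_vec: embedding of the citation text (list of floats).
    paper_sents:  list of (sentence_id, embedding) pairs from the cited paper.
    Returns the ids of the top_k most similar sentences.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / max(nu * nv, 1e-9)

    ranked = sorted(paper_sents,
                    key=lambda s: cosine(citation_vec, s[1]),
                    reverse=True)
    return [sid for sid, _ in ranked[:top_k]]
```

The retrieved sentences then serve as the "context" that grounds an otherwise unsupported or inaccurate citation text.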
State of the Art, Evaluation and Recommendations regarding "Document Processing and Visualization Techniques"
Several Networks of Excellence have been set up in the framework of the
European FP5 research program. Among these Networks of Excellence, the NEMIS
project focuses on the field of Text Mining.
Within this field, document processing and visualization was identified as
one of the key topics, and the WG1 working group was created within the NEMIS
project to carry out a detailed survey of techniques associated with the text
mining process and to identify relevant research topics in related research
areas.
In this document we present the results of this comprehensive survey. The
report includes a description of the current state-of-the-art and practice, a
roadmap for follow-up research in the identified areas, and recommendations for
anticipated technological development in the domain of text mining.
Comment: 54 pages, Report of Working Group 1 for the European Network of
Excellence (NoE) in Text Mining and its Applications in Statistics (NEMIS)
Event Identification in Social Networks
Social networks enable users to freely communicate with each other and share
their recent news, ongoing activities or views about different topics. As a
result, they can be seen as a potentially viable source of information to
understand the current emerging topics/events. The ability to model emerging
topics is a substantial step to monitor and summarize the information
originating from social sources. Traditional event detection methods, which
are often designed for processing large, formal, and structured documents, are
less effective here due to the short length, noisiness, and informality of
social posts. Recent event detection techniques address
these challenges by exploiting the opportunities behind abundant information
available in social networks. This article provides an overview of the state of
the art in event detection from social networks.
Comment: It will appear in Encyclopedia with Semantic Computing, to be
published by World Scientific
Hierarchical Transformers for Multi-Document Summarization
In this paper, we develop a neural summarization model which can effectively
process multiple input documents and distill abstractive summaries. We augment
the Transformer architecture with the ability to encode documents in a
hierarchical manner, and represent cross-document relationships via an
attention mechanism that allows information to be shared across documents, as
opposed to simply concatenating text spans and processing them as a flat
sequence. Our model learns latent dependencies among textual units,
but can also take advantage of explicit graph representations focusing on
similarity or discourse relations. Empirical results on the WikiSum dataset
demonstrate that the proposed architecture brings substantial improvements over
several strong baselines.
Comment: to appear at ACL 201
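The hierarchical encoding idea above can be sketched in two levels: pool token vectors into paragraph vectors, then let paragraphs attend to one another rather than flattening everything into one long sequence. This is an illustrative toy, not the paper's architecture:

```python
import math

def hierarchical_encode(paragraphs):
    """Two-level encoding sketch: mean-pool tokens into paragraph
    vectors, then run cross-paragraph attention (illustrative only).

    paragraphs: list of paragraphs, each a list of token vectors.
    Returns one context-aware vector per paragraph.
    """
    # level 1: mean-pool each paragraph's token vectors
    paras = []
    for toks in paragraphs:
        d = len(toks[0])
        paras.append([sum(t[k] for t in toks) / len(toks) for k in range(d)])

    # level 2: scaled dot-product attention across paragraph vectors
    n, d = len(paras), len(paras[0])
    out = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(paras[i], paras[j])) / math.sqrt(d)
                  for j in range(n)]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(w[j] / z * paras[j][k] for j in range(n))
                    for k in range(d)])
    return out
```

The payoff is scale: attention runs over a handful of paragraph vectors instead of thousands of concatenated tokens.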
AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
In the last two decades, the landscape of text generation has undergone
tremendous changes and is being reshaped by the success of deep learning. New
technologies for text generation, ranging from template-based methods to neural
network-based methods, have emerged. Meanwhile, the research objectives have
also shifted from generating smooth and coherent sentences to infusing
personalized traits that enrich the diversity of newly generated content. Given
the rapid development of text generation solutions, a comprehensive survey is
urgently needed to summarize the achievements and track the state of the art.
In this survey paper, we present a general systematic framework, illustrate the
widely used models, and summarize the classic applications of text generation.
Comment: Accepted by IEEE UIC 201
Iterative Document Representation Learning Towards Summarization with Polishing
In this paper, we introduce Iterative Text Summarization (ITS), an
iteration-based model for supervised extractive text summarization, inspired by
the observation that it is often necessary for a human to read an article
multiple times in order to fully understand and summarize its contents. Current
summarization approaches read through a document only once to generate a
document representation, resulting in a sub-optimal representation. To address
this issue, we introduce a model which iteratively polishes the document
representation over multiple passes through the document. As part of our model,
we also introduce a selective reading mechanism that more accurately decides
the extent to which each sentence's representation should be updated.
Experimental results on the CNN/DailyMail and DUC2002 datasets demonstrate that
our model significantly outperforms state-of-the-art extractive systems in both
automatic and human evaluations.
Comment: 10 pages, 4 figures. EMNLP 201
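The polishing idea above can be sketched as a gated update applied over several reading passes: each pass re-reads the sentences and a gate decides how strongly the new read overwrites the running document representation. A toy sketch; the update rule and gating parameters are invented for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def polish(doc_vec, sent_vecs, gate_weights, n_passes=3):
    """Iteratively 'polish' a document representation (toy sketch).

    doc_vec:      current document vector (list of d floats).
    sent_vecs:    sentence vectors (list of lists of d floats).
    gate_weights: 2d weights for the scalar update gate (an assumption).
    """
    d = len(doc_vec)
    for _ in range(n_passes):
        # one reading pass: summarize the sentences seen in this pass
        read = [sum(s[k] for s in sent_vecs) / len(sent_vecs) for k in range(d)]
        # selective update: gate computed from current state and the new read
        g = sigmoid(sum(w * x for w, x in zip(gate_weights, doc_vec + read)))
        doc_vec = [g * r + (1.0 - g) * v for r, v in zip(read, doc_vec)]
    return doc_vec
```

With each pass the representation moves toward what was read, but the gate keeps earlier passes from being discarded wholesale.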
Soft Layer-Specific Multi-Task Summarization with Entailment and Question Generation
An accurate abstractive summary of a document should contain all its salient
information and should be logically entailed by the input document. We improve
these important aspects of abstractive summarization via multi-task learning
with the auxiliary tasks of question generation and entailment generation,
where the former teaches the summarization model how to look for salient
questioning-worthy details, and the latter teaches the model how to rewrite a
summary which is a directed-logical subset of the input document. We also
propose novel multi-task architectures with high-level (semantic)
layer-specific sharing across multiple encoder and decoder layers of the three
tasks, as well as soft-sharing mechanisms (and show performance ablations and
analysis examples of each contribution). Overall, we achieve statistically
significant improvements over the state-of-the-art on both the CNN/DailyMail
and Gigaword datasets, as well as on the DUC-2002 transfer setup. We also
present several quantitative and qualitative analysis studies of our model's
learned saliency and entailment skills.
Comment: ACL 2018 (16 pages)
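The soft-sharing mechanism mentioned above can be sketched generically: instead of hard-tying a layer's weights across tasks, each task keeps its own copy and the training loss adds a penalty on their distance. This is a generic sketch of the mechanism, not the paper's code; the names and coefficient are illustrative:

```python
def soft_sharing_penalty(main_layer, aux_layer, lam=1e-3):
    """Soft parameter sharing between two tasks' corresponding layers:
    penalize the squared L2 distance between their parameters so the
    layers stay close but can diverge where the tasks differ.

    main_layer, aux_layer: flat lists of corresponding parameter values.
    lam: sharing strength (an illustrative hyperparameter).
    """
    return lam * sum((p - q) ** 2 for p, q in zip(main_layer, aux_layer))
```

The penalty is added to each task's loss, so gradient descent pulls the two layers together only as strongly as `lam` dictates.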
Generating Fine-Grained Open Vocabulary Entity Type Descriptions
While large-scale knowledge graphs provide vast amounts of structured facts
about entities, a short textual description can often be useful to succinctly
characterize an entity and its type. Unfortunately, many knowledge graph
entities lack such textual descriptions. In this paper, we introduce a dynamic
memory-based network that generates a short open vocabulary description of an
entity by jointly leveraging induced fact embeddings as well as the dynamic
context of the generated sequence of words. We demonstrate the ability of our
architecture to discern relevant information for more accurate generation of
type descriptions by pitting the system against several strong baselines.
Comment: Published in ACL 201
Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News
Fake news is nowadays an issue of pressing concern, given its recent rise
as a potential threat to high-quality journalism and well-informed public
discourse. The Fake News Challenge (FNC-1) was organized in 2017 to encourage
the development of machine learning-based classification systems for stance
detection (i.e., for identifying whether a particular news article agrees,
disagrees, discusses, or is unrelated to a particular news headline), thus
helping in the detection and analysis of possible instances of fake news. This
article presents a new approach to tackle this stance detection problem, based
on the combination of string similarity features with a deep neural
architecture that leverages ideas previously advanced in the context of
learning efficient text representations, document classification, and natural
language inference. Specifically, we use bi-directional Recurrent Neural
Networks, together with max-pooling over the temporal/sequential dimension and
neural attention, for representing (i) the headline, (ii) the first two
sentences of the news article, and (iii) the entire news article. These
representations are then combined/compared, complemented with similarity
features inspired by other FNC-1 approaches, and passed to a final layer that
predicts the stance of the article towards the headline. We also explore the
use of external sources of information, specifically large datasets of sentence
pairs originally proposed for training and evaluating natural language
inference methods, in order to pre-train specific components of the neural
network architecture (e.g., the RNNs used for encoding sentences). The obtained
results attest to the effectiveness of the proposed ideas and show that our
model, particularly when considering pre-training and the combination of neural
representations together with similarity features, slightly outperforms the
previous state-of-the-art.
Comment: Accepted for publication in the special issue of the ACM Journal of
Data and Information Quality (ACM JDIQ) on Combating Digital Misinformation
and Disinformation
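The combine/compare step described above is commonly realised in the NLI literature by concatenating the two text representations with their element-wise absolute difference and product. The sketch below shows that matching scheme as an assumed illustration, not the paper's exact feature set:

```python
def match_features(headline_vec, body_vec):
    """Combine/compare two text representations NLI-style: both vectors,
    their element-wise absolute difference, and their element-wise
    product are concatenated before the final classification layer
    (illustrative sketch only).
    """
    diff = [abs(h - b) for h, b in zip(headline_vec, body_vec)]
    prod = [h * b for h, b in zip(headline_vec, body_vec)]
    return headline_vec + body_vec + diff + prod  # 4d-dimensional feature
```

The resulting feature vector, optionally augmented with string-similarity features, would feed the stance classifier's final layer.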