Faithful to the Original: Fact Aware Neural Abstractive Summarization
Unlike extractive summarization, abstractive summarization has to fuse
different parts of the source text, which makes it prone to producing fake facts. Our
preliminary study reveals nearly 30% of the outputs from a state-of-the-art
neural summarization system suffer from this problem. While previous
abstractive summarization approaches usually focus on the improvement of
informativeness, we argue that faithfulness is also a vital prerequisite for a
practical abstractive summarization system. To avoid generating fake facts in a
summary, we leverage open information extraction and dependency parse
technologies to extract actual fact descriptions from the source text. The
dual-attention sequence-to-sequence framework is then proposed to force the
generation conditioned on both the source text and the extracted fact
descriptions. Experiments on the Gigaword benchmark dataset demonstrate that
our model can reduce fake summaries by 80%. Notably, the fact
descriptions also bring significant improvement on informativeness since they
often condense the meaning of the source text.
Comment: 8 pages, 3 figures, AAAI 201
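The dual-attention idea above can be sketched in a few lines: the decoder attends separately over the encoded source text and the encoded fact descriptions, and the two context vectors are merged before generation. This is a minimal illustration, not the paper's implementation; the scalar `gate` stands in for whatever learned fusion the model actually uses.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    """Dot-product attention: weights over `keys`, returns a context vector."""
    scores = keys @ query
    weights = softmax(scores)
    return weights @ keys, weights

def dual_attention_context(dec_state, src_enc, fact_enc, gate=0.5):
    """Merge source-text and fact-description contexts.

    `gate` is a hypothetical mixing scalar standing in for a learned
    fusion; the paper's exact combination is not reproduced here.
    """
    ctx_src, _ = attend(dec_state, src_enc)
    ctx_fact, _ = attend(dec_state, fact_enc)
    return gate * ctx_src + (1.0 - gate) * ctx_fact
```

With `gate=1.0` the model degenerates to ordinary single-source attention, which makes the fact channel's contribution easy to ablate.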
TGSum: Build Tweet Guided Multi-Document Summarization Dataset
The development of summarization research has been significantly hampered by
the costly acquisition of reference summaries. This paper proposes an effective
way to automatically collect large scales of news-related multi-document
summaries with reference to social media's reactions. We utilize two types of
social labels in tweets, i.e., hashtags and hyper-links. Hashtags are used to
cluster documents into different topic sets. Also, a tweet with a hyper-link
often highlights certain key points of the corresponding document. We
synthesize a linked document cluster to form a reference summary which can
cover most key points. To this end, we adopt ROUGE metrics to measure the
coverage ratio, and develop an Integer Linear Programming solution to discover
the sentence set reaching the upper bound of ROUGE. Since we allow summary
sentences to be selected from both documents and high-quality tweets, the
generated reference summaries could be abstractive. Both informativeness and
readability of the collected summaries are verified by manual judgment. In
addition, we train a Support Vector Regression summarizer on DUC generic
multi-document summarization benchmarks. With the collected data as extra
training resource, the performance of the summarizer improves substantially on all the
test sets. We release this dataset for further research.
Comment: 7 pages, 1 figure, in AAAI 201
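The sentence-selection step described above can be approximated without an ILP solver: the sketch below greedily picks the sentence set that most increases ROUGE-1 recall against a tweet-derived reference. This is a stand-in for the paper's exact Integer Linear Programming formulation (a greedy heuristic does not guarantee the ROUGE upper bound the ILP finds), and the tokenization is deliberately naive.

```python
from collections import Counter

def rouge1_recall(candidate_tokens, reference_tokens):
    """Fraction of reference unigrams covered by the candidate (clipped counts)."""
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    overlap = sum(min(c, ref[w]) for w, c in cand.items())
    return overlap / max(1, sum(ref.values()))

def greedy_select(sentences, reference_tokens, budget=2):
    """Greedy stand-in for the ILP: repeatedly add the sentence with the
    largest marginal gain in ROUGE-1 recall, up to `budget` sentences."""
    chosen, tokens = [], []
    for _ in range(budget):
        base = rouge1_recall(tokens, reference_tokens)
        best, best_gain = None, 0.0
        for s in sentences:
            if s in chosen:
                continue
            gain = rouge1_recall(tokens + s.split(), reference_tokens) - base
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:  # no sentence adds coverage
            break
        chosen.append(best)
        tokens += best.split()
    return chosen
```

In the paper's setting, candidate sentences would come from both the clustered documents and high-quality tweets, which is what allows the resulting references to be abstractive.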
Inhomogeneous states with checkerboard order in the t-J Model
We study inhomogeneous states in the t-J model using an unrestricted
Gutzwiller approximation. We find that checkerboard order, with a
periodicity set by a doping-dependent number, emerges from Fermi surface
instabilities of both the staggered flux phase and the Fermi liquid state
with realistic band parameters. In both cases, the checkerboard order
develops at wave vectors that are tied to the peaks of the
wave-vector-dependent susceptibility, and is of the Lomer-Rice-Scott type. The
properties of such periodic, inhomogeneous states are discussed in connection
to the checkerboard patterns observed by STM in underdoped cuprates.
Comment: Published Version
Multi-Document Summarization via Discriminative Summary Reranking
Existing multi-document summarization systems usually rely on a specific
summarization model (i.e., a summarization method with a specific parameter
setting) to extract summaries for different document sets with different
topics. However, according to our quantitative analysis, none of the existing
summarization models can always produce high-quality summaries for different
document sets, and even a summarization model with good overall performance may
produce low-quality summaries for some document sets. On the contrary, a
baseline summarization model may produce high-quality summaries for some
document sets. Based on the above observations, we treat the summaries produced
by different summarization models as candidate summaries, and then explore
discriminative reranking techniques to identify high-quality summaries from the
candidates for different document sets. We propose to extract a set of
candidate summaries for each document set based on an ILP framework, and then
leverage Ranking SVM for summary reranking. Various useful features have been
developed for the reranking process, including word-level features,
sentence-level features and summary-level features. Evaluation results on the
benchmark DUC datasets validate the efficacy and robustness of our proposed
approach.
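The reranking step can be illustrated with a pairwise learning-to-rank sketch: each training pair consists of the feature vector of a better summary and that of a worse one, and a linear scorer is trained so the better one scores higher. The perceptron-style update below is a simple stand-in for the paper's Ranking SVM, and the feature vectors are placeholders for the word-, sentence- and summary-level features it describes.

```python
def pairwise_rank_train(pairs, dim, epochs=50, lr=0.1):
    """Learn weights w so that score(better) > score(worse) for each pair.

    `pairs` is a list of (better_features, worse_features) tuples.
    A perceptron-style surrogate for Ranking SVM, not the paper's model.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # misranked pair: move w toward the difference
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

def rerank(candidate_features, w):
    """Sort candidate summaries (as feature vectors) by learned score, best first."""
    score = lambda f: sum(wi * fi for wi, fi in zip(w, f))
    return sorted(candidate_features, key=score, reverse=True)
```

At test time, each document set's candidate summaries (produced by the different base summarization models) are featurized, scored, and the top-ranked one is emitted.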
Real is not True: Backdoor Attacks Against Deepfake Detection
The proliferation of malicious deepfake applications has raised substantial
public concern and cast doubt on the integrity of digital media. Although
capable deepfake detection mechanisms have been developed, they remain
markedly vulnerable to a range of attacks. Notably, existing attacks are
predominantly adversarial example attacks, mounted at test time. In this
study, we introduce Bad-Deepfake, a novel backdoor attack against deepfake
detectors. Our approach manipulates a small subset of the training data to
gain disproportionate influence over the behavior of the trained model. This
manipulation exploits weaknesses inherent to deepfake detectors, allowing us
to engineer triggers and select the most effective samples for constructing
the poisoned set. Combining these techniques, we achieve a remarkable 100%
attack success rate (ASR) against widely used deepfake detectors.
Comment: BigDIA 202
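The core poisoning mechanism can be sketched generically: stamp a fixed trigger into a small fraction of training inputs and relabel them with the attacker's target class, so the trained detector associates the trigger with that class. This is a minimal, hypothetical illustration of training-set backdoor poisoning in general; the sample selection here is random, whereas Bad-Deepfake specifically selects the most effective samples, and the one-pixel trigger is a placeholder.

```python
import copy
import random

def add_trigger(image, value=255):
    """Stamp a fixed patch (the backdoor trigger) into one corner.

    `image` is a nested list of pixel rows; a single corner pixel serves
    as a placeholder trigger here.
    """
    img = copy.deepcopy(image)
    img[0][0] = value
    return img

def poison_dataset(images, labels, target_label, rate=0.05, seed=0):
    """Relabel a small triggered subset so a model trained on the result
    learns the association trigger -> target_label."""
    rng = random.Random(seed)
    n_poison = max(1, int(rate * len(images)))
    idx = set(rng.sample(range(len(images)), n_poison))
    poisoned_x = [add_trigger(im) if i in idx else im for i, im in enumerate(images)]
    poisoned_y = [target_label if i in idx else y for i, y in enumerate(labels)]
    return poisoned_x, poisoned_y, sorted(idx)
```

The attack success rate would then be measured as the fraction of triggered test inputs the trained detector assigns to the target label.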