Learning to Extract Coherent Summary via Deep Reinforcement Learning
Coherence plays a critical role in producing a high-quality summary from a
document. In recent years, neural extractive summarization has become
increasingly attractive. However, most existing models ignore the coherence of
summaries when extracting sentences. As an effort towards extracting coherent
summaries, we propose a neural coherence model to capture the cross-sentence
semantic and syntactic coherence patterns. The proposed neural coherence model
obviates the need for feature engineering and can be trained in an end-to-end
fashion using unlabeled data. Empirical results show that the proposed neural
coherence model can efficiently capture the cross-sentence coherence patterns.
Using the combined output of the neural coherence model and ROUGE package as
the reward, we design a reinforcement learning method to train the proposed
neural extractive summarizer, named the Reinforced Neural Extractive
Summarization (RNES) model. The RNES model learns to optimize coherence and
informative importance of the summary simultaneously. Experimental results show
that the proposed RNES outperforms existing baselines and achieves
state-of-the-art performance in terms of ROUGE on the CNN/Daily Mail dataset. The
qualitative evaluation indicates that summaries produced by RNES are more
coherent and readable.
Comment: 8 pages, 1 figure, presented at AAAI-201
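The abstract above describes a reward built from the combined output of a coherence model and ROUGE. A minimal sketch of one way such a reward could be combined is given below. It is not the authors' implementation: `rouge1_f` is a simplified unigram-overlap stand-in for the ROUGE package, `coherence_fn` is a hypothetical callable standing in for the neural coherence model, and the additive form with a `weight` parameter is an assumption made for illustration.

```python
from collections import Counter
from typing import Callable, List


def rouge1_f(candidate: List[str], reference: List[str]) -> float:
    """Unigram-overlap F1; a simplified stand-in for the ROUGE package."""
    if not candidate or not reference:
        return 0.0
    # Clipped unigram matches between candidate and reference.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def combined_reward(
    summary_sents: List[str],
    reference: str,
    coherence_fn: Callable[[str, str], float],  # hypothetical coherence scorer
    weight: float = 0.5,                        # assumed mixing weight
) -> float:
    """Reward = ROUGE-like score + weight * mean adjacent-sentence coherence."""
    tokens = [tok for sent in summary_sents for tok in sent.split()]
    rouge = rouge1_f(tokens, reference.split())
    pairs = list(zip(summary_sents, summary_sents[1:]))
    if pairs:
        coherence = sum(coherence_fn(a, b) for a, b in pairs) / len(pairs)
    else:
        coherence = 0.0  # a single sentence has no adjacent pair to score
    return rouge + weight * coherence
```

In a reinforcement learning setup, a scalar of this kind would be computed for each sampled extractive summary and used as the episode return; how the two terms are actually weighted in RNES is not stated in the abstract.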
LCSTS: A Large Scale Chinese Short Text Summarization Dataset
Automatic text summarization is widely regarded as a highly difficult
problem, partly because of the lack of large text summarization datasets.
Because constructing large-scale summaries for full texts is very challenging,
in this paper we introduce a large-scale Chinese short text
summarization dataset constructed from the Chinese microblogging website Sina
Weibo, which is released to the public
{http://icrc.hitsz.edu.cn/Article/show/139.html}. This corpus consists of over
2 million real Chinese short texts with short summaries given by the author of
each text. We also manually tagged the relevance of 10,666 short summaries with
their corresponding short texts. Based on the corpus, we apply a recurrent
neural network to summary generation and achieve promising results, which
not only show the usefulness of the proposed corpus for short text
summarization research, but also provide a baseline for further research on
this topic.
Comment: Recently, we received feedback from Yuya Taguchi from NAIST in Japan
and Qian Chen from USTC of China that the results in the EMNLP2015 version
seem to be underrated. We therefore carefully checked our results and found that
we had made a mistake while using the standard ROUGE. We then re-evaluated all
methods in the paper and obtained the corrected results listed in Table 2 of this
versio
First-principles LDA+U and GGA+U study of neptunium dioxide
We have performed a systematic first-principles investigation to calculate
the electronic structures, mechanical properties, and phonon dispersion curves
of NpO$_2$. The local density approximation+\emph{U} and the generalized gradient
approximation+\emph{U} formalisms have been used to account for the strong on-site
Coulomb repulsion among the localized Np electrons. By choosing the
Hubbard \emph{U} parameter around 4 eV, the orbital occupancy characters of Np
5\emph{f} and O 2\emph{p} are in good agreement with recent experiments [J.
Nucl. Mater. \textbf{389}, 470 (2009)]. Compared with our previous study of
ThO$_2$, we note that stronger covalency exists in NpO$_2$ due to the more
localized behavior of the 5\emph{f} electrons of Np, in line with the
localization-delocalization trend exhibited by the actinide series.
Comment: 7 pages, 6 figure
Answer Sequence Learning with Neural Networks for Answer Selection in Community Question Answering
In this paper, the answer selection problem in community question answering
(CQA) is regarded as an answer sequence labeling task, and a novel approach is
proposed based on the recurrent architecture for this problem. Our approach
first applies convolutional neural networks (CNNs) to learn the joint representation
of each question-answer pair, and then uses the joint representation as
input to a long short-term memory (LSTM) network to learn the answer sequence of a
question for labeling the matching quality of each answer. Experiments
conducted on the SemEval 2015 CQA dataset show the effectiveness of our
approach.
Comment: 6 page
Prompt-based Text Entailment for Low-Resource Named Entity Recognition
Pre-trained Language Models (PLMs) have been applied to NLP tasks and have achieved
promising results. Nevertheless, the fine-tuning procedure needs labeled data
of the target domain, making it difficult to learn in low-resource and
non-trivial labeled scenarios. To address these challenges, we propose
Prompt-based Text Entailment (PTE) for low-resource named entity recognition,
which better leverages knowledge in the PLMs. We first reformulate named entity
recognition as the text entailment task. The original sentence with entity
type-specific prompts is fed into PLMs to get entailment scores for each
candidate. The entity type with the top score is then selected as the final label.
Then, we inject tagging labels into prompts and treat words as basic units
instead of n-gram spans to reduce time complexity in generating candidates by
n-gram enumeration. Experimental results demonstrate that the proposed method
PTE achieves competitive performance on the CoNLL03 dataset and outperforms
fine-tuned counterparts on the MIT Movie and Few-NERD datasets in low-resource
settings.
Comment: COLING 202
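The entailment-style labeling described in the abstract above (pair each word with type-specific prompts, score each prompt, keep the top-scoring type) can be sketched roughly as follows. This is an illustration under stated assumptions, not the PTE implementation: the prompt template, the `ENTITY_TYPES` list, and the `score_fn` callable (a stand-in for a PLM that returns an entailment score) are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical label set; "none" marks words that are not entities.
ENTITY_TYPES = ["person", "location", "organization", "none"]


def build_prompts(sentence: str, word: str, types: List[str]) -> Dict[str, str]:
    """Pair the sentence with one type-specific hypothesis per entity type.

    The template below is an assumption made for illustration.
    """
    return {t: f"{sentence} [SEP] {word} is a {t} entity." for t in types}


def label_words(
    sentence: str,
    score_fn: Callable[[str], float],  # hypothetical entailment scorer (e.g. a PLM)
    types: List[str] = ENTITY_TYPES,
) -> List[str]:
    """Treat each word as a candidate and pick the type with the top score."""
    labels = []
    for word in sentence.split():
        prompts = build_prompts(sentence, word, types)
        scores = {t: score_fn(p) for t, p in prompts.items()}
        labels.append(max(scores, key=scores.get))
    return labels
```

Scoring words rather than enumerated n-gram spans keeps the number of candidates linear in sentence length, which is the complexity reduction the abstract refers to.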
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Calibration strengthens the trustworthiness of black-box models by producing
more accurate confidence estimates on given examples. However, little is
known about whether model explanations can help confidence calibration. Intuitively,
humans look at important feature attributions and decide whether the model is
trustworthy. Similarly, the explanations can tell us when the model may or may
not know. Inspired by this, we propose a method named CME that leverages model
explanations to make the model less confident with non-inductive attributions.
The idea is that when the model is not highly confident, it is difficult to
identify strong indications of any class, and the tokens accordingly do not
have high attribution scores for any class and vice versa. We conduct extensive
experiments on six datasets with two popular pre-trained language models in the
in-domain and out-of-domain settings. The results show that CME improves
calibration performance in all settings. The expected calibration errors are
further reduced when combined with temperature scaling. Our findings highlight
that model explanations can help calibrate posterior estimates.
Comment: EMNLP 202
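The abstract above reports expected calibration error (ECE) and mentions temperature scaling. The sketch below shows the standard forms of both as commonly defined, not the CME method itself. The equal-width binning with `n_bins=10` and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature (T > 1 softens the output)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed number of equal-width confidence bins
) -> float:
    """Weighted average of |accuracy - mean confidence| over confidence bins."""
    n = len(confidences)
    if n == 0:
        return 0.0
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

In practice the temperature is a single scalar fitted on held-out data; the confidences passed to `expected_calibration_error` would then be the maximum of `softmax(logits, temperature)` for each example.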