Text readability and summarisation for non-native reading comprehension
This thesis focuses on two important aspects of non-native reading comprehension: text readability assessment, which estimates the reading difficulty of a given text for L2 learners, and learner summarisation assessment, which evaluates the quality of learner summaries to assess their reading comprehension. We approach both tasks as supervised machine learning problems and present automated assessment systems that achieve state-of-the-art performance.
We first address the task of text readability assessment for L2 learners. One of the major challenges for a data-driven approach to text readability assessment is the lack of sufficiently large level-annotated data aimed at L2 learners. We present a dataset of CEFR-graded texts tailored for L2 learners and investigate a range of linguistic features affecting text readability. We compare text readability measures for native readers and L2 learners and explore methods that use the more plentiful data aimed at native readers to improve L2 readability assessment.
We then present a summarisation task for evaluating non-native reading comprehension and demonstrate an automated summarisation assessment system aimed at evaluating the quality of learner summaries. We propose three novel machine learning approaches to assessing learner summaries. In the first approach, we use several NLP techniques to extract features that measure the content similarity between the reading passage and the summary. In the second approach, we calculate a similarity matrix and apply a convolutional neural network (CNN) model to assess the summary quality using the similarity matrix. In the third approach, we build an end-to-end summarisation assessment model using recurrent neural networks (RNNs). Further, we combine the three approaches into a single system using a parallel ensemble modelling technique. We show that our models outperform traditional approaches that rely on exact word match on the task and that our best model produces quality assessments close to those of professional examiners.
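As a rough illustration of the similarity-matrix idea in the second approach, the sketch below builds a sentence-level matrix between a reading passage and a learner summary. The abstract does not say how the matrix is computed, so bag-of-words cosine similarity, whitespace tokenisation, and every function name here are assumptions made for illustration; the CNN that would consume the matrix is not shown.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercased, whitespace-tokenised term counts (an illustrative simplification)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(passage_sentences, summary_sentences):
    """Rows index passage sentences; columns index summary sentences."""
    passage_vecs = [bag_of_words(s) for s in passage_sentences]
    summary_vecs = [bag_of_words(s) for s in summary_sentences]
    return [[cosine(p, s) for s in summary_vecs] for p in passage_vecs]
```

Each cell lies between 0 and 1, so the matrix can be treated like a single-channel image, which is presumably what makes a convolutional model a natural fit.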
Generating indicative-informative summaries with SumUM
We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step toward exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.
Keyphrase Based Evaluation of Automatic Text Summarization
The development of methods to deal with the informative contents of the text units in the matching process is a major challenge in automatic summary evaluation systems that use fixed n-gram matching. This limitation causes inaccurate matching between units in peer and reference summaries. The present study introduces a new keyphrase-based summary evaluator, KpEval, for evaluating automatic summaries. KpEval relies on keyphrases since they convey the most important concepts of a text. In the evaluation process, the keyphrases are used in their lemma form as the matching text unit. The system was applied to evaluate different summaries of the Arabic multi-document data set presented at TAC2011. The results showed that the new evaluation technique correlates well with the known evaluation systems: Rouge1, Rouge2, RougeSU4, and AutoSummENG MeMoG. KpEval has the strongest correlation with AutoSummENG MeMoG: the Pearson and Spearman correlation coefficients are 0.8840 and 0.9667, respectively.
Comment: 4 pages, 1 figure, 3 tables
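The following is a minimal sketch of the two ingredients the abstract describes: matching on lemmatised keyphrases, and comparing the resulting scores with another metric through Pearson and Spearman correlation. The abstract does not give KpEval's scoring formula, so the recall-style score below is an assumption, keyphrase extraction and lemmatisation are assumed to have been done upstream, and all function names are hypothetical.

```python
import math


def keyphrase_recall(peer_keyphrases, reference_keyphrases):
    """Share of reference keyphrases (already lemmatised) that the peer summary also contains."""
    reference = set(reference_keyphrases)
    if not reference:
        return 0.0
    return len(reference & set(peer_keyphrases)) / len(reference)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

In use, one would score every system summary with `keyphrase_recall`, collect the scores that a second metric assigns to the same summaries, and pass the two lists to `pearson` and `spearman`.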
An exploratory study into automated précis grading
Automated writing evaluation (AWE) is a popular research field, but the main focus has been on evaluating argumentative essays. In this paper, we consider a different genre, namely précis texts. A précis is a written text that provides a coherent summary of the main points of a spoken or written text. We present a corpus of English précis texts which all received a grade assigned by a highly experienced English language teacher and were subsequently annotated following an exhaustive error typology. With this corpus we trained a machine learning model which relies on a number of linguistic, automatic summarization and AWE features. Our results reveal that this model is able to predict the grade of précis texts with only a moderate error margin.
LCSTS: A Large Scale Chinese Short Text Summarization Dataset
Automatic text summarization is widely regarded as a highly difficult problem, partly because of the lack of large text summarization datasets. Given the great challenge of constructing large-scale summaries for full text, in this paper we introduce a large-scale Chinese short text summarization dataset constructed from the Chinese microblogging website Sina Weibo, which is released to the public (http://icrc.hitsz.edu.cn/Article/show/139.html). This corpus consists of over 2 million real Chinese short texts with short summaries given by the author of each text. We also manually tagged the relevance of 10,666 short summaries with their corresponding short texts. Based on the corpus, we apply a recurrent neural network to summary generation and achieve promising results, which not only show the usefulness of the proposed corpus for short text summarization research, but also provide a baseline for further research on this topic.
Comment: Recently, we received feedback from Yuya Taguchi from NAIST in Japan and Qian Chen from USTC of China that the results in the EMNLP2015 version seem to be underrated. We carefully checked our results and found that we had made a mistake while using the standard ROUGE. We then re-evaluated all methods in the paper and obtained the corrected results listed in Table 2 of this version.
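For readers unfamiliar with the metric behind that correction, the sketch below shows the core of ROUGE-N recall: clipped n-gram overlap divided by the number of n-grams in the reference. It is a simplified illustration and not the standard ROUGE toolkit the authors used; tokenisation is left to the caller (word-level or character-level, either is an assumption here), and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate_tokens, reference_tokens, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    reference = ngrams(reference_tokens, n)
    candidate = ngrams(candidate_tokens, n)
    total = sum(reference.values())
    if total == 0:
        return 0.0
    # Clip each match so a candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, candidate[gram]) for gram, count in reference.items())
    return overlap / total
```

Because the score depends on how both texts are split into tokens, small differences in preprocessing can shift reported values noticeably.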