60 research outputs found
Machine Learning of Generic and User-Focused Summarization
A key problem in text summarization is finding a salience function which
determines what information in the source should be included in the summary.
This paper describes the use of machine learning on a training corpus of
documents and their abstracts to discover salience functions which describe
what combination of features is optimal for a given summarization task. The
method addresses both "generic" and user-focused summaries. Comment: In Proceedings of the Fifteenth National Conference on AI (AAAI-98),
p. 821-82
Question-answering, relevance feedback and summarisation : TREC-9 interactive track report
In this paper we report on the effectiveness of query-biased summaries for a question-answering task. Our summarisation system presents searchers with short summaries of documents, composed of a series of highly matching sentences extracted from the documents. These summaries also serve as input to a query expansion algorithm, which lets us test summaries as evidence for interactive and automatic query expansion.
A study on the use of summaries and summary-based query expansion for a question-answering task
In this paper we report an initial study on the effectiveness of query-biased summaries for a question-answering task. Our summarisation system presents searchers with short summaries of documents. The summaries are composed of a set of sentences that highlight the main points of the document as they relate to the query. These summaries also serve as input to a query expansion algorithm, which lets us test summaries as evidence for interactive and automatic query expansion. We present the results of a set of experiments to test these two approaches and discuss the relative success of these techniques.
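As a rough illustration of what a query-biased extractive summary can look like, the sketch below scores each sentence by how many distinct query terms it contains and keeps the best matches in document order. This is a minimal sketch under stated assumptions, not the system described in the abstract: the function names `tokenize` and `query_biased_summary`, the term-overlap score, and the naive sentence splitter are all hypothetical choices made for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_biased_summary(document, query, max_sentences=2):
    """Return up to max_sentences sentences that best match the query.

    Assumption: a sentence's score is the number of distinct query terms
    it contains. The abstract does not specify the scoring actually used.
    """
    query_terms = set(tokenize(query))
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    scored = []
    for position, sentence in enumerate(sentences):
        overlap = len(query_terms & set(tokenize(sentence)))
        if overlap > 0:
            scored.append((overlap, position, sentence))

    # Highest overlap first; ties are broken by earlier position.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:max_sentences]
    # Present the selected sentences in their original document order.
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]
```

The selected sentences could then be passed to a query expansion step, for example by collecting their most frequent non-query terms, though that step is not sketched here.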
Automatic Generation of Titles for a Corpus of Questions
This paper describes the methodology followed to automatically generate titles for a corpus of
questions that belong to sociological opinion polls. Titles for questions have a twofold function: (1) they are the
input of user searches and (2) they inform about the whole contents of the question and possible answer options.
Thus, generation of titles can be considered as a case of automatic summarization. However, the fact that
summarization had to be performed over very short texts together with the aforementioned quality conditions
imposed on newly generated titles led the authors to follow knowledge-rich and domain-dependent strategies for
summarization, disregarding the more frequent extractive techniques for summarization.
The Effect of the Multi-Layer Text Summarization Model on the Efficiency and Relevancy of the Vector Space-based Information Retrieval
The massive upload of text on the internet creates a huge inverted index in
information retrieval systems, which hurts their efficiency. The purpose of
this research is to measure the effect of the Multi-Layer Similarity model of
automatic text summarization on building an informative and condensed
inverted index in IR systems. To achieve this purpose, we summarized a
considerable number of documents using the Multi-Layer Similarity model, and we
built the inverted index from the automatic summaries that were generated from
this model. A series of experiments was conducted to test the performance in terms
of efficiency and relevancy. The experiments include comparisons with three
existing text summarization models: the Jaccard Coefficient Model, the Vector
Space Model, and the Latent Semantic Analysis model. The experiments examined
three groups of queries with manual and automatic relevancy assessment. The
positive effect of the Multi-Layer Similarity model on the efficiency of the IR
system was clear without noticeable loss in the relevancy results. However, the
evaluation showed that the traditional statistical models without semantic
investigation failed to improve the information retrieval efficiency. Compared
with previous publications that addressed the use of summaries as a source
of the index, the relevancy assessment of our work was higher, and the
Multi-Layer Similarity retrieval constructed an inverted index that was 58%
smaller than the main corpus inverted index.
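To make the idea of indexing summaries instead of full documents concrete, the sketch below builds an inverted index from two versions of a small corpus and reports the relative size reduction. It is a minimal sketch under stated assumptions: index size is measured here as the total number of (term, document) postings, which is an assumption, since the abstract does not say how its 58% figure was measured. The names `build_inverted_index`, `posting_count`, and `size_reduction` are hypothetical.

```python
import re
from collections import defaultdict


def build_inverted_index(texts):
    """Map each term to the set of ids of the texts that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(texts):
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def posting_count(index):
    """Total number of (term, document) postings in the index."""
    return sum(len(postings) for postings in index.values())


def size_reduction(full_texts, summaries):
    """Fraction by which the summary index is smaller than the full index.

    Assumption: size is the posting count, not bytes on disk.
    """
    full_size = posting_count(build_inverted_index(full_texts))
    if full_size == 0:
        raise ValueError("full corpus produced an empty index")
    summary_size = posting_count(build_inverted_index(summaries))
    return 1.0 - summary_size / full_size
```

For example, `size_reduction(["a b c d", "a e"], ["a b", "a"])` compares 6 postings against 3 and returns 0.5, meaning the summary-based index is half the size of the full one.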