Ocular-based automatic summarization of documents: is re-reading informative about the importance of a sentence?
Automatic document summarization (ADS) has been introduced as a viable solution for reducing the time and effort needed to read the ever-increasing volume of textual content being disseminated. However, a successful universal ADS algorithm has not yet been developed. Moreover, despite progress in the field, many ADS techniques do not take into account the needs of different readers; the resulting summaries lack internal consistency, so readers must go back to the original document. The present study investigated whether eye tracking can be used to increase the quality of ADS. The general idea was to find ocular behavioural indicators that could be easily implemented in ADS algorithms. For instance, the time spent re-reading a sentence might reflect the relative importance of that sentence, thus providing a hint for selecting the text that contributes to the summary. We tested this hypothesis by comparing metrics based on the eye movements of 30 readers with the highlights they made afterward. Results showed that the time spent reading a sentence was not significantly related to its subjective value, which did not support our hypothesis. Results also showed that sentence length is an unavoidable confound, because longer sentences are both the most likely to contain units of text judged as important and the ones that receive more fixations and re-fixations.
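The sentence-length confound described in this abstract can be illustrated with a partial correlation. The sketch below is not the authors' analysis: the function names and the four data points are hypothetical, chosen only to show how a strong raw correlation between re-reading time and highlighting can disappear once sentence length is controlled for.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical per-sentence data (illustrative only, not from the study).
lengths = [10.0, 20.0, 30.0, 40.0]      # sentence length in words
reread = [11.0, 19.0, 29.0, 41.0]       # re-reading time, arbitrary units
highlights = [11.0, 17.0, 33.0, 39.0]   # amount of highlighted text

print("raw correlation:", pearson(reread, highlights))
print("controlling for length:", partial_corr(reread, highlights, lengths))
```

In this toy data both measures grow with sentence length, so they look strongly related until length is partialled out.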
Effect of screen presentation on text reading and revising. International Journal of Human-Computer Studies
Two studies using the methods of experimental psychology assessed the effects of two types of text presentation (page-by-page vs. scrolling) on participants' performance while reading and revising texts. The page-by-page presentation had greater facilitative effects in both tasks. In the reading task, participants' performance indicated that they built a better mental representation of the text as a whole, were better at locating relevant information, and were better at remembering the main ideas. In the revising task, their performance showed a larger number of global corrections (which are the most difficult to make).
Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application
We present two novel models of document coherence and their application to
information retrieval (IR). Both models approximate document coherence using
discourse entities, e.g. the subject or object of a sentence. Our first model
views text as a Markov process generating sequences of discourse entities
(entity n-grams); we use the entropy of these entity n-grams to approximate the
rate at which new information appears in text, reasoning that as more new words
appear, the topic increasingly drifts and text coherence decreases. Our second
model extends the work of Guinaudeau & Strube [28] that represents text as a
graph of discourse entities, linked by different relations, such as their
distance or adjacency in text. We use several graph topology metrics to
approximate different aspects of the discourse flow that can indicate
coherence, such as the average clustering or betweenness of discourse entities
in text. Experiments with several instantiations of these models show that: (i)
our models perform on a par with two other well-known models of text coherence
even without any parameter tuning, and (ii) reranking retrieval results
according to their coherence scores gives notable performance gains, confirming
a relation between document coherence and relevance. This work contributes two
novel models of document coherence, the application of which to IR complements
recent work in the integration of document cohesiveness or comprehensibility to
ranking [5, 56].
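The entropy idea behind the first model can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the discourse entities have already been extracted into a flat list of strings, and it uses the Shannon entropy of the empirical entity n-gram distribution as a stand-in for whatever estimator the paper uses. The function names and example entity sequences are hypothetical.

```python
from collections import Counter
from math import log2


def entity_ngrams(entities, n=2):
    """Return the sequence of consecutive entity n-grams as tuples."""
    return [tuple(entities[i:i + n]) for i in range(len(entities) - n + 1)]


def ngram_entropy(entities, n=2):
    """Shannon entropy (in bits) of the empirical entity n-gram distribution.

    Higher values mean new entity sequences keep appearing, which this sketch
    treats as a rough signal of topic drift and lower coherence.
    """
    grams = entity_ngrams(entities, n)
    if not grams:
        return 0.0
    total = len(grams)
    counts = Counter(grams)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical entity sequences: one keeps returning to the same entities,
# the other introduces a new entity at every step.
focused = ["cat", "mat", "cat", "mat", "cat", "mat"]
drifting = ["cat", "mat", "dog", "park", "car", "road"]

print("focused:", ngram_entropy(focused))
print("drifting:", ngram_entropy(drifting))
```

Under these assumptions the text that keeps reusing the same entity pairs gets the lower entropy, which matches the intuition stated in the abstract that more new material implies more drift.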