32,398 research outputs found

    Ocular-based automatic summarization of documents: is re-reading informative about the importance of a sentence?

    Get PDF
    Automatic document summarization (ADS) has been introduced as a viable solution for reducing the time and the effort needed to read the ever-increasing textual content that is disseminated. However, a successful universal ADS algorithm has not yet been developed. Also, despite progress in the field, many ADS techniques do not take into account the needs of different readers, providing a summary without internal consistency and the consequent need to re-read the original document. The present study was aimed at investigating the usefulness of using eye tracking for increasing the quality of ADS. The general idea was of that of finding ocular behavioural indicators that could be easily implemented in ADS algorithms. For instance, the time spent in re-reading a sentence might reflect the relative importance of that sentence, thus providing a hint for the selection of text contributing to the summary. We have tested this hypothesis by comparing metrics based on the analysis of eye movements of 30 readers with the highlights they made afterward. Results showed that the time spent reading a sentence was not significantly related to its subjective value, thus frustrating our attempt. Results also showed that the length of a sentence is an unavoidable confounding because longer sentences have both the highest probability of containing units of text judged as important, and receive more fixations and re-fixations

    Effect of screen presentation on text reading and revising. International Journal of Human-Computer Studies

    Get PDF
    Two studies using the methods of experimental psychology assessed the effects of two types of text presentation (page-by-page vs. scrolling) on participants' performance while reading and revising texts. Greater facilitative effects of the page-by-page presentation were observed in both tasks. The participants' reading task performance indicated that they built a better mental representation of the text as a whole and were better at locating relevant information and remembering the main ideas. Their revising task performance indicated a larger number of global corrections (which are the most difficult to make)

    Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application

    Full text link
    We present two novel models of document coherence and their application to information retrieval (IR). Both models approximate document coherence using discourse entities, e.g. the subject or object of a sentence. Our first model views text as a Markov process generating sequences of discourse entities (entity n-grams); we use the entropy of these entity n-grams to approximate the rate at which new information appears in text, reasoning that as more new words appear, the topic increasingly drifts and text coherence decreases. Our second model extends the work of Guinaudeau & Strube [28] that represents text as a graph of discourse entities, linked by different relations, such as their distance or adjacency in text. We use several graph topology metrics to approximate different aspects of the discourse flow that can indicate coherence, such as the average clustering or betweenness of discourse entities in text. Experiments with several instantiations of these models show that: (i) our models perform on a par with two other well-known models of text coherence even without any parameter tuning, and (ii) reranking retrieval results according to their coherence scores gives notable performance gains, confirming a relation between document coherence and relevance. This work contributes two novel models of document coherence, the application of which to IR complements recent work in the integration of document cohesiveness or comprehensibility to ranking [5, 56]
    • …
    corecore