24 research outputs found

    Exploiting information needs and bibliographics for polyrepresentative document clustering

    In this paper we explore the potential of combining the principle of polyrepresentation with document clustering. Our idea is discussed and evaluated for polyrepresentation of information needs as well as for document-based polyrepresentation, where bibliographic information is used as the representation. The main idea is to present the user with the highly ranked polyrepresentative clusters to support the search process. Our evaluation suggests that our approach is capable of increasing retrieval performance, but performance varies for queries with a high or low number of relevant documents.

    A probabilistic approach for cluster based polyrepresentative information retrieval

    A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy. Document clustering in information retrieval (IR) is considered an alternative to rank-based retrieval approaches, because of its potential to support user interactions beyond just typing in queries. Similarly, the Principle of Polyrepresentation (multi-evidence: combining multiple cognitively and/or functionally different information need or information object representations for improving an IR system's performance) is an established approach in cognitive IR with plausible applicability in the domain of information seeking and retrieval. The combination of these two approaches can assimilate their respective individual strengths in order to further improve the performance of IR systems. The main goal of this study is to combine cognitive and cluster-based IR approaches for improving the effectiveness of (interactive) information retrieval systems. In order to achieve this goal, polyrepresentative information retrieval strategies for cluster browsing and retrieval have been designed, focusing on the evaluation aspect of such strategies. This thesis addresses the challenge of designing and evaluating an Optimum Clustering Framework (OCF) based model, implementing probabilistic document clustering for interactive IR. Thus, polyrepresentative cluster browsing strategies have been devised. With these strategies a simulated user based method has been adopted for evaluating the polyrepresentative cluster browsing and searching strategies. The proposed approaches are evaluated for information need based polyrepresentative clustering as well as document based polyrepresentation and the combination thereof. For document-based polyrepresentation, the notion of citation context is exploited, which has special applications in scientometrics and bibliometrics for science literature modelling. The information need polyrepresentation, on the other hand, utilizes the various aspects of user information need, which is crucial for enhancing the retrieval performance. Besides describing a probabilistic framework for polyrepresentative document clustering, one of the main findings of this work is that the proposed combination of the Principle of Polyrepresentation with document clustering has the potential of enhancing the user interactions with an IR system, provided that the various representations of information need and information objects are utilized. The thesis also explores interactive IR approaches in the context of polyrepresentative interactive information retrieval when it is combined with document clustering methods. Experiments suggest there is a potential in the proposed cluster-based polyrepresentation approach, since statistically significant improvements were found when comparing the approach to a BM25-based baseline in an ideal scenario. Further marginal improvements were observed when cluster-based re-ranking and cluster-ranking based comparisons were made. The performance of the approach depends on the underlying information object and information need representations used, which confirms findings of previous studies where the Principle of Polyrepresentation was applied in different ways.

    Polyrepresentative Clustering: A Study of Simulated User Strategies and Representations

    The principle of polyrepresentation and document clustering are two established methods for Interactive Information Retrieval, which have been used separately so far. In this paper we discuss a cluster-based polyrepresentation approach for information need and document-based representations. In our work we simulate and evaluate two possible cluster browsing strategies a user could apply to explore the polyrepresentative clusters. In our evaluation we apply information need and bibliographic features on the iSearch collection. Our results suggest that polyrepresentative cluster browsing may be more effective than exploring a ranked list.

    Supporting polyrepresentation in a quantum-inspired geometrical retrieval framework

    The relevance of a document has many facets, going beyond the usual topical one, which have to be considered to satisfy a user's information need. Multiple representations of documents, like user-given reviews or the actual document content, can give evidence towards certain facets of relevance. In this respect polyrepresentation of documents, where such evidence is combined, is a crucial concept for estimating the relevance of a document. In this paper, we discuss how a geometrical retrieval framework inspired by quantum mechanics can be extended to support polyrepresentation. We show by example how different representations of a document can be modelled in a Hilbert space, similar to physical systems known from quantum mechanics. We further illustrate how these representations are combined by means of the tensor product to support polyrepresentation, and discuss the case where representations of documents are not independent from a user's point of view. Besides giving a principled framework for polyrepresentation, the potential of this approach is to capture and formalise the complex interdependent relationships that the different representations can have with each other.

    The INEX 2010 Interactive Track: An Overview

    In this paper we present the organization of the INEX 2010 interactive track. For the 2010 experiments the iTrack has gathered data on user search behavior in a collection consisting of book metadata taken from the online bookstore Amazon and the social cataloguing application LibraryThing. The collected data represents traditional bibliographic metadata, user-generated tags and reviews, and promotional texts and reviews from publishers and professional reviewers. In this year’s experiments we designed two search task categories, which were set to represent two different stages of work task processes. In addition we let the users create a task of their own, which is used as a control task. In the paper we describe the methods used for data collection and the tasks performed by the participants.

    The Janus Faced Scholar: a Festschrift in honour of Peter Ingwersen

    Una teoría cognitiva integral para la recuperación de información: saliendo del entorno del laboratorio

    The paper demonstrates how the Laboratory Research Framework fits into the integrated Cognitive Framework for IR. It first discusses the Laboratory Framework with emphasis on its underlying assumptions and known limitations. This is followed by a view of interaction and relevance phenomena associated with IR evaluation and central to the understanding of IR. The ensuing section outlines how interactive IR is viewed from a Cognitive Framework, and suggests that ‘light’ interactive IR experiments be performed by drawing on the latter framework’s contextual possibilities. These include independent variables drawn from a collection, matching principles in a retrieval system, and the searcher’s situation and task context. The paper ends with concluding points summarizing the issues encountered.