25 research outputs found

    Preliminary Experiments using Subjective Logic for the Polyrepresentation of Information Needs

    According to the principle of polyrepresentation, retrieval accuracy may improve through the combination of multiple and diverse information object representations about, e.g., the context of the user, the information sought, or the retrieval system. Recently, the principle of polyrepresentation was mathematically expressed using subjective logic, where the potential suitability of each representation for improving retrieval performance was formalised through degrees of belief and uncertainty. No experimental evidence or practical application has so far validated this model. We extend the work of Lioma et al. (2010) by providing a practical application and analysis of the model. We show how to map the abstract notions of belief and uncertainty to real-life evidence drawn from a retrieval dataset. We also show how to estimate two different types of polyrepresentation, assuming either (a) independence or (b) dependence between the information objects that are combined. We focus on the polyrepresentation of different types of context relating to user information needs (i.e. work task, user background knowledge, ideal answer) and show that the subjective logic model can predict their optimal combination prior to, and independently of, the retrieval process.
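
As a rough illustration of how degrees of belief and uncertainty can be combined, the sketch below implements two standard fusion operators from subjective logic: cumulative fusion, usually applied when sources are assumed independent, and averaging fusion, usually applied when they are assumed dependent. Whether the paper uses exactly these operators is an assumption; the `Opinion` class, the function names, the shared base rate, and the example values are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion: belief + disbelief + uncertainty must equal 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # assumption: both sources share one base rate

    def expected(self) -> float:
        # Projected probability: belief plus the base-rate share of uncertainty.
        return self.belief + self.base_rate * self.uncertainty


def cumulative_fusion(x: Opinion, y: Opinion) -> Opinion:
    """Fuse two opinions treated as independent evidence."""
    k = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if k == 0:
        raise ValueError("undefined when both opinions have zero uncertainty")
    return Opinion(
        (x.belief * y.uncertainty + y.belief * x.uncertainty) / k,
        (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / k,
        (x.uncertainty * y.uncertainty) / k,
        x.base_rate,
    )


def averaging_fusion(x: Opinion, y: Opinion) -> Opinion:
    """Fuse two opinions treated as dependent evidence."""
    k = x.uncertainty + y.uncertainty
    if k == 0:
        raise ValueError("undefined when both opinions have zero uncertainty")
    return Opinion(
        (x.belief * y.uncertainty + y.belief * x.uncertainty) / k,
        (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / k,
        (2 * x.uncertainty * y.uncertainty) / k,
        x.base_rate,
    )


# Hypothetical opinions about two context representations.
work_task = Opinion(0.6, 0.2, 0.2)
background = Opinion(0.5, 0.1, 0.4)
independent = cumulative_fusion(work_task, background)
dependent = averaging_fusion(work_task, background)
```

Under these operators, fusing independent evidence reduces uncertainty below that of either source, while averaging dependent evidence leaves it between the two.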

    The Janus Faced Scholar: a Festschrift in honour of Peter Ingwersen

    From social tagging to polyrepresentation: a study of expert annotating behavior of moving images

    Doctoral thesis with International Mention. This thesis investigates “nichesourcing” (De Boer, Hildebrand, et al., 2012), an emergent initiative of cultural heritage crowdsourcing in which niches of experts are involved in the annotating tasks. This initiative is studied in relation to moving image annotation, and in the context of audiovisual heritage, more specifically, within the sector of film archives. The work presents a case study of film and media scholars to investigate the types of annotations and attribute descriptions that they could eventually contribute, as well as the information needs, and seeking and searching behaviors of this group, in order to determine what role the different types of annotations would play in supporting their expert tasks. The thesis is composed of three independent but interconnected studies using a mixed methodology and an interpretive approach. It uses concepts from the information behavior discipline, and the "Integrated Information Seeking and Retrieval Framework" (IS&R) (Ingwersen and Järvelin, 2005) as guidance for the investigation. The findings show that there are several types of annotations that moving image experts could contribute to a nichesourcing initiative, of which time-based tags are only one of the possibilities. The findings also indicate that for the different foci in film and media research, in-depth indexing at the content level is only needed for supporting a specific research focus, for supporting research in other domains, or for engaging broader audiences. The main implications at the level of information infrastructure are the requirement for more varied annotating support, more interoperability among existing metadata standards and frameworks, and the need for guidelines about crowdsourcing and nichesourcing implementation in the audiovisual heritage sector.
This research presents contributions to the studies of social tagging applied to moving images, to the discipline of information behavior, by proposing new concepts related to the area of use behavior, and to the concept of “polyrepresentation” (Ingwersen, 1992, 1996) applied to the humanities domain.

    Integrative Levels of Knowing

    This dissertation is concerned with a systematic organization of the epistemological dimension of human knowledge in terms of viewpoints and methods. In particular, it explores to what extent the well-known organizing principle of integrative levels, which presents a developmental hierarchy of complexity and integration, can be applied for a basic classification of viewpoints or epistemic outlooks. The central thesis pursued in this investigation is that an adequate analysis of such epistemic contexts requires tools that allow divergent or even conflicting frames of reference to be compared and evaluated according to context-transcending standards and criteria. This task demands a theoretical and methodological foundation that avoids the limitations of radical contextualism and its inherent threat of a fragmentation of knowledge due to the alleged incommensurability of the underlying frames of reference. Based on Jürgen Habermas’s Theory of Communicative Action and his methodology of hermeneutic reconstructionism, it is argued that epistemic pluralism does not necessarily imply epistemic relativism and that a systematic organization of the multiplicity of perspectives can benefit from already existing models of cognitive development as reconstructed in research fields like psychology, social sciences, and humanities. The proposed cognitive-developmental approach to knowledge organization aims to contribute to a multi-perspective knowledge organization by offering both analytical tools for cross-cultural comparisons of knowledge organization systems (e.g., Seven Epitomes and Dewey Decimal Classification) and organizing principles for context representation that help to improve the expressiveness of existing documentary languages (e.g., Integrative Levels Classification).
Additionally, the appendix includes an extensive compilation of conceptions and models of Integrative Levels of Knowing from a broad multidisciplinary field.

    A new integrated model for multitasking during web searching

    Investigating multitasking information behaviour, particularly while using the web, has become an increasingly important research area. People's reliance on the web to seek and find information has encouraged a number of researchers to investigate the characteristics of information seeking behaviour and the web seeking strategies used. The current research set out to explore multitasking information behaviour while using the web in relation to people's personal characteristics, working memory, and flow (a state where people feel in control and immersed in the task). Also investigated were the effects of pre-determined knowledge about search tasks and the artefact characteristics. In addition, the study investigated cognitive states (interactions between the user and the system) and cognitive coordination shifts (the way people change their actions to search effectively) while multitasking on the web. The research was exploratory, using a mixed-method approach. Thirty university students participated: 10 psychologists, 10 accountants and 10 mechanical engineers. The data collection tools used were: pre- and post-questionnaires, pre-interviews, a working memory test, a flow state scale test, audio-visual data, web search logs, think-aloud data, observation, and the critical decision method. Based on the working memory test, the participants were divided into two groups, those with high scores and those with lower scores. Similarly, participants were divided into two groups based on their flow state scale tests. All participants searched for information on the web for four topics: two for which they had prior knowledge and two for which they had none. The results revealed that working memory capacity affects multitasking information behaviour during web searching.
For example, the participants in the high working memory group and high flow group had a significantly greater number of cognitive coordination and state shifts than the low working memory group and low flow group. Further, the perception of task complexity was related to working memory capacity; those with low memory capacity thought task complexity increased towards the end of tasks for which they had no prior knowledge compared to tasks for which they had prior knowledge. The results also showed that all participants, regardless of their working memory capacity and flow level, had the same first frequent cognitive coordination and cognitive state sequences: from strategy to topic. In respect of disciplinary differences, accountants rated task complexity at the end of the web seeking procedure significantly lower for information tasks with prior knowledge than participants from the other disciplines did. Moreover, multitasking information behaviour characteristics such as the number of queries, web search sessions and opened tabs/windows during searches were affected by discipline. The findings of the research enabled an exploratory integrated model to be created, which illustrates the nature of multitasking information behaviour when using the web. One other contribution of this research was to develop new, more specific and closely grounded definitions of task complexity and artefact characteristics. This new research may influence the creation of more effective web search systems by placing more emphasis on our understanding of the complex cognitive mechanisms of multitasking information behaviour when using the web.

    A survey on the use of relevance feedback for information access systems

    Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require, then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems.
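
To make the idea of automatic query modification concrete, the sketch below implements Rocchio's formula, one classic relevance feedback technique for vector-space retrieval. The abstract does not name specific methods, so treating Rocchio as representative is an assumption; the function names, the sparse dictionary representation, the default weights, and the example data are hypothetical.

```python
from collections import defaultdict


def centroid(docs):
    """Average a list of sparse term-weight vectors (dicts)."""
    if not docs:
        return {}
    total = defaultdict(float)
    for doc in docs:
        for term, weight in doc.items():
            total[term] += weight
    return {term: weight / len(docs) for term, weight in total.items()}


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards relevant documents and away from non-relevant ones.

    new_query = alpha * query + beta * centroid(relevant) - gamma * centroid(nonrelevant)
    Terms whose weight ends up at or below zero are dropped.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for term, weight in centroid(relevant).items():
        new_query[term] += beta * weight
    for term, weight in centroid(nonrelevant).items():
        new_query[term] -= gamma * weight
    return {term: weight for term, weight in new_query.items() if weight > 0}


# Hypothetical example: the user marks one document relevant and one not.
query = {"xml": 1.0}
relevant_docs = [{"xml": 1.0, "retrieval": 1.0}]
nonrelevant_docs = [{"cooking": 1.0}]
expanded = rocchio(query, relevant_docs, nonrelevant_docs)
```

In an interactive variant, the system would instead present the candidate expansion terms and let the user decide which ones to add.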

    Interactive Information Retrieval with Structured Documents

    In recent years there has been a growing realisation in the IR community that the interaction of searchers with information is an indispensable component of the IR process. As a result, issues relating to interactive IR have been extensively investigated in the last decade. This research has been performed in the context of unstructured documents or in the context of the loosely-defined structure encountered in web pages. XML documents, on the other hand, define a different context, by offering the possibility of navigating within the structure of a single document, or of following links to other documents. Relatively little work has been carried out to study user interaction with IR systems that make use of the additional features offered by XML documents. As part of the INEX initiative for the evaluation of XML retrieval, the INEX interactive track has focused on interactive XML retrieval since 2004. Here a user-friendly presentation of various features of XML documents is provided, and some new features are designed and implemented to give searchers efficient access to the information they want. In this study interaction entails three levels: query formulation, inspecting the result list, and examining the detail. For query formulation, suggesting related terms is a conventional method to assist searchers. Here we investigate the related terms derived from two different co-occurrence units: elements and documents. In addition, a contextual aspect is added to help searchers select appropriate terms. Results showed the usefulness of suggesting related terms and some acceptance of the contextual related-term tool. For inspecting the result list, classic document retrieval systems such as web search engines retrieve whole documents, and leave it to the searchers to collect the information they require from a possibly lengthy text.
In contrast, element retrieval aims at a focused view of information by pointing to the optimal access points of the document. A number of strategies have been investigated for presenting result lists. For examining the detail of a document, traditionally the complete document is presented to a searcher, and here again the searcher has to put in effort to reach the required information. We investigated the use of additional support such as a table of contents along with document detail. In addition, we also investigated graphical representations of documents depicting their structure and the granularity of retrieved elements along with their estimated relevance. Here the table of contents was found to be a very useful feature for examining details. In order to analyse searchers' interaction, a visualisation technique based on Tree Map was developed. It depicts the search interaction with an element retrieval system. A number of browsing strategies have been identified with the help of this tool. The value of element retrieval for searchers was also evaluated, along with a comparison between two focused approaches, element retrieval and passage retrieval. The study suggests that searchers find elements useful for their tasks and that they locate a lot of the relevant information in specific elements rather than full documents. Sections, in particular, appear to be helpful. In order to provide user-specific support, the system needs feedback from searchers, who, in turn, are very reluctant to give this information explicitly. Therefore, we investigated to what extent the different features can be used as relevance predictors. Of the five features considered, reading time is the most useful relevance predictor. Overall, relevance predictors for structured documents seem to be much weaker than for the case of atomic documents.