31 research outputs found
Fedora and GSearch in a Research Project about Integrated Search
4th International Conference on Open Repositories. This presentation was part of the session: Fedora User Group Presentations. Date: 2009-05-21, 10:30 AM - 12:00 PM. The Royal School of Library and Information Science in Denmark is performing a research project about integrated search. DTU Library provides assistance in the form of a Fedora and GSearch installation.
The presentation will focus on the technical challenges involved in the setup and indexing of the various sources, facilitating the integrated search. DEFF, Denmark's Electronic Research Librar
Exploiting information needs and bibliographics for polyrepresentative document clustering
In this paper we explore the potential of combining the principle of polyrepresentation with document clustering. Our idea is discussed and evaluated for polyrepresentation of information needs as well as for document-based polyrepresentation, where bibliographic information is used as the representation. The main idea is to present the user with the highly ranked polyrepresentative clusters to support the search process. Our evaluation suggests that our approach is capable of increasing retrieval performance, but performance varies for queries with a high or low number of relevant documents.
A probabilistic approach for cluster based polyrepresentative information retrieval
A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy.
Document clustering in information retrieval (IR) is considered an alternative to rank-based retrieval approaches because of its potential to support user interactions beyond just typing in queries. Similarly, the Principle of Polyrepresentation (multi-evidence: combining multiple cognitively and/or functionally different information need or information object representations for improving an IR system's performance) is an established approach in cognitive IR with plausible applicability in the domain of information seeking and retrieval. The combination of these two approaches can assimilate their respective individual strengths in order to further improve the performance of IR systems.
The main goal of this study is to combine cognitive and cluster-based IR approaches for improving the effectiveness of (interactive) information retrieval systems. In order to achieve this goal, polyrepresentative information retrieval strategies for cluster browsing and retrieval have been designed, focusing on the evaluation aspect of such strategies.
This thesis addresses the challenge of designing and evaluating an Optimum Clustering Framework (OCF) based model, implementing probabilistic document clustering for interactive IR. Thus, polyrepresentative cluster browsing strategies have been devised. With these strategies, a simulated-user-based method has been adopted for evaluating the polyrepresentative cluster browsing and searching strategies.
The proposed approaches are evaluated for information need based polyrepresentative clustering as well as document-based polyrepresentation and the combination thereof. For document-based polyrepresentation, the notion of citation context is exploited, which has special applications in scientometrics and bibliometrics for science literature modelling. The information need polyrepresentation, on the other hand, utilizes the various aspects of user information need, which is crucial for enhancing the retrieval performance.
Besides describing a probabilistic framework for polyrepresentative document clustering, one of the main findings of this work is that the proposed combination of the Principle of Polyrepresentation with document clustering has the potential of enhancing the user interactions with an IR system, provided that the various representations of information need and information objects are utilized.
The thesis also explores interactive IR approaches in the context of polyrepresentative interactive information retrieval when it is combined with document clustering methods. Experiments suggest there is a potential in the proposed cluster-based polyrepresentation approach, since statistically significant improvements were found when comparing the approach to a BM25-based baseline in an ideal scenario. Further marginal improvements were observed when cluster-based re-ranking and cluster-ranking based comparisons were made.
The performance of the approach depends on the underlying information object and information need representations used, which confirms findings of previous studies where the Principle of Polyrepresentation was applied in different ways.
Una teoría cognitiva integral para la recuperación de información: saliendo del entorno del laboratorio
The paper demonstrates how the Laboratory Research Framework fits into the integrated Cognitive Framework for IR. It first discusses the Laboratory Framework with emphasis on its underlying assumptions and known limitations. This is followed by a view of interaction and relevance phenomena associated with IR evaluation and central to the understanding of IR. The ensuing section outlines how interactive IR is viewed from a Cognitive Framework, and suggests that "light" interactive IR experiments be performed by drawing on the latter framework's contextual possibilities. These include independent variables drawn from a collection, matching principles in a retrieval system, and the searcher's situation and task context. The paper ends with concluding points summarizing the issues encountered.
This article demonstrates how the laboratory research framework fits well within the integrated cognitive framework for information retrieval. The laboratory research framework is discussed first, with emphasis on its assumptions and limitations. The phenomena of interaction and relevance associated with IR evaluation are then analyzed, as well as how to develop interactive information retrieval experiments within the cognitive framework, considering the researcher's situation and the context of the task carried out.
From social tagging to polyrepresentation: a study of expert annotating behavior of moving images
International Mention in the doctoral degree. This thesis investigates "nichesourcing" (De Boer, Hildebrand, et al., 2012), an emergent initiative of cultural heritage crowdsourcing in which niches of experts are involved in the annotating tasks. This initiative is studied in relation to moving image annotation, and in the context of audiovisual heritage, more specifically, within the sector of film archives. The work presents a case study of film and media scholars to investigate the types of annotations and attribute descriptions that they could eventually contribute, as well as the information needs, and seeking and searching behaviors of this group, in order to determine what the role of the different types of annotations in supporting their expert tasks would be. The study is composed of three independent but interconnected studies using a mixed methodology and an interpretive approach. It uses concepts from the information behavior discipline, and the "Integrated Information Seeking and Retrieval Framework" (IS&R) (Ingwersen and Järvelin, 2005) as guidance for the investigation. The findings show that there are several types of annotations that moving image experts could contribute to a nichesourcing initiative, of which time-based tags are only one of the possibilities. The findings also indicate that for the different foci in film and media research, in-depth indexing at the content level is only needed for supporting a specific research focus, for supporting research in other domains, or for engaging broader audiences. The main implications at the level of information infrastructure are the requirement for more varied annotating support, more interoperability among existing metadata standards and frameworks, and the need for guidelines about crowdsourcing and nichesourcing implementation in the audiovisual heritage sector.
This research presents contributions to the studies of social tagging applied to moving images, to the discipline of information behavior, by proposing new concepts related to the area of use behavior, and to the concept of "polyrepresentation" (Ingwersen, 1992, 1996) applied to the humanities domain.
This thesis investigates the nichesourcing initiative (De Boer, Hildebrand, et al., 2012) as a form of crowdsourcing in the cultural heritage sector, in which groups of experts take part in the tasks of annotating collections. The scope of application is the annotation of moving images in the context of audiovisual heritage, more specifically in the case of film archives. The work presents a case study applied to a specific domain of experts in the audiovisual field: film and media scholars. The analysis focuses on two specific aspects of the problem: the types of annotations and attributes in the descriptions that could be obtained from this niche of experts; and the information needs and information behavior of this group, in order to determine the role of the different types of annotations in their research tasks. The thesis consists of three independent and interconnected studies; a mixed, interpretive methodology is used. The theoretical framework draws on concepts from the field of information behavior studies and on the "Integrated Information Seeking and Retrieval Framework" (IS&R) proposed by Ingwersen and Järvelin (2005), which guide the investigation. The findings indicate that there are various forms of moving image annotation that could be generated from expert contributions, of which shot-level tags are only one of the possibilities.
Likewise, several research foci were identified in the academic field of film and media. Detailed content indexing is required only by one of those groups and by researchers from other disciplines, or as a way of engaging broader audiences. The most relevant implications, at the level of information infrastructure, concern the requirements for supporting more varied forms of annotation, the requirement for greater interoperability among metadata standards and frameworks, and the need to publish good-practice guidelines on how to implement crowdsourcing or nichesourcing initiatives in the audiovisual heritage sector. This work presents contributions to research on social tagging applied to moving images, to the discipline of information behavior studies, for which new concepts related to the area of information use are proposed, and to the concept of "polyrepresentation" (Ingwersen, 1992, 1996) in the humanities disciplines.
Official Doctoral Programme in Documentation: Archives and Libraries in the Digital Environment. Chair: Peter Emil Rerup Ingwersen. Secretary: Antonio Hernández Pérez. Member: Nils Phar
The INEX 2010 Interactive Track: An Overview
In the paper we present the organization of the INEX 2010 interactive track. For the 2010 experiments, the iTrack gathered data on user search behavior in a collection consisting of book metadata taken from the online bookstore Amazon and the social cataloguing application LibraryThing. The collected data represents traditional bibliographic metadata, user-generated tags and reviews, and promotional texts and reviews from publishers and professional reviewers. In this year's experiments we designed two search task categories, which were set to represent two different stages of work task processes. In addition, we let the users create a task of their own, which is used as a control task. In the paper we describe the methods used for data collection and the tasks performed by the participants.