1,528 research outputs found

    Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly, studies on thesaurus-aided search term selection and, secondly, those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.

    User - Thesaurus Interaction in a Web-Based Database: An Evaluation of Users' Term Selection Behaviour

    A major challenge faced by users during the information search and retrieval process is the selection of search terms for query formulation and expansion. Thesauri are recognised as one source of search terms which can assist users in query construction and expansion. As the number of electronic thesauri attached to information retrieval systems has grown, a range of interface facilities and features has been developed to aid users in formulating their queries. The pilot study reported here aimed to explore and evaluate how a thesaurus-enhanced search interface assisted end-users in selecting search terms. Specifically, it focused on the evaluation of users' attitudes toward both the thesaurus and its interface as tools for facilitating search term selection for query expansion. Thesaurus-based searching and browsing behaviours adopted by users while interacting with a thesaurus-enhanced search interface were also examined.

    Information retrieval (Part I): Introduction

    Concept-based Interactive Query Expansion Support Tool (CIQUEST)

    This report describes a three-year project (2000-03) undertaken in the Information Studies Department at The University of Sheffield and funded by Resource, The Council for Museums, Archives and Libraries. The overall aim of the research was to provide user support for query formulation and reformulation in searching large-scale textual resources, including those of the World Wide Web. More specifically, the objectives were: to investigate and evaluate methods for the automatic generation and organisation of concepts derived from retrieved document sets, based on statistical methods for term weighting; and to conduct user-based evaluations on the understanding, presentation and retrieval effectiveness of concept structures in selecting candidate terms for interactive query expansion. The TREC test collection formed the basis for the seven evaluative experiments conducted in the course of the project. These formed four distinct phases in the project plan. In the first phase, a series of experiments was conducted to investigate further techniques for concept derivation and hierarchical organisation and structure. The second phase was concerned with user-based validation of the concept structures. Results of phases 1 and 2 informed the design of the test system, and the user interface was developed in phase 3. The final phase entailed a user-based summative evaluation of the CiQuest system. The main findings demonstrate that concept hierarchies can effectively be generated from sets of retrieved documents and displayed to searchers in a meaningful way. The approach provides the searcher with an overview of the contents of the retrieved documents, which in turn facilitates the viewing of documents and selection of the most relevant ones. Concept hierarchies are a good source of terms for query expansion and can improve precision. The extraction of descriptive phrases as an alternative source of terms was also effective. With respect to presentation, cascading menus were easy to browse for selecting terms and for viewing documents. In conclusion, the project dissemination programme and future work are outlined.
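
The abstract above does not say which statistical method the project used to organise concepts. As a rough illustration only, the sketch below uses a simple co-occurrence heuristic: a term is treated as a parent of another if it appears in most of the documents that contain the other term, but not the reverse. The function names, the 0.8 threshold, and the toy documents are all assumptions made for this example.

```python
from itertools import permutations


def doc_sets(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def parent_child_pairs(docs, threshold=0.8):
    """Return sorted (parent, child) pairs from document co-occurrence.

    A term x is taken as a parent of y when x occurs in at least
    `threshold` of the documents containing y, while y does not occur
    in every document containing x. The threshold is an assumed value.
    """
    index = doc_sets(docs)
    pairs = []
    for x, y in permutations(index, 2):
        both = len(index[x] & index[y])
        p_x_given_y = both / len(index[y])
        p_y_given_x = both / len(index[x])
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return sorted(pairs)


# Hypothetical "retrieved document set" for illustration.
RETRIEVED = {
    1: "retrieval thesaurus",
    2: "retrieval thesaurus",
    3: "retrieval ranking",
    4: "retrieval",
}

hierarchy = parent_child_pairs(RETRIEVED)
# "retrieval" becomes the parent of both "ranking" and "thesaurus".
```

A hierarchy built this way could feed a menu of candidate expansion terms, with the broad parent at the top and narrower terms beneath it.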

    Retrieval effectiveness for OCR text using thesauri

    This thesis reports on the effects of automatic query expansion with a subject-specific thesaurus on retrieval effectiveness for a document collection consisting of OCR text. The investigation encompasses several experiments with a modern retrieval engine based on the probabilistic model. Each experiment is performed on two versions of the document collection. The first version consists of raw OCR output. The second consists of the ground truth (retyped from hard copy) version of the same collection. It is shown that the use of the thesaurus as a source for query expansion can significantly improve recall for Boolean queries, for both OCR and manually corrected document collections. In the case of weighted queries, the expansion has no effect on the average precision and recall. Nevertheless, some individual queries benefit from query expansion.
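
To make the recall effect for Boolean queries concrete, here is a minimal sketch of thesaurus-based expansion. It is not the thesis's retrieval engine: the thesaurus entries, documents, and function names are hypothetical, and matching is plain whitespace tokenisation. Each query term is widened into an OR-group of the term plus its thesaurus entries, and a document matches when it satisfies every group.

```python
# Hypothetical toy thesaurus: each term maps to equivalent or narrower terms.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "engine": ["motor"],
}

# Hypothetical document collection.
DOCS = {
    1: "the car engine failed",
    2: "automobile motor repair manual",
    3: "bicycle repair manual",
}


def expand_query(terms, thesaurus):
    """Turn each query term into an OR-group: the term plus its thesaurus entries."""
    return [{term, *thesaurus.get(term, [])} for term in terms]


def boolean_search(or_groups, docs):
    """Return sorted ids of documents that match every OR-group (an AND of ORs)."""
    hits = []
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        if all(group & tokens for group in or_groups):
            hits.append(doc_id)
    return sorted(hits)


query = ["car", "engine"]
plain = boolean_search(expand_query(query, {}), DOCS)            # [1]
expanded = boolean_search(expand_query(query, THESAURUS), DOCS)  # [1, 2]
```

In this toy case the unexpanded query misses document 2, which uses different vocabulary for the same concepts, while the expanded query retrieves it.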

    Associative and Spatial Relationships in Thesaurus-based Retrieval

    The OASIS (Ontologically Augmented Spatial Information System) project explores terminology systems for thematic and spatial access in digital library applications. A prototype implementation uses data from the Royal Commission on the Ancient and Historical Monuments of Scotland, together with the Getty AAT and TGN thesauri. This paper describes its integrated spatial and thematic schema and discusses novel approaches to the application of thesauri in spatial and thematic semantic distance measures. Semantic distance measures can underpin interactive and automatic query expansion techniques by ranking lists of candidate terms. We first illustrate how hierarchical spatial relationships can be used to provide more flexible retrieval for queries incorporating place names in applications employing online gazetteers and geographical thesauri. We then employ a set of experimental scenarios to investigate key issues affecting the use of the associative (RT) thesaurus relationships in semantic distance measures. Previous work has noted the potential of RTs in thesaurus search aids, but the problem of increased noise in result sets has been emphasised. Specialising RTs allows the possibility of dynamically linking RT type to query context. Results presented in this paper demonstrate the potential for filtering on the context of the RT link and on subtypes of RT relationships.
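
One way to picture a semantic distance measure that ranks candidate terms is as a shortest-path search over thesaurus links, with a traversal cost per relationship type. The sketch below is an illustration under stated assumptions, not the OASIS implementation: the costs, the thesaurus fragment, and the function names are invented, and hierarchical links are treated as symmetric for simplicity.

```python
import heapq

# Assumed traversal costs per relationship type; the abstract gives no values.
COST = {"BT": 1.0, "NT": 1.0, "RT": 2.0}

# Hypothetical thesaurus fragment as (term, relationship, term) triples.
LINKS = [
    ("castle", "BT", "fortification"),
    ("fortification", "NT", "hillfort"),
    ("castle", "RT", "moat"),
    ("moat", "RT", "drawbridge"),
]


def build_graph(links, allowed):
    """Build an undirected weighted graph from the allowed relationship types."""
    graph = {}
    for a, rel, b in links:
        if rel in allowed:
            graph.setdefault(a, []).append((b, COST[rel]))
            graph.setdefault(b, []).append((a, COST[rel]))
    return graph


def rank_candidates(links, start, max_distance=3.0, allowed=("BT", "NT", "RT")):
    """Rank terms within max_distance of start by shortest-path distance (Dijkstra)."""
    graph = build_graph(links, allowed)
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, term = heapq.heappop(queue)
        if d > dist.get(term, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(term, []):
            nd = d + cost
            if nd <= max_distance and nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    candidates = [(t, d) for t, d in dist.items() if t != start]
    return sorted(candidates, key=lambda pair: (pair[1], pair[0]))


all_links = rank_candidates(LINKS, "castle")
hierarchy_only = rank_candidates(LINKS, "castle", allowed=("BT", "NT"))
```

Restricting `allowed` to hierarchical links drops the RT-reached term "moat" from the ranking, which mirrors the idea of filtering on relationship type to limit noise in the expanded result set.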

    BioMeRSA: The Biology media repository with semantic augmentation

    With computers now capable of easily handling all kinds of multimedia files in vast quantity, and with the Internet now well-suited to exchanging these files, we are faced with the challenge of organizing this data in a way that makes the information most useful and accessible. This holds true as well for media pertaining to the field of biology, where multimedia is particularly useful in education, as well as in research. To help address this, a software system with a Web-based interface has been developed for improving the accuracy and specificity of multimedia searching and browsing by integrating semantic data pertaining to the field of biology from the Unified Medical Language System (UMLS). Using the Biology Media Repository with Semantic Augmentation (BioMeRSA) system, users who are considered to be 'experts' can associate concepts from UMLS with multimedia files submitted by other users to provide semantic context for the files. These annotations are used to retrieve relevant files in the searching and browsing interfaces. A wide variety of image files are currently supported, with some limited support for video and audio files.

    Semantic enrichment for enhancing LAM data and supporting digital humanities. Review article

    With the rapid development of the digital humanities (DH) field, demands for historical and cultural heritage data have generated deep interest in the data provided by libraries, archives, and museums (LAMs). In order to enhance LAM data’s quality and discoverability while enabling a self-sustaining ecosystem, “semantic enrichment” has become a strategy increasingly used by LAMs in recent years. This article introduces a number of semantic enrichment methods and efforts that can be applied to LAM data at various levels, aiming to support deeper and wider exploration and use of LAM data in DH research. The real cases, research projects, experiments, and pilot studies shared in this article demonstrate endless potential for LAM data, whether they are structured, semi-structured, or unstructured, regardless of what types of original artifacts carry the data. Following their roadmaps would encourage more effective initiatives and strengthen this effort to maximize LAM data’s discoverability, use- and reuse-ability, and their value in the mainstream of DH and Semantic Web.