3,928 research outputs found

    Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

    Get PDF
    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summaries the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly studies on thesaurus-aided search term selection and secondly those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach

    User - Thesaurus Interaction in a Web-Based Database: An Evaluation of Users' Term Selection Behaviour

    Get PDF
    A major challenge faced by users during the information search and retrieval process is the selection of search terms for query formulation and expansion. Thesauri are recognised as one source of search terms which can assist users in query construction and expansion. As the number of electronic thesauri attached to information retrieval systems has grown, a range of interface facilities and features have been developed to aid users in formulating their queries. The pilot study reported here aimed to explore and evaluate how a thesaurus-enhanced search interface assisted end-users in selecting search terms. Specifically, it focused on the evaluation of users' attitudes toward both the thesaurus and its interface as tools for facilitating search term selection for query expansion. Thesaurusbased searching and browsing behaviours adopted by users while interacting with a thesaurus-enhanced search interface were also examined

    Query Expansion for Survey Question Retrieval in the Social Sciences

    Full text link
    In recent years, the importance of research data and the need to archive and to share it in the scientific community have increased enormously. This introduces a whole new set of challenges for digital libraries. In the social sciences typical research data sets consist of surveys and questionnaires. In this paper we focus on the use case of social science survey question reuse and on mechanisms to support users in the query formulation for data sets. We describe and evaluate thesaurus- and co-occurrence-based approaches for query expansion to improve retrieval quality in digital libraries and research data archives. The challenge here is to translate the information need and the underlying sociological phenomena into proper queries. As we can show retrieval quality can be improved by adding related terms to the queries. In a direct comparison automatically expanded queries using extracted co-occurring terms can provide better results than queries manually reformulated by a domain expert and better results than a keyword-based BM25 baseline.Comment: to appear in Proceedings of 19th International Conference on Theory and Practice of Digital Libraries 2015 (TPDL 2015

    Machine Learning of User Profiles: Representational Issues

    Full text link
    As more information becomes available electronically, tools for finding information of interest to users becomes increasingly important. The goal of the research described here is to build a system for generating comprehensible user profiles that accurately capture user interest with minimum user interaction. The research described here focuses on the importance of a suitable generalization hierarchy and representation for learning profiles which are predictively accurate and comprehensible. In our experiments we evaluated both traditional features based on weighted term vectors as well as subject features corresponding to categories which could be drawn from a thesaurus. Our experiments, conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web (the IDD News Browser), demonstrate the importance of a generalization hierarchy and the promise of combining natural language processing techniques with machine learning (ML) to address an information retrieval (IR) problem.Comment: 6 page

    Retrieving with good sense

    Get PDF
    Although always present in text, word sense ambiguity only recently became regarded as a problem to information retrieval which was potentially solvable. The growth of interest in word senses resulted from new directions taken in disambiguation research. This paper first outlines this research and surveys the resulting efforts in information retrieval. Although the majority of attempts to improve retrieval effectiveness were unsuccessful, much was learnt from the research. Most notably a notion of under what circumstance disambiguation may prove of use to retrieval

    Adaptive content mapping for internet navigation

    Get PDF
    The Internet as the biggest human library ever assembled keeps on growing. Although all kinds of information carriers (e.g. audio/video/hybrid file formats) are available, text based documents dominate. It is estimated that about 80% of all information worldwide stored electronically exists in (or can be converted into) text form. More and more, all kinds of documents are generated by means of a text processing system and are therefore available electronically. Nowadays, many printed journals are also published online and may even discontinue to appear in print form tomorrow. This development has many convincing advantages: the documents are both available faster (cf. prepress services) and cheaper, they can be searched more easily, the physical storage only needs a fraction of the space previously necessary and the medium will not age. For most people, fast and easy access is the most interesting feature of the new age; computer-aided search for specific documents or Web pages becomes the basic tool for information-oriented work. But this tool has problems. The current keyword based search machines available on the Internet are not really appropriate for such a task; either there are (way) too many documents matching the specified keywords are presented or none at all. The problem lies in the fact that it is often very difficult to choose appropriate terms describing the desired topic in the first place. This contribution discusses the current state-of-the-art techniques in content-based searching (along with common visualization/browsing approaches) and proposes a particular adaptive solution for intuitive Internet document navigation, which not only enables the user to provide full texts instead of manually selected keywords (if available), but also allows him/her to explore the whole database
    • …
    corecore