
    Synsets improve short text clustering for search support: combining LDA and WordNet

    In this study, I propose a short-text clustering approach that uses WordNet as an external resource to cluster documents from corpus.byu.edu. Experimental results show that the approach largely improved clustering performance. The factors that influence the performance of the topic model are the total number of documents, the distribution of Synsets among topics, and the word overlap between the query's Synsets. Performance is also influenced by Synsets that are missing from WordNet. Finally, we outline how clustering approaches could generate ranked query suggestions to disambiguate the query. Combined with the query's Synsets, text document clustering can provide an effective way to disambiguate a user's search query by organizing a large set of search results into a small number of groups labeled with Synsets from WordNet. Master of Science in Information Scienc

    Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly, studies on thesaurus-aided search term selection and, secondly, those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.

    CONTEXT-BASED AUTOSUGGEST ON GRAPH DATA

    Autosuggest is an important feature in any search application. Currently, most applications only suggest a single term based on how frequently that term appears in the indexed documents or how often it is searched for. These approaches might not provide the most relevant suggestions because users often enter a series of related query terms to answer a question they have in mind. In this project, we implemented the Smart Solr Suggester plugin using a context-based approach that takes into account the relationships among search keywords. In particular, we used the keywords that the user has chosen so far in the search text box as the context to autosuggest their next incomplete keyword. This context-based approach uses the relationships between entities in the graph data that the user is searching on and therefore would provide more meaningful suggestions.
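
The context-based idea in this abstract can be illustrated with a minimal Python sketch. It is not the Smart Solr Suggester plugin and does not use any Solr API; the graph contents, `build_graph`, and `suggest` are hypothetical names chosen for illustration. It assumes that a candidate completion is more relevant the more already-chosen keywords it is directly linked to in the graph.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (entity, entity) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def suggest(graph, context, prefix, limit=5):
    """Rank completions of `prefix` by how many context keywords they neighbor."""
    prefix = prefix.lower()
    scores = {}
    for candidate in graph:
        if candidate in context or not candidate.lower().startswith(prefix):
            continue
        # Score = number of already-chosen keywords linked to this candidate.
        scores[candidate] = sum(1 for kw in context if kw in graph[candidate])
    # Highest context score first; alphabetical order breaks ties.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:limit]


# Hypothetical toy graph; a real system would read entity relations from its index.
EDGES = [
    ("python", "pandas"),
    ("python", "django"),
    ("data analysis", "pandas"),
    ("java", "spring"),
    ("java", "parsing"),
]
GRAPH = build_graph(EDGES)

print(suggest(GRAPH, ["python", "data analysis"], "pa"))
print(suggest(GRAPH, ["java"], "pa"))
```

With the same prefix `pa`, the ranking changes with the context: keywords linked to `pandas` move it to the top, while a `java` context favors `parsing`. A frequency-only suggester would return the same order in both cases.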

    Query Expansion for Survey Question Retrieval in the Social Sciences

    In recent years, the importance of research data and the need to archive and to share it in the scientific community have increased enormously. This introduces a whole new set of challenges for digital libraries. In the social sciences, typical research data sets consist of surveys and questionnaires. In this paper we focus on the use case of social science survey question reuse and on mechanisms to support users in the query formulation for data sets. We describe and evaluate thesaurus- and co-occurrence-based approaches for query expansion to improve retrieval quality in digital libraries and research data archives. The challenge here is to translate the information need and the underlying sociological phenomena into proper queries. As we show, retrieval quality can be improved by adding related terms to the queries. In a direct comparison, automatically expanded queries using extracted co-occurring terms can provide better results than queries manually reformulated by a domain expert and better results than a keyword-based BM25 baseline. Comment: to appear in Proceedings of 19th International Conference on Theory and Practice of Digital Libraries 2015 (TPDL 2015)
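
As a rough illustration of co-occurrence-based query expansion, the following Python sketch adds to each query term the terms that most often appear alongside it in a small corpus. It is a simplified stand-in, not the method evaluated in the paper: the sample questions, the stopword list, and the function names are all assumptions made for this example, and no retrieval model such as BM25 is included.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"in", "the", "of", "do", "you"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def cooccurrence_counts(documents):
    """Count how often two distinct terms appear in the same document."""
    counts = Counter()
    for doc in documents:
        terms = sorted(set(tokenize(doc)))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def expand_query(query, counts, per_term=2):
    """Append up to `per_term` top co-occurring terms for each query term."""
    query_terms = tokenize(query)
    expanded = list(query_terms)
    for term in query_terms:
        related = Counter()
        for (a, b), n in counts.items():
            if a == term:
                related[b] += n
            elif b == term:
                related[a] += n
        # Most frequent partners first; alphabetical order breaks ties.
        ranked = sorted(related, key=lambda t: (-related[t], t))
        for t in ranked[:per_term]:
            if t not in expanded:
                expanded.append(t)
    return expanded


# Hypothetical survey-question snippets, invented for illustration.
DOCS = [
    "trust in parliament",
    "trust in government",
    "confidence in government institutions",
]
COUNTS = cooccurrence_counts(DOCS)

print(expand_query("trust", COUNTS))
```

The expanded term list can then be passed to whatever retrieval system is in use. Raw co-occurrence counts favor frequent terms, so a real system would likely normalize them with an association measure.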

    Assessing Visualization Techniques for the Search Process in Digital Libraries

    In this paper we present an overview of several visualization techniques to support the search process in Digital Libraries (DLs). The search process can typically be separated into three major phases: query formulation and refinement, browsing through result lists, and viewing and interacting with documents and their properties. We discuss a selection of popular visualization techniques that have been developed for the different phases to support the user during the search process. Using prototypes based on the different techniques, we show how the approaches have been implemented. Although various visualizations have been developed in prototypical systems, very few of these approaches have been adopted in today's DLs. We conclude that this is most likely because most systems are not evaluated intensively in real-life scenarios with real information seekers and because the results of the interesting visualization techniques are often not comparable. We can say that many of the assessed systems did not properly address the information need of current users. Comment: 23 pages, 14 figures, pre-print to appear in "Wissensorganisation mit digitalen Technologien" (deGruyter)

    Deriving query suggestions for site search

    Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files. © 2013 ASIS&T