    Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly, studies on thesaurus-aided search term selection and, secondly, those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach

    Thesauri on the Web: current developments and trends

    This article provides an overview of recent developments relating to the application of thesauri in information organisation and retrieval on the World Wide Web. It describes some recent thesaurus projects undertaken to facilitate resource description and discovery and access to wide-ranging information resources on the Internet. Types of thesauri available on the Web, thesauri integrated in databases and information retrieval systems, and multiple-thesaurus systems for cross-database searching are also discussed. Collective efforts and events in addressing the standardisation and novel applications of thesauri are briefly reviewed

    User - Thesaurus Interaction in a Web-Based Database: An Evaluation of Users' Term Selection Behaviour

    A major challenge faced by users during the information search and retrieval process is the selection of search terms for query formulation and expansion. Thesauri are recognised as one source of search terms which can assist users in query construction and expansion. As the number of electronic thesauri attached to information retrieval systems has grown, a range of interface facilities and features have been developed to aid users in formulating their queries. The pilot study reported here aimed to explore and evaluate how a thesaurus-enhanced search interface assisted end-users in selecting search terms. Specifically, it focused on the evaluation of users' attitudes toward both the thesaurus and its interface as tools for facilitating search term selection for query expansion. Thesaurus-based searching and browsing behaviours adopted by users while interacting with a thesaurus-enhanced search interface were also examined

    Thesaurus-aided learning for rule-based categorization of OCR texts

    The question posed in this thesis is whether the effectiveness of the rule-based approach to automatic text categorization on OCR collections can be improved by using domain-specific thesauri. A rule-based categorizer was constructed consisting of a C++ program called C-KANT which consults documents and creates a program which can be executed by the CLIPS expert system shell. A series of tests using domain-specific thesauri revealed that a query expansion approach to rule-based automatic text categorization using domain-dependent thesauri will not improve the categorization of OCR texts. Although some improvement to categorization could be made using rules over a mixture of thesauri, the improvements were not significantly large

    Final report of Task #5: Current document index system for document retrieval investigation

    In Part I of this report, we describe the work completed during the last fiscal year (October 1, 2002 thru September 30, 2003). The single biggest challenge this past year has been to develop and deliver a new software technology to classify Homeland Security Sensitive documents with high precision. Not only was a satisfactory system developed, an operational version was delivered to CACI in April 2003. The delivered system is called the Homeland Security Classifier (HSC). In Part II we give an overview of the projects ISRI has completed during the first four years of this cooperative agreement (October 1, 1998 thru September 30, 2002). Each of the deliverables associated with these projects has been thoroughly described in previous reports

    Computer-Aided Knowledge Engineering for Corporate Information Retrieval

    In 1987, Digital Equipment Corporation's internal Market Information Services Group / Information Access Services (IAS) decided to build a single thesaurus system to support production and retrieval of multiple applications. This system, TIMS (Thesaurus / Indexing Management System), had to be dynamic and allow for easy modification and merging of volatile business terminology. A faceted approach was used for knowledge-base building and semantic representation. The system allowed the knowledge engineer to determine a classification structure and to develop relation types suited to a specific application's requirements

    Retrieval effectiveness for OCR text using thesauri

    This thesis reports on the effects of automatic query expansion with a subject-specific thesaurus on retrieval effectiveness for a document collection consisting of OCR text. The investigation encompasses several experiments with a modern retrieval engine based on the probabilistic model. Each experiment is performed on two document collections. The first version of the collection consists of raw OCR output. The second collection consists of the ground truth (retyped from hard copy) version of the same collection. It is shown that the use of the thesaurus as a source for query expansion can significantly improve recall for Boolean queries, for both OCR and manually corrected document collections. In the case of weighted queries, the expansion has no effect on the average precision and recall. Nevertheless, some individual queries benefit from query expansion

    Review of Indexing Techniques Applied in Information Retrieval

    Indexing is one of the important tasks of Information Retrieval and can be applied to any form of data, whether generated from the web, databases, etc. As the size of corpora increases, indexing becomes too time consuming and labor intensive; hence the introduction of computer-aided indexers. This paper reviews indexing techniques, covering both human and automatic indexing. It gives an outline of the use of automatic indexing by discussing various hashing techniques, including fuzzy fingerprinting and locality-sensitive hashing. Two different processes of matching that are used in automatic subject indexing are also reviewed. Accepting the need for automatic indexing as a possible replacement for manual indexing, studies in the development of automatic indexing tools must continue