    Cluster Based Term Weighting Model for Web Document Clustering

    A term's weight is based on the frequency with which the term appears in a document. A term weighting scheme measures the importance of a term with respect to a document and a collection. A term with a higher weight is more important than a term with a lower weight. A document ranking model uses these term weights to rank documents in a collection. We propose a cluster-based term weighting model based on the TF-IDF model. This model updates the inter-cluster and intra-cluster frequency components, using the generated clusters as a reference to improve the retrieval of relevant documents. These inter-cluster and intra-cluster frequency components are used to weight the importance of a term, in addition to the term and document frequency components.
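
The abstract does not give the weighting formula, so the following is only a minimal sketch of how TF-IDF might be extended with cluster-level components. The function name `cluster_tf_idf`, the definition of the intra-cluster factor (share of documents in the document's own cluster that contain the term), and the inter-cluster factor (a smoothed inverse count of clusters containing the term) are all assumptions made for illustration, not the authors' model.

```python
import math
from collections import Counter, defaultdict


def cluster_tf_idf(docs, clusters):
    """Weight terms by TF-IDF times two hypothetical cluster factors.

    docs     -- list of token lists, one per document
    clusters -- list of cluster labels, one per document (assumed given)
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    cluster_sizes = Counter(clusters)
    n_clusters = len(cluster_sizes)

    # Document frequency over the whole collection.
    df = Counter()
    # Document frequency inside each cluster: (term, label) -> count.
    cluster_df = Counter()
    # Set of clusters in which each term occurs at least once.
    term_clusters = defaultdict(set)
    for tokens, label in zip(docs, clusters):
        for term in set(tokens):
            df[term] += 1
            cluster_df[(term, label)] += 1
            term_clusters[term].add(label)

    weights = []
    for tokens, label in zip(docs, clusters):
        tf = Counter(tokens)
        doc_weights = {}
        for term, count in tf.items():
            # Smoothed inverse document frequency (assumed form).
            idf = math.log(n_docs / df[term]) + 1.0
            # Assumed intra-cluster factor: how typical the term is
            # of the document's own cluster.
            intra = cluster_df[(term, label)] / cluster_sizes[label]
            # Assumed inter-cluster factor: terms spread over many
            # clusters discriminate less between clusters.
            inter = math.log(n_clusters / len(term_clusters[term])) + 1.0
            doc_weights[term] = count * idf * intra * inter
        weights.append(doc_weights)
    return weights


if __name__ == "__main__":
    docs = [["apple", "fruit"], ["apple", "pie"],
            ["car", "engine"], ["car", "fruit"]]
    clusters = [0, 0, 1, 1]
    for doc_weights in cluster_tf_idf(docs, clusters):
        print(doc_weights)
```

In this toy collection, "apple" occurs in every document of cluster 0 and in no other cluster, so under these assumed factors it outweighs "fruit", which is spread across both clusters.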

    A systematic approach to normalization in probabilistic models

    Open access funding provided by Austrian Science Fund (FWF). This research was partly supported by the Austrian Science Fund (FWF) Project Number P25905-N23 (ADmIRE). This work has been supported by the Self-Optimizer project (FFG 852624) in the EUROSTARS programme, funded by EUREKA, the BMWFW and the European Union.

    Evaluation Problems in Interactive Information Retrieval

    Interactive retrieval procedures are normally based on rapidly accessible files. Special storage organizations and file search techniques are used, and the system user is made to fulfill an important role during the retrieval process. In the present study, the interactive retrieval environment is briefly examined. The special problems which arise in the evaluation of interactive retrieval are then discussed, and methods are described for evaluating partial file searches and user feedback techniques. Evaluation results obtained with the SMART system are presented.

    Experiments in Multi-Lingual Information Retrieval

    A comparison was made of the performance, in an automatic information retrieval environment, of user queries and document abstracts available in natural language form in both English and French. The results obtained indicate that the automatic indexing and retrieval techniques actually used appear equally effective in handling the query and document texts in both languages.

    Automatic text processing: The transformation, analysis, and retrieval of Information by computer

    Massachusetts; xiii, 530 p.: illus.; 23 c

    A Comparison of Term Value Measurements for Automatic Indexing

    A number of automatic indexing theories have been proposed over the last few years, leading to the assignment of significance values to linguistic entities in accordance with their importance for purposes of content representation. Among these are methodologies based on decision theory, information theory, communication theory, vector space transformation and others. An attempt is made to compare these theories by exhibiting the formal frequency characteristics which underlie them. The effectiveness of the various approaches is also evaluated in experimental situations by using collections of documents in the areas of aerodynamics, medicine and world affairs.

    Search and Retrieval Experiments in Real-Time Information Retrieval

    Future operating document retrieval systems may be based on fully-automatic information analysis methods instead of manual indexing, and on real-time search procedures which allow the user to interact with the system during the search process. Performance characteristics are first given for fully-automatic information retrieval systems, and comparisons are made with presently operating partly-manual systems. Thereafter, various user-controlled search strategies are described, and the potential of these strategies for improving system performance is discussed. The evaluation results for the real-time retrieval procedures are used to derive design criteria for future automatic information systems.