4 research outputs found

    Signature file access methodologies for text retrieval: a literature review with additional test cases

    Signature files are extremely compressed versions of text files which can be used as access or index files to facilitate searching documents for text strings. These access files, or signatures, are generated by storing hashed codes for individual words. Because the hashing or storing process can generate similar codes for different words, the primary concern in researching signature files is the accuracy of retrieval. Inaccuracy always takes the form of a false drop: a signature signals the presence of a text string that is not actually there. Two suggested ways to alter false drop rates are: 1) to determine if either of the two methodologies for storing hashed codes, superimposing them or concatenating them, is more efficient; and 2) to determine if a particular hashing algorithm has any impact. To assess these issues, the history of superimposed coding is traced from its development as a tool for compressing information onto punched cards in the 1950s to its incorporation into proposed signature file methodologies in the mid-1980s. Likewise, the concept of compressing individual words by various algorithms, or by hashing them, is traced through the research literature. Following this literature review, benchmark trials are performed using both superimposed and concatenated methodologies while varying hashing algorithms. It is determined that while one combination of hashing algorithm and storage methodology is better, all signature file methods can be considered viable.
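
The difference between the two storage methodologies named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the 64-bit signature width, the three bits set per word, the 8-bit word codes, and the use of SHA-256 as the hash are all assumptions chosen for illustration, and the hashing algorithms actually benchmarked are not specified here.

```python
import hashlib


def word_hash(word: str, salt: int = 0) -> int:
    """Hash a word to an integer. SHA-256 is an assumed stand-in hash."""
    data = f"{salt}:{word.lower()}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def superimposed_signature(words, width=64, bits_per_word=3):
    """OR the bit patterns of all words into one fixed-width signature."""
    signature = 0
    for word in words:
        for k in range(bits_per_word):
            signature |= 1 << (word_hash(word, k) % width)
    return signature


def concatenated_signature(words, code_bits=8):
    """Store one short hashed code per word, in order."""
    return [word_hash(word) % (1 << code_bits) for word in words]


def maybe_in_superimposed(signature, word, width=64, bits_per_word=3):
    """True if every bit of the query word is set (may be a false drop)."""
    query = superimposed_signature([word], width, bits_per_word)
    return signature & query == query


def maybe_in_concatenated(codes, word, code_bits=8):
    """True if the query word's code appears (may be a false drop)."""
    return word_hash(word) % (1 << code_bits) in codes


block = ["signature", "files", "compress", "text"]
sup = superimposed_signature(block)
cat = concatenated_signature(block)

# Words that are present are always reported as present.
print(all(maybe_in_superimposed(sup, w) for w in block))
print(all(maybe_in_concatenated(cat, w) for w in block))
```

Under either scheme a word that is present is never missed. A word that is absent can still match when its bits or code collide with those of other words, which is the false drop the abstract describes, so a positive match has to be confirmed against the original text.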

    Economic data bank management in a developing nation

    This dissertation describes the results of a research project undertaken at Loughborough University of Technology. The basic objectives of the research project were: (1) to investigate the management elements required for organising the development of an Economic Data Bank (EDB), with particular emphasis on the requirements of a developing nation; (2) to investigate the sociological, political and technical implications associated with organising the development of an EDB in a developing nation. A theoretical framework was established for this study. This was done after an extensive search and review of the literature in the areas of data and data base management systems, management information systems, and computer technology in general. [Continues.]