4,691 research outputs found

Using correlation matrix memories for inferencing in expert systems

Authors: Austin J, Filer R
Publication venue: 'Royal College of Obstetricians & Gynaecologists (RCOG)'
Publication date: 01/01/1996

Outline of The Chapter… Section 16.2 describes CMM and the Dynamic Variable Binding Problem. Section 16.3 deals with how CMM is used as part of an inferencing engine. Section 16.4 details the important performance characteristics of CMM

White Rose Research Online

Distributed associative memories for high-speed symbolic reasoning

Author: Austin J
Publication venue: 'Elsevier BV'
Publication date: 01/09/1996

This paper briefly introduces a novel symbolic reasoning system based upon distributed associative memories which are constructed from correlation matrix memories (CMM). The system is aimed at high-speed rule-based symbolic operations. It has the advantage of very fast rule matching without the long training times normally associated with neural-network-based symbolic manipulation systems. In particular, the network is able to perform partial matching on symbolic information at high speed. As such, the system is aimed at the practical use of neural networks in high-speed reasoning systems. The paper describes the advantages and disadvantages of using CMM and shows how the approach overcomes those disadvantages. It then briefly describes a system incorporating CMM

White Rose Research Online
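
The abstract above does not spell out how a correlation matrix memory works, so the following is only a minimal sketch under stated assumptions: a binary matrix trained in one pass by OR-ing outer products, recalled by summing the rows selected by the input and thresholding the sums. The class name `BinaryCMM`, the vector sizes, and the thresholding rule are hypothetical choices for illustration, not the authors' system.

```python
class BinaryCMM:
    """Binary correlation matrix memory (illustrative sketch, not the paper's code)."""

    def __init__(self, n_in, n_out):
        self.n_in = n_in
        self.n_out = n_out
        # weights[i][j] is 1 if input bit i and output bit j were ever on together.
        self.weights = [[0] * n_out for _ in range(n_in)]

    def store(self, x, y):
        # One-shot training: OR the outer product of x and y into the matrix.
        for i, xi in enumerate(x):
            if xi:
                row = self.weights[i]
                for j, yj in enumerate(y):
                    if yj:
                        row[j] = 1

    def recall(self, x, threshold=None):
        # Sum the rows selected by the on-bits of x, then threshold the sums.
        sums = [0] * self.n_out
        for i, xi in enumerate(x):
            if xi:
                for j, w in enumerate(self.weights[i]):
                    sums[j] += w
        if threshold is None:
            # Assumed default: every on-bit of the input must agree (exact match).
            threshold = sum(x)
        return [1 if s >= threshold else 0 for s in sums]


# Two hypothetical rules: antecedent pattern -> consequent pattern.
rule_a = ([1, 1, 0, 0, 0, 0], [1, 0, 0, 1])
rule_b = ([0, 0, 1, 1, 0, 0], [0, 1, 1, 0])

memory = BinaryCMM(n_in=6, n_out=4)
memory.store(*rule_a)
memory.store(*rule_b)

exact = memory.recall(rule_a[0])
# Partial match: only one antecedent bit is known, so the threshold is lowered.
partial = memory.recall([1, 0, 0, 0, 0, 0], threshold=1)
```

Lowering the threshold is what gives the partial matching mentioned in the abstract: an input that shares only some on-bits with a stored antecedent can still recall its consequent, at the cost of more spurious recalls as the matrix fills up.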

Signature Files: An Integrated Access Method for Formatted and Unformatted Databases

Authors: Aktug Deniz, Can Fazli
Publication date: 01/05/1993

The signature file approach is one of the most powerful information storage and retrieval techniques used for finding the data objects that are relevant to user queries. The main idea of all signature-based schemes is to reflect the essence of the data items in bit patterns (descriptors or signatures) and store them in a separate file, which acts as a filter to eliminate the non-qualifying data items for an information request. It provides an integrated access method for both formatted and unformatted databases. A comparative overview and discussion of the proposed signature generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single and multiterm query cases, serial and parallel architectures, and static and dynamic environments are provided, with a special emphasis on multimedia databases, where the pioneering prototype systems using signatures yield highly encouraging results

Scholarly Commons @ MiamiOH (Miami University)

An evaluation of standard retrieval algorithms and a binary neural approach

Authors: Austin J., Hodge V.J.
Publication venue: 'Elsevier BV'
Publication date: 01/04/2001

In this paper we evaluate a selection of data retrieval algorithms for storage efficiency, retrieval speed and partial matching capabilities using a large Information Retrieval dataset. We evaluate standard data structures, for example inverted file lists and hash tables, but also a novel binary neural network that incorporates: single-epoch training, superimposed coding and associative matching in a binary matrix data structure. We identify the strengths and weaknesses of the approaches. From our evaluation, the novel neural network approach is superior with respect to training speed and partial match retrieval time. From the results, we make recommendations for the appropriate usage of the novel neural approach. (C) 2001 Elsevier Science Ltd. All rights reserved

White Rose Research Online

Signature file access methodologies for text retrieval: a literature review with additional test cases

Author: Caviglia Karen
Publication venue: RIT Scholar Works
Publication date: 01/01/1987

Signature files are extremely compressed versions of text files which can be used as access or index files to facilitate searching documents for text strings. These access files, or signatures, are generated by storing hashed codes for individual words. Given the possible generation of similar codes in the hashing or storing process, the primary concern in researching signature files is to determine the accuracy of retrieving information. Inaccuracy is always represented by the false signaling of the presence of a text string. Two suggested ways to alter false drop rates are: 1) to determine if either of the two methodologies for storing hashed codes, by superimposing them or by concatenating them, is more efficient; and 2) to determine if a particular hashing algorithm has any impact. To assess these issues, the history of superimposed coding is traced from its development as a tool for compressing information onto punched cards in the 1950s to its incorporation into proposed signature file methodologies in the mid-1980s. Likewise, the concept of compressing individual words by various algorithms, or by hashing them, is traced through the research literature. Following this literature review, benchmark trials are performed using both superimposed and concatenated methodologies while varying hashing algorithms. It is determined that while one combination of hashing algorithm and storage methodology is better, all signature file methods can be considered viable

RIT Scholar Works
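
As a rough illustration of the superimposed methodology the abstract above describes, the sketch below hashes each word to a few bit positions, ORs the word codes into one block signature, and uses that signature as a filter before checking the text itself. The signature width, the number of bits per word, the SHA-256-based hash, and all function names are assumptions made for this example; the thesis compares several hashing algorithms that are not reproduced here.

```python
import hashlib


def word_signature(word, width=64, bits_per_word=3):
    """Hash one word to an int with at most bits_per_word of width bits set."""
    sig = 0
    for k in range(bits_per_word):
        digest = hashlib.sha256(f"{k}:{word.lower()}".encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % width
        sig |= 1 << position
    return sig


def block_signature(words, width=64, bits_per_word=3):
    """Superimpose (bitwise OR) the word signatures of one text block."""
    sig = 0
    for word in words:
        sig |= word_signature(word, width, bits_per_word)
    return sig


def may_contain(block_sig, word, width=64, bits_per_word=3):
    """True if every bit of the word's signature is set in the block signature."""
    query = word_signature(word, width, bits_per_word)
    return (block_sig & query) == query


def search(blocks, word):
    """Return (candidate block indexes, verified block indexes) for a word."""
    signatures = [block_signature(block) for block in blocks]
    candidates = [i for i, s in enumerate(signatures) if may_contain(s, word)]
    # Candidates that fail this check against the text are false drops.
    hits = [i for i in candidates
            if word.lower() in (w.lower() for w in blocks[i])]
    return candidates, hits


blocks = [
    ["signature", "files", "compress", "text"],
    ["hashing", "algorithms", "vary"],
    ["superimposed", "coding", "on", "punched", "cards"],
]
candidates, hits = search(blocks, "hashing")
```

The filter can never miss a block that really contains the word, because that word's bits were ORed into the block signature. It can, however, pass a block that does not contain it when other words happen to set the same bits, which is the false drop the abstract refers to.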

Analysis of Signature Generation Schemes for Multiterm Queries In Linear Hashing with Superimposed Signatures

Authors: Can Fazli, Ertugay Osman
Publication date: 01/12/1995

Signature files provide efficient retrieval of data by reflecting the essence of the data objects into bit patterns. Our analysis explores the performance of three superimposed signature generation schemes as they are applied to a dynamic signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures (LHSS). The first scheme (SM) allows all terms to set the same number of bits, whereas the second and third schemes (MMS and MMM) emphasize the terms with high discriminatory power. In addition, MMM considers the probability distribution of the number of query terms. The main contribution of the study is a detailed analysis of LHSS in multiterm query environments by incorporating the term discrimination values based on document and query frequencies. The approach of the study can also be extended to other signature file access methods based on partitioning. The derivation of the performance evaluation formulas, the simulation results based on these formulas for various experimental settings, and the implementation results based on INSPEC and NPL text databases are provided. Results indicate that MMM and MMS outperform SM in all cases in terms of access savings, especially when terms become more distinctive. MMM slightly outperforms MMS in high weight and low weight query cases. The performance gap among all three schemes decreases as the database size increases, and as the signature size increases the performances of MMM and MMS decrease and converge to that of the SM scheme when the hashing level is fixed

Scholarly Commons @ MiamiOH (Miami University)

Analysis of Multiterm Queries in Partitioned Signature File Environments

Author: Aktug Deniz
Publication date: 01/04/1993

The concern of this study is the signature files which are used for information storage and retrieval in both formatted and unformatted databases. The analysis combines the concerns of signature extraction and signature file organization, which have usually been treated as separate issues. Both the uniform frequency and single term query assumptions are relaxed and a comprehensive analysis is presented for multiterm query environments where terms can be classified based on their query and database occurrence frequencies. The performance of three superimposed signature generation schemes is explored as they are applied to a dynamic signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures (LHSS). The first scheme (SM) allows all terms to set the same number of bits regardless of their discriminatory power, whereas the second and third methods (MMS and MMM) emphasize the terms with high query and low database occurrence frequencies. Of these three schemes, only MMM takes the probability distribution of the number of query terms into account in finding the optimal mapping strategy. The main contribution of the study is the derivation of the performance evaluation formulas, which is provided together with the analysis of various experimental settings. Results indicate that MMM outperforms the other methods as the gap between the discriminatory power of the terms gets larger. The absolute value of the savings provided by MMM reaches a maximum for the high query weight case. However, the extra savings decline sharply for high weight and moderately for the low weight queries with the increase in database size. The applicability of the derivations to other partitioned signature organizations is discussed and a detailed analysis of Fixed Prefix Partitioning (FPP) is provided as an example. An approximate formula that is shown to estimate the performance of both FPP and LHSS within an acceptable margin of error is also modified to account for the multiterm case

Scholarly Commons @ MiamiOH (Miami University)

Vertical framing of superimposed signature files using partial evaluation of queries

Authors: Can F., Kocberber A. S.
Publication venue: 'Elsevier BV'
Publication date: 01/05/1997

A new signature file method, Multi-Frame Signature File (MFSF), is introduced by extending the bit-sliced signature file method. In MFSF a signature file is divided into variable sized vertical frames with different on-bit densities to optimize the response time using a partial query evaluation methodology. In query evaluation the on-bits of the lower on-bit density frames are used first. As the number of query terms increases, the number of query signature on-bits in the lower on-bit density frames increases and the query stopping condition is reached in fewer evaluation steps. Therefore, in MFSF, the query evaluation time decreases for increasing numbers of query terms. Under the sequentiality assumption of disk blocks, in a PC environment with 30 ms average disk seek time, MFSF provides a projected worst-case response time of 3.54 seconds for a database size of one million records in a uniform distribution multi-term query environment with 1-5 terms per query. Due to partial evaluation, this desired response time is guaranteed for queries with several terms. The comparison of MFSF with the inverted file approach shows that MFSF provides promising research opportunities. (C) 1997 Elsevier Science Ltd

Bilkent University Institutional Repository
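
The abstract above builds on the bit-sliced signature file method. The sketch below shows only that base method, under assumed parameters: signatures are stored column-wise as bit slices, and a query reads and ANDs just the slices for its own on-bits. The variable-sized frames, on-bit densities, and stopping condition of MFSF are not reproduced; the early exit here is a simple assumption, and `WIDTH`, the CRC32-based hash, and the function names are hypothetical.

```python
import zlib

WIDTH = 32  # signature width in bits (assumed for illustration)


def signature(terms, bits_per_term=2):
    """Superimposed signature of a list of terms, as an int of WIDTH bits."""
    sig = 0
    for term in terms:
        for k in range(bits_per_term):
            sig |= 1 << (zlib.crc32(f"{k}:{term}".encode("utf-8")) % WIDTH)
    return sig


def build_slices(records):
    """slices[b] has bit r set when record r's signature has bit b on."""
    slices = [0] * WIDTH
    for r, terms in enumerate(records):
        sig = signature(terms)
        for b in range(WIDTH):
            if (sig >> b) & 1:
                slices[b] |= 1 << r
    return slices


def candidate_records(slices, n_records, query_terms):
    """AND together only the slices that correspond to query on-bits."""
    query = signature(query_terms)
    result = (1 << n_records) - 1  # start with every record as a candidate
    for b in range(WIDTH):
        if (query >> b) & 1:
            result &= slices[b]
            if result == 0:
                break  # no record can match; remaining slices need not be read
    return [r for r in range(n_records) if (result >> r) & 1]


records = [
    ["signature", "file"],
    ["inverted", "index"],
    ["bit", "slice", "signature"],
]
slices = build_slices(records)
matches = candidate_records(slices, len(records), ["signature"])
```

The returned records are candidates only: any record that truly contains all query terms is always included, but false drops are possible and would still need to be checked against the stored data.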

Analysis of Signature Generation Schemes for Multiterm Queries In Partitioned Signature File Environments

Authors: Aktug Deniz, Can Fazli
Publication date: 01/05/1993

Our analysis explores the performance of three superimposed signature generation schemes as they are applied to a dynamic signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures (LHSS). The first scheme (SM) allows all terms to set the same number of bits, whereas the second and third methods (MMS and MMM) emphasize the terms with high discriminatory power. In addition, MMM considers the probability distribution of the number of query terms. The main contribution of the study is the combination of signature generation and signature file organization concepts together with the relaxation of the single term query and uniform frequency assumptions. The derivation of the performance evaluation formulas is provided, as well as the analysis of various experimental settings. Results indicate that MMM outperforms the others as terms become more distinctive in their discriminatory power. MMM accomplishes the highest savings in retrieval efficiency for the high query weight case. We also discuss the applicability of the derivations to other partitioned signature organizations, providing a detailed analysis of Fixed Prefix Partitioning (FPP) as an example. Finally, an approximate performance evaluation formula that works for both FPP and LHSS is modified to account for the multiterm case

Scholarly Commons @ MiamiOH (Miami University)

Indexing a Fuzzy Database Using the Technique of Superimposed Coding - Cost Models and Measurements

Authors: Boss Birgit, Helmer Sven
Publication date: 01/01/1996

Recently, new applications have emerged that require database management systems with uncertainty capabilities. Many of the existing approaches to modelling uncertainty in database management systems are based on the theory of fuzzy sets. High performance is a necessary precondition for the acceptance of such systems by end users. However, performance issues have been quite neglected in research on fuzzy database management systems so far. In this article they are addressed explicitly. We propose new index structures for fuzzy database management systems based on the well known technique of superimposed coding together with detailed cost models. The correctness of the cost models as well as the efficiency of the index structures proposed is validated by a number of measurements on experimental fuzzy databases

MAnnheim DOCument Server