226 research outputs found

    A signature-based indexing method for efficient content-based retrieval of relative temporal patterns

    A database/knowledge structure for a robotics vision system

    Desirable properties of robotics vision database systems are given, and structures which possess properties appropriate for some aspects of such database systems are examined. Included in the structures discussed is a family of networks in which link membership is determined by measures of proximity between pairs of the entities stored in the database. This type of network is shown to have properties which guarantee that the search for a matching feature vector is monotonic. That is, the database can be searched with no backtracking, if there is a feature vector in the database which matches the feature vector of the external entity which is to be identified. The construction of the database is discussed, and the search procedure is presented. A section on the support provided by the database for description of the decision-making processes and the search path is also included

    A Framework for Index Bulk Loading and Dynamization

    In this paper we investigate automated methods for externalizing internal memory data structures. We consider a class of balanced trees that we call weight-balanced partitioning trees (or wp-trees) for indexing a set of points in R^d. Well-known examples of wp-trees include kd-trees, BBD-trees, pseudo quad trees, and BAR trees. These trees are defined with fixed degree and are thus suited for internal memory implementations. Given an efficient wp-tree construction algorithm, we present a general framework for automatically obtaining a new dynamic external data structure. Using this framework together with a new general construction (bulk loading) technique of independent interest, we obtain data structures with guaranteed good update performance in terms of I/O transfers. Our approach gives considerably improved construction and update I/O bounds for, e.g., kd-trees and BBD-trees

    Multidimensional bisection: a dual viewpoint

    This paper provides an alternative viewpoint of the multidimensional bisection global optimization methods of Wood. A dual coordinate representation of convex bodies is introduced which leads to an easy implementation and eliminates the need to see the geometry of intersecting simplexes. Although developed in the context of global optimization, the techniques deal more generally with regions represented as the union of convex bodies. With this dual framework the algorithm can be implemented efficiently using any multiattribute index data structure that allows for quick range queries. A C version using a “multi-key double linked skip list” based on Pugh's skip list has been implemented

    Advance of the Access Methods

    The goal of this paper is to outline the advances in access methods over the last ten years and to review all of the methods available in the accessible literature

    Signature Files: An Integrated Access Method for Formatted and Unformatted Databases

    The signature file approach is one of the most powerful information storage and retrieval techniques used for finding the data objects that are relevant to user queries. The main idea of all signature-based schemes is to reflect the essence of the data items in bit patterns (descriptors or signatures) and store them in a separate file, which acts as a filter to eliminate the non-qualifying data items for an information request. It provides an integrated access method for both formatted and unformatted databases. A comparative overview and discussion of the proposed signature generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single and multiterm query cases, serial and parallel architectures, and static and dynamic environments are provided, with a special emphasis on multimedia databases, where the pioneering prototype systems using signatures yield highly encouraging results
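The filtering idea in the abstract above can be illustrated with a minimal sketch of superimposed coding, one common way of generating signatures. The signature width, the number of bits set per word, the SHA-256 hash, and all function names here are assumptions chosen for illustration; they are not taken from the paper.

```python
import hashlib
from typing import List

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_WORD = 3   # bits set per word (assumed value)


def word_signature(word: str) -> int:
    """Hash one word to a bit pattern with up to BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def record_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all words in a record."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def search(records: List[str], signatures: List[int], query: str) -> List[str]:
    """Filter by signature first, then verify candidates against the text."""
    q = record_signature(query)
    terms = [t.lower() for t in query.split()]
    hits = []
    for rec, sig in zip(records, signatures):
        if sig & q != q:
            continue  # some query bit is missing: the record cannot qualify
        words = {w.lower() for w in rec.split()}
        if all(t in words for t in terms):  # discard false drops
            hits.append(rec)
    return hits


records = [
    "signature files filter text",
    "skip list range query",
    "nearest neighbor search",
]
signatures = [record_signature(r) for r in records]
print(search(records, signatures, "range query"))
```

The signature test can pass for a record that does not contain the query terms (a false drop), which is why candidates are checked against the actual text. It never rejects a record that does contain them, because every bit of a word's signature is also set in the signature of its record.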

    High-dimensional indexing methods utilizing clustering and dimensionality reduction

    The emergence of novel database applications has resulted in the prevalence of a new paradigm for similarity search. These applications include multimedia databases, medical imaging databases, time series databases, DNA and protein sequence databases, and many others. Features of data objects are extracted and transformed into high-dimensional data points. Searching for objects becomes a search on points in the high-dimensional feature space. The dissimilarity between two objects is determined by the distance between two feature vectors. Similarity search is usually implemented as nearest neighbor search in feature vector spaces. The cost of processing k-nearest neighbor (k-NN) queries via a sequential scan increases as the number of objects and the number of features increase. A variety of multi-dimensional index structures have been proposed to improve the efficiency of k-NN query processing, which work well in low-dimensional space but lose their efficiency in high-dimensional space due to the curse of dimensionality. This inefficiency is addressed in this study with Clustering and Singular Value Decomposition (CSVD) with indexing, the Persistent Main Memory (PMM) index, and the Stepwise Dimensionality Increasing (SDI-tree) index. CSVD is an approximate nearest neighbor search method. The performance of CSVD with indexing is studied and the approximation to the distance in original space is investigated. For a given Normalized Mean Square Error (NMSE), the higher the degree of clustering, the higher the recall. However, more clusters require more disk page accesses. A certain number of clusters can be obtained to achieve a higher recall while maintaining a relatively lower query processing cost. 
The Clustering and Indexing using Persistent Main Memory (CIPMM) framework is motivated by the following considerations: (a) a significant fraction of index pages are accessed randomly, incurring a high positioning time for each access; (b) disk transfer rate is improving 40% annually, while the improvement in positioning time is only 8%; (c) query processing incurs less CPU time for main memory resident than disk resident indices. CIPMM aims at reducing the elapsed time for query processing by utilizing sequential, rather than random, disk accesses. A specific instance of the CIPMM framework, CIPOP (indexing using the Persistent Ordered Partition tree, or OP-tree), is elaborated and compared with CISR, clustering and indexing using the SR-tree. The results show that CIPOP outperforms CISR, and the higher the dimensionality, the higher the performance gains. The SDI-tree index is motivated by two observations: fanout decreases as dimensionality increases, and shorter vectors reduce cache misses. The index is built by using feature vectors transformed via principal component analysis, resulting in a structure with fewer dimensions at higher levels and increasing the number of dimensions from one level to the other. Dimensions are retained in nonincreasing order of their variance according to a parameter p, which specifies the incremental fraction of variance at each level of the index. Experiments on three datasets have shown that SDI-trees with carefully tuned parameters incur fewer disk accesses than SR-trees and VAMSR-trees and, in addition, incur less CPU time than VA-Files
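For reference, the sequential-scan baseline that the abstract above contrasts with index structures can be sketched as follows. This is only an illustration of a brute-force k-NN query under Euclidean distance; the function name `knn_scan` and the sample points are hypothetical and do not come from the thesis.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def knn_scan(
    points: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> List[Tuple[float, int]]:
    """Return the k nearest points as (distance, index) pairs, closest first.

    Every point is visited once, so the cost grows with both the number
    of points and the number of features per point.
    """
    scored = ((math.dist(p, query), i) for i, p in enumerate(points))
    return heapq.nsmallest(k, scored)


points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
print(knn_scan(points, (0.0, 0.0), 2))  # [(0.0, 0), (0.5, 3)]
```

An index structure tries to return the same answer (or, for approximate methods such as CSVD, a close one) while examining far fewer points than this full scan.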

    Directed expected utility networks

    A variety of statistical graphical models have been defined to represent the conditional independences underlying a random vector of interest. Similarly, many different graphs embedding various types of preferential independences, such as, for example, conditional utility independence and generalized additive independence, have more recently started to appear. In this paper, we define a new graphical model, called a directed expected utility network, whose edges depict both probabilistic and utility conditional independences. These embed a very flexible class of utility models, much larger than those usually conceived in standard influence diagrams. Our graphical representation and various transformations of the original graph into a tree structure are then used to guide fast routines for the computation of a decision problem’s expected utilities. We show that our routines generalize those usually utilized in standard influence diagrams’ evaluations under much more restrictive conditions. We then proceed with the construction of a directed expected utility network to support decision makers in the domain of household food security