61,851 research outputs found

    On the selection of secondary indices in relational databases

    An important problem in the physical design of databases is the selection of secondary indices. In general, this problem cannot be solved optimally because of the complexity of the selection process, so heuristics such as the well-known ADD and DROP algorithms are often used. This paper shows that frequently used cost functions can be classified as super- or submodular functions. For these functions, several mathematical properties have been derived that reduce the complexity of the index selection problem. These properties are used to develop a tool for physical database design, and they also give a mathematical foundation for the success of the aforementioned ADD and DROP algorithms.
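
As an illustration of the ADD heuristic named in this abstract, here is a minimal sketch of a greedy ADD loop: start with no indices and repeatedly add the candidate that lowers total cost the most, stopping when no candidate helps. The names `add_heuristic`, `toy_cost`, `QUERY_SAVINGS`, and `MAINTENANCE` are hypothetical and are not taken from the paper. The toy cost function is a simple additive one chosen only to make the example runnable.

```python
from typing import Callable, FrozenSet, Iterable


def add_heuristic(candidates: Iterable[str],
                  cost: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Greedy ADD: grow the index set while each addition lowers the cost."""
    remaining = set(candidates)
    chosen: FrozenSet[str] = frozenset()
    current = cost(chosen)
    while remaining:
        # Pick the candidate whose addition gives the lowest total cost.
        # sorted() makes tie-breaking deterministic.
        best = min(sorted(remaining), key=lambda c: cost(chosen | {c}))
        best_cost = cost(chosen | {best})
        if best_cost >= current:
            break  # no remaining candidate improves the cost
        chosen = chosen | {best}
        current = best_cost
        remaining.remove(best)
    return chosen


# Hypothetical toy workload: each index saves some query cost but
# adds a fixed maintenance cost (assumed values, for illustration only).
QUERY_SAVINGS = {"idx_a": 10.0, "idx_b": 4.0, "idx_c": 1.0}
MAINTENANCE = 3.0


def toy_cost(indices: FrozenSet[str]) -> float:
    base = 20.0  # cost of running the workload with no indices
    return base - sum(QUERY_SAVINGS[i] for i in indices) + MAINTENANCE * len(indices)


selected = add_heuristic(QUERY_SAVINGS, toy_cost)
```

With these assumed numbers the loop adds `idx_a`, then `idx_b`, and stops because adding `idx_c` would cost more in maintenance than it saves. The DROP heuristic runs the same idea in reverse, starting from all candidates and removing one at a time.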

    A Genetic Algorithm for solving the Discrete Ordered Median Problem with Induced Order

    The Discrete Ordered Median Problem with Induced Order (DOMP+IO) is a multi-facility version of the classical discrete ordered median problem (DOMP), which has been widely studied. Several exact methods have been proposed to solve the DOMP; however, these methods can only solve small-scale problems, which are far from real-life problems. In this work, a DOMP+IO with two kinds of facilities is considered and a heuristic method is proposed to solve it. The proposed procedure is based on a genetic algorithm, and the preliminary results show its efficiency and its capability to obtain good solutions for large-scale problems. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

    Secondary Indexing in One Dimension: Beyond B-trees and Bitmap Indexes

    Let S be a finite, ordered alphabet, and let x = x_1 x_2 ... x_n be a string over S. A "secondary index" for x answers alphabet range queries of the form: Given a range [a_l, a_r] over S, return the set I_{[a_l, a_r]} = {i | x_i \in [a_l, a_r]}. Secondary indexes are heavily used in relational databases and scientific data analysis. It is well-known that the obvious solution, storing a dictionary for the position set associated with each character, does not always give optimal query time. In this paper we give the first theoretically optimal data structure for the secondary indexing problem. In the I/O model, the amount of data read when answering a query is within a constant factor of the minimum space needed to represent I_{[a_l, a_r]}, assuming that the size of internal memory is (|S| log n)^{delta} blocks, for some constant delta > 0. The space usage of the data structure is O(n log |S|) bits in the worst case, and we further show how to bound the size of the data structure in terms of the 0-th order entropy of x. We show how to support updates achieving various time-space trade-offs. We also consider an approximate version of the basic secondary indexing problem where a query reports a superset of I_{[a_l, a_r]} containing each element not in I_{[a_l, a_r]} with probability at most epsilon, where epsilon > 0 is the false positive probability. For this problem the amount of data that needs to be read by the query algorithm is reduced to O(|I_{[a_l, a_r]}| log(1/epsilon)) bits. Comment: 16 page
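
To make the problem statement in this abstract concrete, the following is a minimal sketch of the "obvious solution" it mentions: a dictionary that maps each character to its list of positions, with a range query that merges the lists for every character in [a_l, a_r]. This is the baseline, not the paper's optimal data structure. The function names `build_index` and `range_query` are hypothetical, and positions are 1-based to match x_1 ... x_n.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, List


def build_index(x: str) -> Dict[str, List[int]]:
    """Map each character of x to the sorted list of its 1-based positions."""
    positions: Dict[str, List[int]] = defaultdict(list)
    for i, ch in enumerate(x, start=1):
        positions[ch].append(i)
    return dict(positions)


def range_query(index: Dict[str, List[int]], a_l: str, a_r: str) -> List[int]:
    """Return all positions i with a_l <= x_i <= a_r, in increasing order."""
    keys = sorted(index)              # characters that occur in x, in alphabet order
    lo = bisect_left(keys, a_l)       # first key >= a_l
    hi = bisect_right(keys, a_r)      # one past the last key <= a_r
    result: List[int] = []
    for ch in keys[lo:hi]:
        result.extend(index[ch])
    return sorted(result)


index = build_index("banana")
hits = range_query(index, "a", "b")
```

For the string "banana", the query over ["a", "b"] collects the positions of "a" (2, 4, 6) and "b" (1). The work grows with the number of distinct characters in the range as well as with the output size, which is one reason this baseline does not always give optimal query time.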

    An Analysis on Selection for High-Resolution Approximations in Many-Objective Optimization

    This work studies the behavior of three elitist multi- and many-objective evolutionary algorithms generating a high-resolution approximation of the Pareto optimal set. Several search-assessment indicators are defined to trace the dynamics of survival selection and measure the ability to simultaneously keep optimal solutions and discover new ones under different population sizes, set as a fraction of the size of the Pareto optimal set. Comment: appears in Parallel Problem Solving from Nature - PPSN XIII, Ljubljana : Slovenia (2014

    The financial stress index: identification of systemic risk conditions

    This paper develops a financial stress index for the United States, the Cleveland Financial Stress Index (CFSI), which provides a continuous signal of financial stress and broad coverage of the areas that could indicate it. The index is based on daily public-market data collected from four sectors of the financial markets: the credit, foreign exchange, equity, and interbank markets. A dynamic weighting method is employed to capture changes in the relative importance of these four sectors as they occur. In addition, the design of the index allows the origin of the stress to be identified. We compare the CFSI to alternative indexes, using a detailed benchmarking methodology, and show how the CFSI can be applied to systemic stress monitoring and early warning system design. To that end, we investigate alternative stress-signaling thresholds and frequency regimes and then establish optimal frequencies for filtering out market noise and idiosyncratic episodes. Finally, we quantify a powerful CFSI-based rating system that assigns a probability of systemic stress to ranges of CFSI outcomes. Systemic risk; Risk assessment

    CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads

    Index tuning, i.e., selecting the indexes appropriate for a workload, is a crucial problem in database system tuning. In this paper, we solve index tuning for large problem instances that are common in practice, e.g., thousands of queries in the workload, thousands of candidate indexes and several hard and soft constraints. Our work is the first to reveal that the index tuning problem has a well-structured space of solutions, and this space can be explored efficiently with well-known techniques from linear optimization. Experimental results demonstrate that our approach outperforms state-of-the-art commercial and research techniques by a significant margin (up to an order of magnitude). Comment: VLDB201