On the selection of secondary indices in relational databases
An important problem in the physical design of databases is the selection of secondary indices. In general, this problem cannot be solved optimally because of the complexity of the selection process. Heuristics such as the well-known ADD and DROP algorithms are often used instead. This paper shows that frequently used cost functions can be classified as super- or submodular functions. For these functions, several mathematical properties have been derived which reduce the complexity of the index selection problem. These properties will be used to develop a tool for physical database design, and they also give a mathematical foundation for the success of the aforementioned ADD and DROP algorithms.
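As a rough illustration of the ADD heuristic named in this abstract, the sketch below assumes the usual greedy formulation: start with no indices and repeatedly add the candidate that lowers total cost the most, stopping when no candidate helps. The function names, the candidate names, and the toy cost function are hypothetical and are not taken from the paper; the toy cost is additive, a simple special case that is both super- and submodular.

```python
from typing import Callable, FrozenSet, Iterable

# Hypothetical toy numbers: workload cost saved by each candidate index,
# and a flat maintenance cost charged per index that is kept.
BENEFIT = {"a": 40, "b": 25, "c": 5}
MAINTENANCE = 10
BASE_COST = 100


def toy_cost(indices: FrozenSet[str]) -> float:
    """Total cost of a workload under a given index set (illustrative only)."""
    saved = sum(BENEFIT[i] for i in indices)
    return BASE_COST - saved + MAINTENANCE * len(indices)


def add_heuristic(
    candidates: Iterable[str],
    cost: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy ADD: grow the index set while some addition lowers the cost."""
    chosen: FrozenSet[str] = frozenset()
    remaining = set(candidates)
    current = cost(chosen)
    while remaining:
        # Sort so that ties are broken deterministically by name.
        best = min(sorted(remaining), key=lambda c: cost(chosen | {c}))
        best_cost = cost(chosen | {best})
        if best_cost >= current:
            break  # no single addition improves the cost any further
        chosen = chosen | {best}
        remaining.remove(best)
        current = best_cost
    return chosen


if __name__ == "__main__":
    print(sorted(add_heuristic(["a", "b", "c"], toy_cost)))
```

The DROP heuristic would run the same loop in reverse, starting from all candidates and removing the index whose removal lowers the cost most.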
A Genetic Algorithm for solving the Discrete Ordered Median Problem with Induced Order
The Discrete Ordered Median Problem with Induced Order (DOMP+IO) is a multi-facility
version of the classical discrete ordered median problem (DOMP), which has been widely studied. Several
exact methods have been proposed to solve the DOMP; however, these methods could only solve
small-scale problems, which are far from real-life problems. In this work, a DOMP+IO with two kinds
of facilities is considered and a heuristic method is proposed to solve it. The proposed procedure
is based on a genetic algorithm, and the preliminary results show its efficiency and its capability to obtain
good solutions for large-scale problems.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
Secondary Indexing in One Dimension: Beyond B-trees and Bitmap Indexes
Let S be a finite, ordered alphabet, and let x = x_1 x_2 ... x_n be a string
over S. A "secondary index" for x answers alphabet range queries of the form:
Given a range [a_l,a_r] over S, return the set I_{[a_l;a_r]} = {i |x_i \in
[a_l; a_r]}. Secondary indexes are heavily used in relational databases and
scientific data analysis. It is well-known that the obvious solution, storing a
dictionary for the position set associated with each character, does not always
give optimal query time. In this paper we give the first theoretically optimal
data structure for the secondary indexing problem. In the I/O model, the amount
of data read when answering a query is within a constant factor of the minimum
space needed to represent I_{[a_l;a_r]}, assuming that the size of internal
memory is (|S| log n)^{delta} blocks, for some constant delta > 0. The space
usage of the data structure is O(n log |S|) bits in the worst case, and we
further show how to bound the size of the data structure in terms of the 0-th
order entropy of x. We show how to support updates achieving various time-space
trade-offs.
We also consider an approximate version of the basic secondary indexing
problem where a query reports a superset of I_{[a_l;a_r]} containing each
element not in I_{[a_l;a_r]} with probability at most epsilon, where epsilon >
0 is the false positive probability. For this problem the amount of data that
needs to be read by the query algorithm is reduced to O(|I_{[a_l;a_r]}|
log(1/epsilon)) bits.
Comment: 16 page
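To make the query in this abstract concrete, the following sketch implements only the "obvious solution" it mentions: a dictionary that maps each character to its sorted list of positions. It is the baseline, not the paper's optimal data structure, and the function names are hypothetical. Positions are 1-based to match x = x_1 x_2 ... x_n.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, List


def build_index(x: str) -> Dict[str, List[int]]:
    """Map each character of x to the increasing list of its 1-based positions."""
    positions: Dict[str, List[int]] = defaultdict(list)
    for i, ch in enumerate(x, start=1):
        positions[ch].append(i)
    return dict(positions)


def range_query(index: Dict[str, List[int]], a_l: str, a_r: str) -> List[int]:
    """Return all positions i with a_l <= x_i <= a_r, in increasing order."""
    keys = sorted(index)            # characters that actually occur in x
    lo = bisect_left(keys, a_l)     # first character >= a_l
    hi = bisect_right(keys, a_r)    # one past the last character <= a_r
    result: List[int] = []
    for ch in keys[lo:hi]:
        result.extend(index[ch])
    return sorted(result)


if __name__ == "__main__":
    idx = build_index("banana")
    print(range_query(idx, "a", "b"))
```

A query over a wide range has to touch one position list per character in the range and merge them, which is one way to see why this baseline does not always give optimal query time.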
An Analysis on Selection for High-Resolution Approximations in Many-Objective Optimization
This work studies the behavior of three elitist multi- and many-objective
evolutionary algorithms generating a high-resolution approximation of the
Pareto optimal set. Several search-assessment indicators are defined to trace
the dynamics of survival selection and measure the ability to simultaneously
keep optimal solutions and discover new ones under different population sizes,
set as a fraction of the size of the Pareto optimal set.
Comment: appears in Parallel Problem Solving from Nature - PPSN XIII, Ljubljana, Slovenia (2014)
The financial stress index: identification of systemic risk conditions
This paper develops a financial stress index for the United States, the Cleveland Financial Stress Index (CFSI), which provides a continuous signal of financial stress and broad coverage of the areas that could indicate it. The index is based on daily public-market data collected from four sectors of the financial markets: the credit, foreign exchange, equity, and interbank markets. A dynamic weighting method is employed to capture changes in the relative importance of these four sectors as they occur. In addition, the design of the index allows the origin of the stress to be identified. We compare the CFSI to alternative indexes, using a detailed benchmarking methodology, and show how the CFSI can be applied to systemic stress monitoring and early warning system design. To that end, we investigate alternative stress-signaling thresholds and frequency regimes and then establish optimal frequencies for filtering out market noise and idiosyncratic episodes. Finally, we quantify a powerful CFSI-based rating system that assigns a probability of systemic stress to ranges of CFSI outcomes.
Systemic risk; Risk assessment
CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads
Index tuning, i.e., selecting the indexes appropriate for a workload, is a
crucial problem in database system tuning. In this paper, we solve index tuning
for large problem instances that are common in practice, e.g., thousands of
queries in the workload, thousands of candidate indexes and several hard and
soft constraints. Our work is the first to reveal that the index tuning problem
has a well structured space of solutions, and this space can be explored
efficiently with well known techniques from linear optimization. Experimental
results demonstrate that our approach outperforms state-of-the-art commercial
and research techniques by a significant margin (up to an order of magnitude).
Comment: VLDB201