
    Incremental Medians via Online Bidding

    In the k-median problem we are given sets of facilities and customers, and distances between them. For a given set F of facilities, the cost of serving a customer u is the minimum distance between u and a facility in F. The goal is to find a set F of k facilities that minimizes the sum, over all customers, of their service costs. Following Mettu and Plaxton, we study the incremental medians problem, where k is not known in advance, and the algorithm produces a nested sequence of facility sets in which the kth set has size k. The algorithm is c-cost-competitive if the cost of each set is at most c times the cost of the optimum set of size k. We give improved incremental algorithms for the metric version: an 8-cost-competitive deterministic algorithm, a 2e ~ 5.44-cost-competitive randomized algorithm, a (24+epsilon)-cost-competitive, poly-time deterministic algorithm, and a (6e+epsilon ~ 16.31)-cost-competitive, poly-time randomized algorithm. The algorithm is s-size-competitive if the cost of the kth set is at most the minimum cost of any set of size k, and its size is at most s·k. The optimal size-competitive ratios for this problem are 4 (deterministic) and e (randomized). We present the first poly-time O(log m)-size-approximation algorithm for the offline problem and the first poly-time O(log m)-size-competitive algorithm for the incremental problem. Our proofs reduce incremental medians to the following online bidding problem: faced with an unknown threshold T, an algorithm submits "bids" until it submits a bid that is at least the threshold. It pays the sum of all its bids. We prove that folklore algorithms for online bidding are optimally competitive.
    Comment: a conference version appeared in LATIN 2006 as "Oblivious Medians via Online Bidding".
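    The online bidding reduction can be made concrete with the folklore doubling strategy the abstract alludes to. The sketch below is illustrative only; the threshold value and starting bid are arbitrary choices, not taken from the paper.

```python
def doubling_bids(threshold, first_bid=1.0):
    """Folklore doubling strategy for online bidding (illustrative sketch).

    Keep doubling the bid until it reaches the unknown threshold T;
    the algorithm pays the sum of all bids it submitted. For T >= 1 the
    total paid is less than 4*T, i.e. the strategy is 4-competitive.
    """
    bids = []
    bid = first_bid
    while True:
        bids.append(bid)
        if bid >= threshold:
            break
        bid *= 2
    return bids, sum(bids)


if __name__ == "__main__":
    # Hypothetical threshold chosen only for illustration.
    bids, paid = doubling_bids(37)
    print(bids)             # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
    print(paid, paid / 37)  # 127.0 and a ratio of about 3.4, below 4
```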

    Stochastic Database Cracking: Towards Robust Adaptive Indexing in Main-Memory Column-Stores

    Modern business applications and scientific databases call for inherently dynamic data storage environments. Such environments are characterized by two challenging features: (a) they have little idle system time to devote to physical design; and (b) there is little, if any, a priori workload knowledge, while the query and data workload keeps changing dynamically. In such environments, traditional approaches to index building and maintenance cannot apply. Database cracking has been proposed as a solution that allows on-the-fly physical data reorganization as a collateral effect of query processing. Cracking aims to continuously and automatically adapt indexes to the workload at hand, without human intervention. Indexes are built incrementally, adaptively, and on demand. Nevertheless, as we show, existing adaptive indexing methods fail to deliver workload robustness; they perform much better with random workloads than with others. This frailty derives from the inelasticity with which these approaches interpret each query as a hint on how data should be stored. Current cracking schemes blindly reorganize the data within each query's range, even if that results in successive expensive operations with minimal indexing benefit. In this paper, we introduce stochastic cracking, a significantly more resilient approach to adaptive indexing. Stochastic cracking also uses each query as a hint on how to reorganize data, but not blindly so; it gains resilience and avoids performance bottlenecks by deliberately introducing certain arbitrary choices into its decision-making. Thereby, we bring adaptive indexing forward to a mature formulation that confers the workload robustness previous approaches lacked. Our extensive experimental study verifies that stochastic cracking maintains the desired properties of original database cracking while at the same time performing well with diverse realistic workloads.
    Comment: VLDB201
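    The idea of reorganizing around each query's bounds while injecting deliberately arbitrary choices can be sketched on a single in-memory column. The code below is a simplified illustration under assumed semantics (one integer column, crack-in-two only, one extra random crack per query); it is not the paper's DDC/DDR/MDD algorithms, and all names are made up.

```python
import random


def crack_in_two(col, lo, hi, pivot):
    """Partition col[lo:hi) so that values < pivot come first.
    Returns the split position (first index holding a value >= pivot)."""
    i, j = lo, hi - 1
    while i <= j:
        if col[i] < pivot:
            i += 1
        else:
            col[i], col[j] = col[j], col[i]
            j -= 1
    return i


class StochasticCracker:
    """Illustrative sketch of stochastic cracking on one integer column."""

    def __init__(self, values):
        self.col = list(values)
        self.index = {}  # cracker index: pivot value -> split position

    def _piece(self, value):
        """Bounds (lo, hi) of the unsorted piece that may contain `value`."""
        lo, hi = 0, len(self.col)
        for pivot, pos in self.index.items():
            if pivot <= value:
                lo = max(lo, pos)
            else:
                hi = min(hi, pos)
        return lo, hi

    def _crack(self, pivot):
        lo, hi = self._piece(pivot)
        if hi - lo > 1 and pivot not in self.index:
            self.index[pivot] = crack_in_two(self.col, lo, hi, pivot)

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking as we go."""
        lo, hi = self._piece(low)
        # Stochastic step: also crack on a random value from the touched
        # piece, so a skewed query sequence cannot leave one huge piece.
        if hi - lo > 1:
            self._crack(self.col[random.randrange(lo, hi)])
        self._crack(low)
        self._crack(high)
        a = self.index.get(low, self._piece(low)[0])
        b = self.index.get(high, self._piece(high)[1])
        return [v for v in self.col[a:b] if low <= v < high]


if __name__ == "__main__":
    random.seed(0)
    cracker = StochasticCracker(random.sample(range(1000), 200))
    answer = cracker.range_query(100, 150)
    print(sorted(answer) == sorted(v for v in cracker.col if 100 <= v < 150))
```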

    General Bounds for Incremental Maximization

    We propose a theoretical framework to capture incremental solutions to cardinality constrained maximization problems. The defining characteristic of our framework is that the cardinality/support of the solution is bounded by a value k ∈ ℕ that grows over time, and we allow the solution to be extended one element at a time. We investigate the best-possible competitive ratio of such an incremental solution, i.e., the worst ratio over all k between the incremental solution after k steps and an optimum solution of cardinality k. We define a large class of problems that contains many important cardinality constrained maximization problems like maximum matching, knapsack, and packing/covering problems. We provide a general 2.618-competitive incremental algorithm for this class of problems, and show that no algorithm can have competitive ratio below 2.18 in general. In the second part of the paper, we focus on the inherently incremental greedy algorithm that increases the objective value as much as possible in each step. This algorithm is known to be 1.58-competitive for submodular objective functions, but it has unbounded competitive ratio for the class of incremental problems mentioned above. We define a relaxed submodularity condition for the objective function, capturing problems like maximum (weighted) (b-)matching and a variant of the maximum flow problem. We show that the greedy algorithm has competitive ratio (exactly) 2.313 for the class of problems that satisfy this relaxed submodularity condition. Note that our upper bounds on the competitive ratios translate to approximation ratios for the underlying cardinality constrained problems.
    Comment: fixed typo
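    As a concrete illustration of the inherently incremental greedy rule studied in the second part, the sketch below builds a nested chain of solutions by always adding the element with the largest objective gain. The coverage objective and the data are made up for illustration; the paper's framework covers a much broader class of objectives.

```python
def incremental_greedy(elements, objective):
    """Greedy incremental chain: in each step add the element that
    increases the objective the most. Returns the nested solutions
    S_1, S_2, ... as a list of lists."""
    chain, current, remaining = [], [], list(elements)
    while remaining:
        best = max(remaining, key=lambda e: objective(current + [e]))
        current = current + [best]
        remaining.remove(best)
        chain.append(current)
    return chain


if __name__ == "__main__":
    # Toy (monotone submodular) coverage objective, for illustration only.
    sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
    cover = lambda solution: len(set().union(*(sets[e] for e in solution)))
    for solution in incremental_greedy(sets, cover):
        print(solution, cover(solution))
```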

    Willingness to Pay: Referendum Contingent Valuation and Uncertain Project Benefits

    This study uses contingent valuation (CV) methods to estimate the benefit of an environmental water quality project on the Tietê River and its tributaries, which flow through the São Paulo, Brazil, Metropolitan Area (SPMA). The paper demonstrates the range of alternative central tendency measures for WTP produced under parametric and nonparametric approaches, using data gathered from a recent referendum CV survey conducted in Brazil to analyze a large, multi-phase water quality improvement project. It explains why one of the most commonly used measures, the unrestricted mean of the conditional inverse distribution function of WTP, may be less desirable and more computationally intensive than simpler alternatives like the nonparametric mean of the marginal inverse distribution function.
    Keywords: water management, economics, contingent valuation, econometric models, environmental impact analysis, economic development projects
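    One widely used nonparametric measure of central tendency for referendum CV data is a Turnbull-style lower-bound mean, sketched below. The bid levels and response counts are hypothetical, and this is a generic illustration of the approach, not the specific estimator or data used in the study.

```python
def turnbull_lower_bound_wtp(bids, n_no, n_total):
    """Turnbull-style nonparametric lower bound on mean WTP from
    single-bounded referendum responses (illustrative sketch).

    bids    -- offered bid amounts, sorted ascending
    n_no    -- number of "would not pay" responses at each bid
    n_total -- number of respondents offered each bid
    """
    # Pool adjacent violators so the estimated CDF is non-decreasing.
    pooled = []
    for bid, no, tot in zip(bids, n_no, n_total):
        pooled.append([bid, float(no), float(tot)])
        while (len(pooled) > 1
               and pooled[-1][1] / pooled[-1][2] < pooled[-2][1] / pooled[-2][2]):
            _, no2, tot2 = pooled.pop()
            pooled[-1][1] += no2
            pooled[-1][2] += tot2
    # Lower bound: value the probability mass in each interval at the
    # interval's lower endpoint (pooled cells keep the lower bid, which
    # keeps the estimate conservative).
    mean, prev_bid, prev_cdf = 0.0, 0.0, 0.0
    for bid, no, tot in pooled:
        cdf = no / tot
        mean += prev_bid * (cdf - prev_cdf)
        prev_bid, prev_cdf = bid, cdf
    return mean + prev_bid * (1.0 - prev_cdf)


if __name__ == "__main__":
    # Hypothetical bid design and responses, for illustration only.
    print(turnbull_lower_bound_wtp(
        bids=[10, 30, 60, 120],
        n_no=[18, 31, 42, 55],
        n_total=[60, 60, 60, 60]))   # about 30.7
```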

    Fully Dynamic Single-Source Reachability in Practice: An Experimental Study

    Given a directed graph and a source vertex, the fully dynamic single-source reachability problem is to maintain the set of vertices that are reachable from the given vertex, subject to edge deletions and insertions. It is one of the most fundamental problems on graphs and appears directly or indirectly in many and varied applications. While there has been theoretical work on this problem, showing both linear conditional lower bounds for the fully dynamic problem and insertions-only and deletions-only upper bounds beating these conditional lower bounds, there has been no experimental study comparing the performance of fully dynamic reachability algorithms in practice. Previous experimental studies in this area concentrated only on the more general all-pairs reachability or transitive closure problem and did not use real-world dynamic graphs. In this paper, we bridge this gap by empirically studying an extensive set of algorithms for the single-source reachability problem in the fully dynamic setting. In particular, we design several fully dynamic variants of well-known approaches to obtain and maintain reachability information with respect to a distinguished source. Moreover, we extend the existing insertions-only or deletions-only upper bounds into fully dynamic algorithms. Even though the worst-case time per operation of all the fully dynamic algorithms we evaluate is at least linear in the number of edges in the graph (as is to be expected given the conditional lower bounds), we show in our extensive experimental evaluation that their performance differs greatly, both on generated and on real-world instances.
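    A minimal baseline for the problem, assuming in-memory adjacency sets, is to update reachability incrementally on insertions and recompute it on deletions. The sketch below is only a naive reference point, not one of the tuned variants evaluated in the paper; as the conditional lower bounds suggest, its worst-case cost per operation is linear in the graph size.

```python
from collections import defaultdict, deque


class DynamicSSReach:
    """Baseline fully dynamic single-source reachability (illustrative sketch).

    Insertions are handled incrementally (BFS from the new endpoint if it
    just became reachable); deletions fall back to recomputing the
    reachable set from the source."""

    def __init__(self, source):
        self.source = source
        self.adj = defaultdict(set)
        self.reach = {source}

    def _bfs_from(self, start):
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in self.adj[u]:
                if v not in self.reach:
                    self.reach.add(v)
                    queue.append(v)

    def insert_edge(self, u, v):
        self.adj[u].add(v)
        if u in self.reach and v not in self.reach:
            self.reach.add(v)
            self._bfs_from(v)

    def delete_edge(self, u, v):
        self.adj[u].discard(v)
        if u in self.reach and v in self.reach:
            # Conservative: recompute from scratch after a deletion.
            self.reach = {self.source}
            self._bfs_from(self.source)

    def is_reachable(self, v):
        return v in self.reach


if __name__ == "__main__":
    g = DynamicSSReach(source=0)
    g.insert_edge(0, 1); g.insert_edge(1, 2); g.insert_edge(0, 2)
    g.delete_edge(1, 2)
    print(g.is_reachable(2))   # True, still reachable via 0 -> 2
```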

    The reverse greedy algorithm for the metric k-median problem

    The Reverse Greedy algorithm (RGreedy) for the k-median problem works as follows. It starts by placing facilities on all nodes. At each step, it removes a facility to minimize the resulting total distance from the customers to the remaining facilities. It stops when k facilities remain. We prove that, if the distance function is metric, then the approximation ratio of RGreedy is between Ω(log n / log log n) and O(log n).
    Comment: to appear in IPL; preliminary version in COCOON '0
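    Since the abstract spells out the procedure, a direct sketch is easy to give. The distance matrix below is a made-up toy instance on the line, used only to exercise the code.

```python
def reverse_greedy(dist, k):
    """Reverse Greedy (RGreedy) for k-median (illustrative sketch).

    dist[c][f] -- distance from customer c to node f (metric assumed).
    Facilities start on all nodes and are removed one by one, each time
    dropping the facility whose removal increases total cost the least."""
    facilities = set(range(len(dist[0])))

    def total_cost(open_facilities):
        return sum(min(row[f] for f in open_facilities) for row in dist)

    while len(facilities) > k:
        worst = min(facilities, key=lambda f: total_cost(facilities - {f}))
        facilities.remove(worst)
    return facilities, total_cost(facilities)


if __name__ == "__main__":
    # Toy instance: customers and candidate facilities both sit at
    # positions 0..5 on a line; distances are absolute differences.
    points = [0, 1, 2, 3, 4, 5]
    dist = [[abs(c - f) for f in points] for c in points]
    print(reverse_greedy(dist, k=2))
```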

    The Miracle of Compound Interest: Does our Intuition Fail?

    When it comes to estimating the benefits of long-term savings, many people rely on their intuition. Focusing on the domain of retirement savings, we use a randomized experiment to explore people’s intuition about how money accumulates over time. We ask half of our sample to estimate future consumption given savings (the forward perspective). The other half of the sample is asked to estimate savings given future consumption (the backward perspective). From an economic point of view, both subsamples are asked identical questions. However, we discover a large “direction bias”: the perceived benefits of long-term savings are substantially higher when individuals adopt a backward perspective. Our findings have important implications for economic modeling, in general, and for structuring advice and financial literacy programs, in particular.
    Keywords: behavioral economics; financial intuition; financial literacy; compound interest; retirement saving
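    The forward and backward questions are the same annuity calculation viewed from opposite ends, which a short worked example makes explicit. The interest rate, horizon, and saving amount below are hypothetical and chosen only for illustration.

```python
def future_value(monthly_saving, annual_rate, years):
    """Forward perspective: what do regular savings grow to?"""
    r, n = annual_rate / 12, years * 12
    return monthly_saving * ((1 + r) ** n - 1) / r


def required_saving(target, annual_rate, years):
    """Backward perspective: what must be saved to reach a target?"""
    r, n = annual_rate / 12, years * 12
    return target * r / ((1 + r) ** n - 1)


if __name__ == "__main__":
    # Hypothetical plan: 200 per month for 40 years at 5% nominal interest.
    fv = future_value(200, 0.05, 40)
    print(round(fv))                             # about 305,000
    print(200 * 12 * 40)                         # 96,000 without interest
    print(round(required_saving(fv, 0.05, 40)))  # 200: the two views agree
```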

    Seismic Response of a Tall Building to Recorded and Simulated Ground Motions

    Seismological modeling technologies are advancing to the stage of enabling fundamental simulation of earthquake fault ruptures, which offers new opportunities to simulate extreme ground motions for collapse safety assessment and earthquake scenarios for community resilience studies. With the goal of establishing the reliability of simulated ground motions for performance-based engineering, this paper examines the response of a 20-story concrete moment frame building analyzed by nonlinear dynamic analysis under corresponding sets of recorded and simulated ground motions. The simulated ground motions were obtained through a larger validation study via the Southern California Earthquake Center (SCEC) Broadband Platform (BBP) that simulates magnitude 5.9 to 7.3 earthquakes. Spectral shape and significant duration are considered when selecting ground motions in the development of comparable sets of simulated and recorded ground motions. Structural response is examined at different intensity levels up to collapse, to investigate whether a statistically significant difference exists between the responses to simulated and recorded ground motions. Results indicate that responses to simulated and recorded ground motions are generally similar at intensity levels prior to the observation of collapses. Collapse capacities are also in good agreement for this structure. However, when the structure was made more sensitive to the effects of ground motion duration, the differences between observed collapse responses increased. Research is ongoing to illuminate reasons for the difference and whether there is a systematic bias in the results that can be traced back to the ground motion simulation techniques.
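    "Significant duration" here refers to an intensity-based duration measure; a common definition (D5-95, the time over which the middle 90% of the Arias intensity accumulates) is sketched below on a synthetic record. The record and time step are made up, and this is a generic illustration rather than the study's processing pipeline.

```python
import numpy as np


def significant_duration(accel, dt, lo=0.05, hi=0.95):
    """Significant duration (e.g. D5-95) of an acceleration record: the
    time between reaching the `lo` and `hi` fractions of the total Arias
    intensity (the pi/(2g) constant cancels in the ratio)."""
    cumulative = np.cumsum(accel ** 2) * dt
    fraction = cumulative / cumulative[-1]
    t_lo = np.searchsorted(fraction, lo) * dt
    t_hi = np.searchsorted(fraction, hi) * dt
    return t_hi - t_lo


if __name__ == "__main__":
    # Synthetic record: white noise with an exponentially decaying envelope,
    # made up purely to exercise the function (dt in seconds).
    rng = np.random.default_rng(0)
    dt = 0.01
    t = np.arange(0, 40, dt)
    accel = rng.standard_normal(t.size) * np.exp(-t / 10)
    print(round(significant_duration(accel, dt), 2), "s")
```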

    Estimating Future Consumer Welfare Gains from Innovation: The Case of Digital Data Storage

    We develop a quality-adjusted cost index to estimate expected returns to investments in new technologies. The index addresses the problem of measuring social benefits from innovations in service sector inputs, where real output is not directly observable. We forecast welfare gains from two U.S. Advanced Technology Program innovations equaling 25%-50% of expected price, and aggregate consumer benefits of 11-2 billion, relative to trends in existing technologies. Our model’s probabilistic parameters reflect uncertainty about prospective outcomes and in our hedonic estimates of shadow values for selected product attributes. The index can be readily adopted by research and development (R&D) managers in industry and government.
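    One standard way to obtain quality-adjusted price (or cost) changes of the kind such an index builds on is a hedonic time-dummy regression, in which attribute coefficients play the role of shadow values. The sketch below uses made-up storage-drive observations and plain least squares; it illustrates the mechanics only and is not the paper's model.

```python
import numpy as np

# Hypothetical drive observations: (year index, log capacity, log price).
# The data are made up purely to illustrate the mechanics.
obs = [
    (0, np.log(100),  np.log(200.0)),
    (0, np.log(250),  np.log(320.0)),
    (1, np.log(250),  np.log(260.0)),
    (1, np.log(500),  np.log(380.0)),
    (2, np.log(500),  np.log(300.0)),
    (2, np.log(1000), np.log(430.0)),
]

years = sorted({t for t, _, _ in obs})
# Design matrix: intercept, log capacity (hedonic attribute), year dummies.
X = np.array([[1.0, cap] + [1.0 if t == y else 0.0 for y in years[1:]]
              for t, cap, _ in obs])
y = np.array([price for _, _, price in obs])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# beta[1] is the estimated shadow value of (log) capacity; beta[2:] are the
# quality-adjusted log price changes of each year relative to year 0.
index = np.exp(np.concatenate(([0.0], beta[2:])))
print(index)  # values below 1 indicate falling quality-adjusted cost
```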