
    Scaling and entropy in p-median facility location along a line

    The p-median problem is a common model for optimal facility location. The task is to place p facilities (e.g., warehouses or schools) in a heterogeneously populated space such that the average distance from a person's home to the nearest facility is minimized. Here we study the special case where the population lives along a line (e.g., a road or a river). If facilities are optimally placed, the length of the line segment served by a facility is inversely proportional to the square root of the population density. This scaling law is derived analytically and confirmed for concrete numerical examples of three US Interstate highways and the Mississippi River. If facility locations are permitted to deviate from the optimum, the number of possible solutions increases dramatically. Using Monte Carlo simulations, we compute how scaling is affected by an increase in the average distance to the nearest facility. We find that the scaling exponents change and are most sensitive near the optimum facility distribution. Comment: 7 pages, 6 figures, Physical Review E, in press.
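
    The claimed inverse-square-root scaling can be checked on synthetic data. Below is a minimal sketch (not the paper's code; the sizes, the density profile, and p are illustrative assumptions) that computes an exact 1-D p-median by dynamic programming and prints the resulting segment lengths; denser stretches of the line should receive shorter segments.

        # Minimal sketch, assuming a discretized population on a line: exact 1-D p-median
        # via dynamic programming, used to eyeball the inverse-square-root scaling of
        # segment lengths. All sizes and parameters are illustrative.
        import numpy as np

        def p_median_line(xs, p):
            """Exact p-median for sorted 1-D points xs; returns (total cost, segment bounds)."""
            n = len(xs)
            ps = np.concatenate(([0.0], np.cumsum(xs)))  # prefix sums for O(1) segment costs

            def cost(i, j):  # cost of serving xs[i..j] from its median point
                m = (i + j) // 2
                left = xs[m] * (m - i + 1) - (ps[m + 1] - ps[i])
                right = (ps[j + 1] - ps[m + 1]) - xs[m] * (j - m)
                return left + right

            INF = float("inf")
            dp = [[INF] * (n + 1) for _ in range(p + 1)]
            cut = [[0] * (n + 1) for _ in range(p + 1)]
            dp[0][0] = 0.0
            for m in range(1, p + 1):
                for j in range(1, n + 1):
                    for i in range(m - 1, j):
                        c = dp[m - 1][i] + cost(i, j - 1)
                        if c < dp[m][j]:
                            dp[m][j], cut[m][j] = c, i
            bounds, j = [], n  # recover segment boundaries by backtracking
            for m in range(p, 0, -1):
                i = cut[m][j]
                bounds.append((i, j))
                j = i
            return dp[p][n], bounds[::-1]

        # Toy check: the denser half of [0, 1] should be split into shorter segments.
        rng = np.random.default_rng(0)
        xs = np.sort(np.concatenate([rng.uniform(0.0, 0.5, 300), rng.uniform(0.5, 1.0, 100)]))
        _, bounds = p_median_line(xs, p=8)
        for i, j in bounds:
            print(f"[{xs[i]:.3f}, {xs[j - 1]:.3f}]  length {xs[j - 1] - xs[i]:.3f}  points {j - i}")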

    Improved Algorithms for Time Decay Streams

    In the time-decay model for data streams, elements of an underlying data set arrive sequentially, with recently arrived elements being more important. A common approach for handling large data sets is to maintain a coreset, a succinct summary of the processed data that allows approximate recovery of a predetermined query. We provide a general framework that takes any offline coreset and gives a time-decay coreset for polynomial time-decay functions. We also consider the exponential time-decay model for k-median clustering, where we provide a constant-factor approximation algorithm that utilizes the online facility location algorithm. Our algorithm stores $O(k \log(h \Delta) + h)$ points, where $h$ is the half-life of the decay function and $\Delta$ is the aspect ratio of the dataset. Our techniques extend to k-means clustering and M-estimators as well.
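
    For context, the online facility location subroutine mentioned above can be stated in a few lines. The following is an illustrative Meyerson-style rule with a fixed facility cost, not the paper's algorithm; the test data and the cost value are assumptions.

        # Hedged sketch of randomized online facility location (Meyerson-style):
        # each arriving point opens a new facility with probability proportional to its
        # distance to the nearest open facility, otherwise it is assigned to that facility.
        import math
        import random

        def online_facility_location(stream, facility_cost):
            facilities, assignment_cost = [], 0.0
            for p in stream:
                if not facilities:
                    facilities.append(p)  # the first point always opens a facility
                    continue
                d = min(math.dist(p, q) for q in facilities)  # nearest open facility
                if random.random() < min(d / facility_cost, 1.0):
                    facilities.append(p)  # open a new facility at p
                else:
                    assignment_cost += d  # serve p from its nearest facility
            return facilities, assignment_cost

        # Two well-separated clusters should typically end up with a handful of facilities.
        random.seed(1)
        pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)] + \
              [(random.gauss(10, 1), random.gauss(10, 1)) for _ in range(500)]
        random.shuffle(pts)
        fac, cost = online_facility_location(pts, facility_cost=5.0)
        print(len(fac), round(cost, 1))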

    Facility Location in the Sublinear Geometric Model


    Online Mixed Packing and Covering

    In many problems, the inputs arrive over time and must be dealt with irrevocably when they arrive; such problems are called online problems. A common method of solving online problems is to first solve the corresponding linear program and then round the fractional solution online to obtain an integral solution. We give algorithms for solving linear programs with mixed packing and covering constraints online. We first consider mixed packing and covering linear programs, where packing constraints are given offline and covering constraints are received online. The objective is to minimize the maximum multiplicative factor by which any packing constraint is violated, while satisfying the covering constraints. No sublinear-competitive algorithms were previously known for this problem. We give the first such algorithm: a polylogarithmic-competitive algorithm for solving mixed packing and covering linear programs online. We also show a nearly tight lower bound. Our techniques for the upper bound use an exponential penalty function in conjunction with multiplicative updates. While exponential penalty functions have previously been used to solve linear programs approximately offline, offline algorithms know the constraints beforehand and can optimize greedily. In contrast, when constraints arrive online, the updates need to be more complex. We apply our techniques to solve two online fixed-charge problems with congestion. These problems are motivated by applications in machine scheduling and facility location. The linear program for these problems is more complicated than mixed packing and covering, and presents unique challenges. We show that our techniques, combined with a randomized rounding procedure, give polylogarithmic-competitive integral solutions. These problems generalize online set cover, for which there is a polylogarithmic lower bound. Hence, our results are close to tight.
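
    To give the flavor of combining an exponential penalty function with multiplicative updates, here is a heavily simplified, hedged sketch: covering rows arrive online and the fractional variables are raised multiplicatively, with each variable's growth damped by an exponential penalty on how loaded its packing constraints already are. The step size and the exact update rule are illustrative assumptions, not the paper's algorithm.

        # Hedged toy sketch of online covering under offline packing constraints.
        # Not the paper's algorithm; constants and the update rule are illustrative.
        import numpy as np

        def online_mixed_pc(packing_A, covering_rows, step=0.05, eps=1e-3):
            """packing_A: offline packing matrix (its rows should stay small).
            covering_rows: covering rows a arriving online, each requiring a . x >= 1."""
            x = np.full(packing_A.shape[1], eps)           # start with tiny fractional values
            for a in covering_rows:
                while a @ x < 1.0:                         # raise x until the new row is covered
                    load = packing_A @ x                   # current packing loads
                    penalty = np.exp(packing_A.T @ load)   # exponential penalty per variable
                    x *= 1.0 + step * a / penalty          # multiplicative, penalty-damped increase
            return x, (packing_A @ x).max()                # solution and worst packing violation

        # Toy run: 2 packing rows, 4 variables, 3 covering rows arriving one by one.
        A = np.array([[1.0, 0.0, 1.0, 0.0],
                      [0.0, 1.0, 0.0, 1.0]])
        rows = [np.array([1.0, 1.0, 0.0, 0.0]),
                np.array([0.0, 0.0, 1.0, 1.0]),
                np.array([1.0, 0.0, 0.0, 1.0])]
        x, violation = online_mixed_pc(A, rows)
        print(np.round(x, 3), round(violation, 3))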

    A lower bound for metric 1-median selection

    Consider the problem of finding a point in an $n$-point metric space with the minimum average distance to all points. We show that this problem has no deterministic $o(n^2)$-query $(4-\Omega(1))$-approximation algorithms.
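
    For context, the baseline that this lower bound is contrasted with is the exhaustive algorithm that probes all $\Theta(n^2)$ pairwise distances and returns the exact 1-median. A hedged sketch with a toy metric:

        # Exhaustive 1-median: Theta(n^2) distance queries, exact answer.
        # Illustrative baseline only; the distance function below is a toy example.
        def metric_1_median(points, dist):
            """Return the point minimizing the sum of distances to all points."""
            return min(points, key=lambda p: sum(dist(p, q) for q in points))

        pts = [0, 1, 2, 10]
        print(metric_1_median(pts, dist=lambda a, b: abs(a - b)))  # -> 1 (total distance 11)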

    Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently

    In this paper we define the notion of a probabilistic neighborhood in spatial data: Let a set $P$ of $n$ points in $\mathbb{R}^d$, a query point $q \in \mathbb{R}^d$, a distance metric $\operatorname{dist}$, and a monotonically decreasing function $f : \mathbb{R}^+ \rightarrow [0,1]$ be given. Then a point $p \in P$ belongs to the probabilistic neighborhood $N(q, f)$ of $q$ with respect to $f$ with probability $f(\operatorname{dist}(p,q))$. We envision applications in facility location, sensor networks, and other scenarios where a connection between two entities becomes less likely with increasing distance. A straightforward query algorithm would determine a probabilistic neighborhood in $\Theta(n \cdot d)$ time by probing each point in $P$. To answer the query in sublinear time for the planar case, we augment a quadtree suitably and design a corresponding query algorithm. Our theoretical analysis shows that, for certain distributions of planar $P$, our algorithm answers a query in $O((|N(q,f)| + \sqrt{n})\log n)$ time with high probability (whp). This matches up to a logarithmic factor the cost induced by quadtree-based algorithms for deterministic queries and is asymptotically faster than the straightforward approach whenever $|N(q,f)| \in o(n / \log n)$. As practical proofs of concept we use two applications, one in the Euclidean and one in the hyperbolic plane. In particular, our results yield the first generator for random hyperbolic graphs with arbitrary temperatures in subquadratic time. Moreover, our experimental data show the usefulness of our algorithm even if the point distribution is unknown or not uniform: the running time savings over the pairwise probing approach constitute at least one order of magnitude already for a modest number of points and queries. Comment: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-44543-4_3
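
    The straightforward $\Theta(n \cdot d)$ probing baseline mentioned in the abstract is easy to state; the sketch below is illustrative (the decay function and the point set are assumptions), and it is this linear scan that the quadtree-based method outperforms.

        # Hedged sketch of the naive probabilistic-neighborhood query: probe every point p
        # and keep it with probability f(dist(p, q)). The decay function is an assumption.
        import math
        import random

        def probabilistic_neighborhood(points, q, f, dist=math.dist):
            return [p for p in points if random.random() < f(dist(p, q))]

        # Example: connection probability decays exponentially with Euclidean distance.
        random.seed(0)
        P = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000)]
        nbrs = probabilistic_neighborhood(P, q=(5.0, 5.0), f=lambda d: math.exp(-d))
        print(len(nbrs))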