Scaling and entropy in p-median facility location along a line
The p-median problem is a common model for optimal facility location. The
task is to place p facilities (e.g., warehouses or schools) in a
heterogeneously populated space such that the average distance from a person's
home to the nearest facility is minimized. Here we study the special case where
the population lives along a line (e.g., a road or a river). If facilities are
optimally placed, the length of the line segment served by a facility is
inversely proportional to the square root of the population density. This
scaling law is derived analytically and confirmed for concrete numerical
examples of three US Interstate highways and the Mississippi River. If facility
locations are permitted to deviate from the optimum, the number of possible
solutions increases dramatically. Using Monte Carlo simulations, we compute how
scaling is affected by an increase in the average distance to the nearest
facility. We find that the scaling exponents change and are most sensitive near
the optimum facility distribution.
Comment: 7 pages, 6 figures, Physical Review E, in press
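The inverse-square-root scaling above concerns the optimal placement, and in one dimension that optimum can be computed exactly. Below is a minimal sketch (our illustration, not the paper's code, which uses analytic derivation and Monte Carlo) of the classic O(n^2 p) dynamic program for the p-median of points on a line:

```python
def one_facility_cost(xs, i, j):
    # Cost of serving sorted points xs[i..j] with a single facility
    # placed at their median (optimal for a sum of distances in 1D).
    med = xs[(i + j) // 2]
    return sum(abs(x - med) for x in xs[i:j + 1])

def p_median_line(xs, p):
    """Exact 1D p-median via dynamic programming. O(n^2 * p) with the
    naive cost evaluation used here; fine for small instances."""
    xs = sorted(xs)
    n = len(xs)
    INF = float("inf")
    # dp[k][j]: minimum cost of serving the first j points with k facilities
    dp = [[INF] * (n + 1) for _ in range(p + 1)]
    dp[0][0] = 0.0
    for k in range(1, p + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):  # xs[i..j-1] is the last served segment
                c = dp[k - 1][i] + one_facility_cost(xs, i, j - 1)
                if c < dp[k][j]:
                    dp[k][j] = c
    return dp[p][n]
```

The DP's segment boundaries are exactly the served line segments whose lengths the paper shows scale with population density to the power -1/2; here only the optimal total cost is returned.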
Improved Algorithms for Time Decay Streams
In the time-decay model for data streams, elements of an underlying data set arrive sequentially, with recently arrived elements being more important. A common approach for handling large data sets is to maintain a coreset, a succinct summary of the processed data that allows approximate recovery of a predetermined query. We provide a general framework that takes any offline coreset construction and yields a time-decay coreset for polynomial time-decay functions.
We also consider the exponential time-decay model for k-median clustering, where we provide a constant-factor approximation algorithm that utilizes the online facility location algorithm. Our algorithm stores O(k log(h Delta) + h) points, where h is the half-life of the decay function and Delta is the aspect ratio of the data set. Our techniques extend to k-means clustering and M-estimators as well.
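The online facility location subroutine mentioned above is, in Meyerson's classic randomized formulation, a very simple arrival rule. A hedged sketch (function names are ours, and this omits the exponential time-decay weighting that the paper layers on top):

```python
import random

def online_facility_location(points, facility_cost, dist):
    """Meyerson-style online facility location: each arriving point opens
    a new facility at its own location with probability
    min(1, d / facility_cost), where d is its distance to the nearest
    already-open facility; otherwise it connects to that facility."""
    facilities = []
    total_cost = 0.0
    for p in points:
        # Distance to the nearest open facility (infinite if none yet,
        # which forces the first point to open a facility).
        d = min((dist(p, f) for f in facilities), default=float("inf"))
        if random.random() < min(1.0, d / facility_cost):
            facilities.append(p)          # open here, pay the facility cost
            total_cost += facility_cost
        else:
            total_cost += d               # connect to the nearest facility
    return facilities, total_cost
```

In Meyerson's analysis this rule is constant-competitive in expectation for randomly ordered inputs; the guarantee for adversarial orders is weaker. We do not reproduce that analysis here.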
Online Mixed Packing and Covering
In many problems, the inputs arrive over time, and must be dealt with
irrevocably when they arrive. Such problems are online problems. A common
method of solving online problems is to first solve the corresponding linear
program, and then round the fractional solution online to obtain an integral
solution.
We give algorithms for solving linear programs with mixed packing and
covering constraints online. We first consider mixed packing and covering
linear programs, where packing constraints are given offline and covering
constraints are received online. The objective is to minimize the maximum
multiplicative factor by which any packing constraint is violated, while
satisfying the covering constraints. No prior sublinear-competitive algorithms
are known for this problem. We give the first such algorithm: a
polylogarithmic-competitive algorithm for solving mixed packing and covering
linear programs online. We also show a nearly tight lower bound.
Our techniques for the upper bound use an exponential penalty function in
conjunction with multiplicative updates. While exponential penalty functions
have been used previously to solve linear programs approximately offline,
offline algorithms know the constraints beforehand and can optimize greedily. In
contrast, when constraints arrive online, updates need to be more complex.
We apply our techniques to solve two online fixed-charge problems with
congestion. These problems are motivated by applications in machine scheduling
and facility location. The linear program for these problems is more
complicated than mixed packing and covering, and presents unique challenges. We
show that our techniques combined with a randomized rounding procedure give
polylogarithmic-competitive integral solutions. These problems generalize
online set-cover, for which there is a polylogarithmic lower bound. Hence, our
results are close to tight.
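The multiplicative-update idea behind the upper bound can be illustrated on the simpler task of online fractional covering. The following is a toy sketch in the spirit of primal-dual multiplicative updates (the update form and parameter are simplified assumptions of ours, not the paper's algorithm):

```python
def online_fractional_covering(constraints, n, eps=0.5):
    """Toy online fractional covering via multiplicative updates: each
    arriving constraint sum_i a[i] * x[i] >= 1 is handled by scaling up
    every variable it involves until the constraint is satisfied.
    Variables only grow, so earlier constraints remain satisfied."""
    x = [0.0] * n
    for a in constraints:  # coefficient vectors arrive one at a time
        while sum(ai * xi for ai, xi in zip(a, x)) < 1.0:
            for i in range(n):
                if a[i] > 0:
                    # Multiplicative bump plus a small additive seed so
                    # that zero-valued variables can start growing.
                    x[i] = x[i] * (1.0 + eps) + eps / (a[i] * n)
    return x
```

Because each unsatisfied constraint grows its variables geometrically, the loop terminates, and monotone growth is what keeps previously seen constraints feasible; bounding the resulting objective against the offline optimum is where the exponential-penalty analysis comes in.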
A lower bound for metric 1-median selection
Consider the problem of finding a point in an n-point metric space with the
minimum average distance to all points. We show a lower bound on the number of
distance queries needed by any deterministic approximation algorithm for this
problem.
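For contrast with the lower bound, the trivial exhaustive algorithm solves metric 1-median exactly with a quadratic number of distance queries. A small illustrative sketch (ours):

```python
def metric_1_median(points, dist):
    """Exact metric 1-median by exhaustive search: roughly n^2 distance
    queries, returning the point minimizing the sum of distances to all
    points. Query-efficient algorithms try to beat this quadratic query
    count at the price of an approximation factor."""
    best, best_cost = None, float("inf")
    for p in points:
        cost = sum(dist(p, q) for q in points)
        if cost < best_cost:
            best, best_cost = p, cost
    return best, best_cost
```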
Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently
In this paper we define the notion of a probabilistic neighborhood in spatial
data: let a set P of points, a query point q, a distance metric dist, and a
monotonically decreasing function f be given. Then a point p in P belongs to
the probabilistic neighborhood of q with respect to f with probability
f(dist(p, q)).
applications in facility location, sensor networks, and other scenarios where a
connection between two entities becomes less likely with increasing distance. A
straightforward query algorithm would determine a probabilistic neighborhood
in linear time by probing each point in P.
To answer the query in sublinear time for the planar case, we augment a
quadtree suitably and design a corresponding query algorithm. Our theoretical
analysis shows that, for certain distributions of planar point sets, our
algorithm answers a query in sublinear time with high probability (whp). This
matches, up to a logarithmic factor, the cost induced by quadtree-based
algorithms for deterministic queries, and it is asymptotically faster than the
straightforward approach for sufficiently small neighborhoods.
As practical proofs of concept we use two applications, one in the Euclidean
and one in the hyperbolic plane. In particular, our results yield the first
generator for random hyperbolic graphs with arbitrary temperatures in
subquadratic time. Moreover, our experimental data show the usefulness of our
algorithm even if the point distribution is unknown or not uniform: The running
time savings over the pairwise probing approach constitute at least one order
of magnitude already for a modest number of points and queries.
Comment: The final publication is available at Springer via
http://dx.doi.org/10.1007/978-3-319-44543-4_3
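The linear-time probing baseline described in the abstract amounts to one biased coin flip per point. A minimal sketch (ours; the paper's contribution is the quadtree-based sublinear alternative, which is not shown here):

```python
import math
import random

def probabilistic_neighborhood(points, q, f, dist):
    """Straightforward linear-time query: flip one biased coin per
    point, keeping p with probability f(dist(p, q))."""
    return [p for p in points if random.random() < f(dist(p, q))]

# Example: Euclidean plane with an exponentially decaying f.
euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
hood = probabilistic_neighborhood(
    [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)],
    (0.0, 0.0),
    lambda d: math.exp(-d),  # monotonically decreasing in distance
    euclid,
)
```

With f = exp(-d) the inclusion probability decays with distance, matching the facility-location and sensor-network scenarios the abstract envisions; a constant f of 1 or 0 recovers deterministic range behavior.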