22,900 research outputs found
Incremental Medians via Online Bidding
In the k-median problem we are given sets of facilities and customers, and
distances between them. For a given set F of facilities, the cost of serving a
customer u is the minimum distance between u and a facility in F. The goal is
to find a set F of k facilities that minimizes the sum, over all customers, of
their service costs.
Following Mettu and Plaxton, we study the incremental medians problem, where
k is not known in advance, and the algorithm produces a nested sequence of
facility sets where the kth set has size k. The algorithm is c-cost-competitive
if the cost of each set is at most c times the cost of the optimum set of size
k. We give improved incremental algorithms for the metric version: an
8-cost-competitive deterministic algorithm, a 2e ~ 5.44-cost-competitive
randomized algorithm, a (24+epsilon)-cost-competitive, poly-time deterministic
algorithm, and a (6e+epsilon ~ .31)-cost-competitive, poly-time randomized
algorithm.
The algorithm is s-size-competitive if the cost of the kth set is at most the
minimum cost of any set of size k, and has size at most s k. The optimal
size-competitive ratios for this problem are 4 (deterministic) and e
(randomized). We present the first poly-time O(log m)-size-approximation
algorithm for the offline problem and first poly-time O(log m)-size-competitive
algorithm for the incremental problem.
Our proofs reduce incremental medians to the following online bidding
problem: faced with an unknown threshold T, an algorithm submits "bids" until
it submits a bid that is at least the threshold. It pays the sum of all its
bids. We prove that folklore algorithms for online bidding are optimally
competitive.Comment: conference version appeared in LATIN 2006 as "Oblivious Medians via
Online Bidding
General Bounds for Incremental Maximization
We propose a theoretical framework to capture incremental solutions to
cardinality constrained maximization problems. The defining characteristic of
our framework is that the cardinality/support of the solution is bounded by a
value that grows over time, and we allow the solution to be
extended one element at a time. We investigate the best-possible competitive
ratio of such an incremental solution, i.e., the worst ratio over all
between the incremental solution after steps and an optimum solution of
cardinality . We define a large class of problems that contains many
important cardinality constrained maximization problems like maximum matching,
knapsack, and packing/covering problems. We provide a general
-competitive incremental algorithm for this class of problems, and show
that no algorithm can have competitive ratio below in general.
In the second part of the paper, we focus on the inherently incremental
greedy algorithm that increases the objective value as much as possible in each
step. This algorithm is known to be -competitive for submodular objective
functions, but it has unbounded competitive ratio for the class of incremental
problems mentioned above. We define a relaxed submodularity condition for the
objective function, capturing problems like maximum (weighted) (-)matching
and a variant of the maximum flow problem. We show that the greedy algorithm
has competitive ratio (exactly) for the class of problems that satisfy
this relaxed submodularity condition.
Note that our upper bounds on the competitive ratios translate to
approximation ratios for the underlying cardinality constrained problems.Comment: fixed typo
Location models in the public sector
The past four decades have witnessed an explosive growth in the field of networkbased facility location modeling. This is not at all surprising since location policy is one of the most profitable areas of applied systems analysis in regional science and ample theoretical and applied challenges are offered. Location-allocation models seek the location of facilities and/or services (e.g., schools, hospitals, and warehouses) so as to optimize one or several objectives generally related to the efficiency of the system or to the allocation of resources. This paper concerns the location of facilities or services in discrete space or networks, that are related to the public sector, such as emergency services (ambulances, fire stations, and police units), school systems and postal facilities. The paper is structured as follows: first, we will focus on public facility location models that use some type of coverage criterion, with special emphasis in emergency services. The second section will examine models based on the P-Median problem and some of the issues faced by planners when implementing this formulation in real world locational decisions. Finally, the last section will examine new trends in public sector facility location modeling.Location analysis, public facilities, covering models
Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains to be its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly.Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms
(Springer, 2014). arXiv admin note: substantial text overlap with
arXiv:1304.7465, arXiv:1209.196
Combining Multiple Clusterings via Crowd Agreement Estimation and Multi-Granularity Link Analysis
The clustering ensemble technique aims to combine multiple clusterings into a
probably better and more robust clustering and has been receiving an increasing
attention in recent years. There are mainly two aspects of limitations in the
existing clustering ensemble approaches. Firstly, many approaches lack the
ability to weight the base clusterings without access to the original data and
can be affected significantly by the low-quality, or even ill clusterings.
Secondly, they generally focus on the instance level or cluster level in the
ensemble system and fail to integrate multi-granularity cues into a unified
model. To address these two limitations, this paper proposes to solve the
clustering ensemble problem via crowd agreement estimation and
multi-granularity link analysis. We present the normalized crowd agreement
index (NCAI) to evaluate the quality of base clusterings in an unsupervised
manner and thus weight the base clusterings in accordance with their clustering
validity. To explore the relationship between clusters, the source aware
connected triple (SACT) similarity is introduced with regard to their common
neighbors and the source reliability. Based on NCAI and multi-granularity
information collected among base clusterings, clusters, and data instances, we
further propose two novel consensus functions, termed weighted evidence
accumulation clustering (WEAC) and graph partitioning with multi-granularity
link analysis (GP-MGLA) respectively. The experiments are conducted on eight
real-world datasets. The experimental results demonstrate the effectiveness and
robustness of the proposed methods.Comment: The MATLAB source code of this work is available at:
https://www.researchgate.net/publication/28197031
Incorporating waiting time in competitive location models: Formulations and heuristics
In this paper we propose a metaheuristic to solve a new version of the Maximum Capture Problem. In the original MCP, market capture is obtained by lower traveling distances or lower traveling time, in this new version not only the traveling time but also the waiting time will affect the market share. This problem is hard to solve using standard optimization techniques. Metaheuristics are shown to offer accurate results within acceptable computing times.Market capture, queuing, ant colony optimization
- âŠ