Incremental Medians via Online Bidding
In the k-median problem we are given sets of facilities and customers, and
distances between them. For a given set F of facilities, the cost of serving a
customer u is the minimum distance between u and a facility in F. The goal is
to find a set F of k facilities that minimizes the sum, over all customers, of
their service costs.
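The objective just defined is easy to state in code. A minimal brute-force sketch (the names and the toy instance below are illustrative, not from the paper): enumerate every facility set of size k and return the cheapest total service cost.

```python
from itertools import combinations

def k_median_cost(facilities, customers, dist, k):
    """Brute force: best total service cost over all facility sets of size k."""
    best = float("inf")
    for F in combinations(facilities, k):
        # Each customer is served by its nearest facility in F.
        cost = sum(min(dist[u][f] for f in F) for u in customers)
        best = min(best, cost)
    return best

# Toy instance: points on a line, distance = absolute difference.
facilities = [0, 5, 10]
customers = [1, 4, 6, 9]
dist = {u: {f: abs(u - f) for f in facilities} for u in customers}
print(k_median_cost(facilities, customers, dist, 2))  # → 7 (e.g. F = {0, 5})
```

This is exponential in k, of course; the point of the paper is to approximate this optimum incrementally for all k at once.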
Following Mettu and Plaxton, we study the incremental medians problem, where
k is not known in advance, and the algorithm produces a nested sequence of
facility sets where the kth set has size k. The algorithm is c-cost-competitive
if the cost of each set is at most c times the cost of the optimum set of size
k. We give improved incremental algorithms for the metric version: an
8-cost-competitive deterministic algorithm, a (2e ~ 5.44)-cost-competitive
randomized algorithm, a (24+epsilon)-cost-competitive, poly-time deterministic
algorithm, and a (6e+epsilon ~ 16.31)-cost-competitive, poly-time randomized
algorithm.
The algorithm is s-size-competitive if the cost of the kth set is at most the
minimum cost of any set of size k, and its size is at most sk. The optimal
size-competitive ratios for this problem are 4 (deterministic) and e
(randomized). We present the first poly-time O(log m)-size-approximation
algorithm for the offline problem and the first poly-time O(log m)-size-competitive
algorithm for the incremental problem.
Our proofs reduce incremental medians to the following online bidding
problem: faced with an unknown threshold T, an algorithm submits "bids" until
it submits a bid that is at least the threshold. It pays the sum of all its
bids. We prove that folklore algorithms for online bidding are optimally
competitive.

Comment: conference version appeared in LATIN 2006 as "Oblivious Medians via
Online Bidding".
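The folklore strategy for online bidding referred to above is bid doubling: bid 1, 2, 4, ... until a bid reaches the unknown threshold. A minimal sketch (illustrative code, not from the paper):

```python
def doubling_bids(T):
    """Folklore doubling strategy for online bidding: submit bids 1, 2, 4, ...
    until a bid is at least the unknown threshold T.
    Returns (total amount paid, winning bid)."""
    total, bid = 0, 1
    while True:
        total += bid
        if bid >= T:
            return total, bid
        bid *= 2

print(doubling_bids(100))  # → (255, 128): bids 1+2+...+64 all fail, 128 wins
```

For every threshold T >= 1 the total paid is less than 4T, and thresholds just above a power of two push the ratio arbitrarily close to 4, matching the optimal deterministic competitive ratio of 4 mentioned in the abstract (the randomized analogue, with randomly shifted geometric bids, achieves e).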
Single equation endogenous binary response models
This paper studies single equation models for binary outcomes incorporating instrumental variable restrictions. The models are incomplete in the sense that they place no restriction on the way in which values of endogenous variables are generated. The models are set, not point, identifying. The paper explores the nature of set identification in single equation IV models in which the binary outcome is determined by a threshold crossing condition. There is special attention to models which require the threshold crossing function to be a monotone function of a linear index involving observable endogenous and exogenous explanatory variables. Identified sets can be large unless instrumental variables have substantial predictive power. A generic feature of the identified sets is that they are not connected when instruments are weak. The results suggest that the strong point-identifying power of triangular "control function" models (restricted versions of the IV models considered here) is fragile, with the wide expanses of the IV model's identified set awaiting in the event of failure of the triangular model's restrictions.
Consensus Strategies for Signed Profiles on Graphs
The median problem is a classical problem in Location Theory: one searches for a location that minimizes the average distance to the sites of the clients. This is appropriate for desirable facilities, such as a distribution center for a set of warehouses. More recently, for obnoxious facilities, the antimedian was studied. Here one maximizes the average distance to the clients. In this paper the mixed case is studied. Clients are represented by a profile, which is a sequence of vertices with repetitions allowed. In a signed profile each element is provided with a sign from {+,-}. Thus one can take into account whether the client prefers the facility (with a + sign) or rejects it (with a - sign). The graphs for which all median sets, or all antimedian sets, are connected are characterized. Various consensus strategies for signed profiles are studied, amongst which Majority, Plurality and Scarcity. Hypercubes are the only graphs on which Majority produces the median set for all signed profiles. Finally, the antimedian sets are found by the Scarcity Strategy on, e.g., Hamming graphs, Johnson graphs and halfcubes.

Keywords: median; consensus function; median graph; majority rule; plurality strategy; graph theory; Hamming graph; Johnson graph; halfcube; scarcity strategy; discrete location and assignment; distance in graphs
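The signed-profile objective is concrete enough to sketch directly (the function names below are hypothetical, not from the paper): on an unweighted graph, find the vertices minimizing the sum of signed distances to the profile. With all + signs this is the median set; with all - signs, minimizing the negated sum maximizes total distance, i.e. it yields the antimedian set.

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s in an unweighted graph given as adjacency lists."""
    d = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in d:
                d[w] = d[u] + 1
                q.append(w)
    return d

def signed_median_set(adj, profile):
    """Vertices minimizing the sum of signed distances to a profile of
    (vertex, sign) pairs; sign +1 attracts the facility, -1 repels it."""
    cost = {v: 0 for v in adj}
    for u, sign in profile:
        d = bfs_dist(adj, u)
        for v in adj:
            cost[v] += sign * d[v]
    m = min(cost.values())
    return sorted(v for v in adj if cost[v] == m)

# Path graph 0-1-2-3: two clients at 0 who want the facility, one at 3 who rejects it.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(signed_median_set(adj, [(0, +1), (0, +1), (3, -1)]))  # → [0]
```

On a path, an all-positive profile with one client at each endpoint makes every vertex a median, which the same function also reports correctly.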
Asymptotic equivalence and adaptive estimation for robust nonparametric regression
The asymptotic equivalence theory developed in the literature so far covers only
bounded loss functions. This limits the potential applications of the theory
because many commonly used loss functions in statistical inference are
unbounded. In this paper we develop asymptotic equivalence results for robust
nonparametric regression with unbounded loss functions. The results imply that
all the Gaussian nonparametric regression procedures can be robustified in a
unified way. A key step in our equivalence argument is to bin the data and then
take the median of each bin. The asymptotic equivalence results have
significant practical implications. To illustrate the general principles of the
equivalence argument we consider two important nonparametric inference
problems: robust estimation of the regression function and the estimation of a
quadratic functional. In both cases easily implementable procedures are
constructed and are shown to enjoy simultaneously a high degree of robustness
and adaptivity. Other problems such as construction of confidence sets and
nonparametric hypothesis testing can be handled in a similar fashion.

Comment: Published at http://dx.doi.org/10.1214/08-AOS681 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
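The bin-and-take-the-median step at the heart of the equivalence argument is simple to sketch (an illustrative toy, not the authors' implementation): even under heavy-tailed noise, the vector of bin medians behaves approximately like Gaussian observations of the regression function at the bin centers, so any standard Gaussian nonparametric procedure can be run on it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, bins = 1000, 100

# Regression data with heavy-tailed (Cauchy) noise: unbounded-loss territory.
x = np.sort(rng.uniform(0, 1, n))
f = np.sin(2 * np.pi * x)
y = f + rng.standard_cauchy(n)

# Bin the data (10 consecutive observations per bin) and take each bin's median.
y_med = np.median(y.reshape(bins, -1), axis=1)
x_mid = x.reshape(bins, -1).mean(axis=1)

# (x_mid, y_med) can now be fed to any Gaussian nonparametric regression
# procedure; the medians are approximately normal even though y is Cauchy.
```

Note that the raw responses y have no finite mean under Cauchy noise, while the bin medians remain well behaved; that contrast is exactly why the robustification works.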
Computing medians and means in Hadamard spaces
The geometric median as well as the Frechet mean of points in an Hadamard
space are important in both theory and applications. Surprisingly, no
algorithms for their computation have hitherto been known. To address this issue, we
use a split version of the proximal point algorithm for minimizing a sum of
convex functions and prove that this algorithm produces a sequence converging
to a minimizer of the objective function, which extends a recent result of D.
Bertsekas (2001) into Hadamard spaces. The method is quite robust and not only
does it yield algorithms for the median and the mean, but it also applies to
various other optimization problems. We moreover show that another algorithm
for computing the Frechet mean can be derived from the law of large numbers due
to K.-T. Sturm (2002). In applications, computing medians and means is probably
most needed in tree space, which is an instance of an Hadamard space, invented
by Billera, Holmes, and Vogtmann (2001) as a tool for averaging phylogenetic
trees. It turns out, however, that it can be also used to model numerous other
tree-like structures. Since there now exists a polynomial-time algorithm for
computing geodesics in tree space due to M. Owen and S. Provan (2011), we
obtain efficient algorithms for computing medians and means, which can be
directly used in practice.

Comment: Corrected version. Accepted in SIAM Journal on Optimization.
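In the Euclidean plane, the simplest Hadamard space, the proximal map of a distance function has a closed form, so the split/cyclic proximal point idea can be sketched as follows (illustrative code, not the authors' implementation; in a general Hadamard space the same step is taken along a geodesic). Step sizes lam_k = 1/k satisfy the usual conditions (their sum diverges, the sum of their squares converges).

```python
import numpy as np

def prox_dist(x, a, lam):
    """Proximal map of z -> lam * ||z - a|| at x: move distance lam toward a,
    jumping all the way to a if x is already within lam of it."""
    d = np.linalg.norm(x - a)
    if d <= lam:
        return a.copy()
    return x + (lam / d) * (a - x)

def geometric_median(points, n_passes=2000):
    """Cyclic proximal point algorithm for minimizing the sum of distances
    (the geometric median), with step sizes lam_k = 1/k."""
    x = np.median(points, axis=0)  # cheap, robust starting point
    k = 1
    for _ in range(n_passes):
        for a in points:           # one prox step per summand, cyclically
            x = prox_dist(x, a, 1.0 / k)
            k += 1
    return x

# Three clustered points plus one far outlier: the median ignores the outlier.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [100.0, 100.0]])
m = geometric_median(pts)
```

For this symmetric instance the unit vectors from (0.5, 0.5) to the four points cancel exactly, so (0.5, 0.5) is the true geometric median, and the iterate settles near it; replacing the distance by its square in the prox step would instead yield a sketch of the Frechet mean computation.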