
    The Online Median Problem

    We introduce a natural variant of the (metric uncapacitated) k-median problem that we call the online median problem. Whereas the k-median problem involves optimizing the simultaneous placement of k facilities, the online median problem imposes the following additional constraints: the facilities are placed one at a time, a facility cannot be moved once it is placed, and the total number of facilities to be placed, k, is not known in advance. The objective of an online median algorithm is to minimize the competitive ratio, that is, the worst-case ratio of the cost of an online placement to that of an optimal offline placement. Our main result is a constant-competitive algorithm for the online median problem running in time that is linear in the input size. In addition, we present a related, though substantially simpler, constant-factor approximation algorithm for the (metric uncapacitated) facility location problem that runs in time linear in the input size. The latter algorithm is similar in spirit to the recent primal-dual-based facility location algorithm of Jain and Vazirani, but our approach is more elementary and yields an improved running time. While our primary focus is on problems which ask us to minimize the weighted average service distance to facilities, we also show that our results can be generalized to hold, to within constant factors, for more general objective functions. For example, we show that all of our approximation results hold, to within constant factors, for the k-means objective function.
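    As a concrete illustration of the setting (not of the paper's algorithm), the sketch below computes the weighted service cost of a facility set and grows a nested placement with a naive greedy rule: each new facility is the point that most reduces the current cost, placed irrevocably, with k not known in advance. All names are our own choosing; Mettu and Plaxton's constant-competitive algorithm is more subtle than this.

        def service_cost(points, weights, facilities, dist):
            # Weighted sum, over all points, of the distance to the nearest placed facility.
            return sum(w * min(dist(p, f) for f in facilities)
                       for p, w in zip(points, weights))

        def greedy_online_medians(points, weights, dist, k_max):
            # Facilities are placed one at a time and never moved; the length-k prefix
            # is the placement reported if the algorithm is stopped after k facilities.
            facilities = []
            for _ in range(k_max):
                candidates = [p for p in points if p not in facilities]
                best = min(candidates,
                           key=lambda c: service_cost(points, weights, facilities + [c], dist))
                facilities.append(best)
                yield list(facilities)

        # Toy usage: points in the plane with unit weights.
        pts = [(0, 0), (0, 1), (5, 5), (6, 5)]
        euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
        placements = list(greedy_online_medians(pts, [1, 1, 1, 1], euclid, k_max=2))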

    Incremental Medians via Online Bidding

    In the k-median problem we are given sets of facilities and customers, and distances between them. For a given set F of facilities, the cost of serving a customer u is the minimum distance between u and a facility in F. The goal is to find a set F of k facilities that minimizes the sum, over all customers, of their service costs. Following Mettu and Plaxton, we study the incremental medians problem, where k is not known in advance, and the algorithm produces a nested sequence of facility sets where the kth set has size k. The algorithm is c-cost-competitive if the cost of each set is at most c times the cost of the optimum set of size k. We give improved incremental algorithms for the metric version: an 8-cost-competitive deterministic algorithm, a (2e ~ 5.44)-cost-competitive randomized algorithm, a (24+epsilon)-cost-competitive, poly-time deterministic algorithm, and a (6e+epsilon ~ 16.31)-cost-competitive, poly-time randomized algorithm. The algorithm is s-size-competitive if the cost of the kth set is at most the minimum cost of any set of size k, and has size at most s k. The optimal size-competitive ratios for this problem are 4 (deterministic) and e (randomized). We present the first poly-time O(log m)-size-approximation algorithm for the offline problem and the first poly-time O(log m)-size-competitive algorithm for the incremental problem. Our proofs reduce incremental medians to the following online bidding problem: faced with an unknown threshold T, an algorithm submits "bids" until it submits a bid that is at least the threshold. It pays the sum of all its bids. We prove that folklore algorithms for online bidding are optimally competitive. Comment: conference version appeared in LATIN 2006 as "Oblivious Medians via Online Bidding".
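    A minimal sketch of the folklore doubling strategy for the online bidding problem described in the abstract: bids 1, 2, 4, ... are submitted until one reaches the unknown threshold T, and the algorithm pays the sum of all bids submitted. In the worst case the total approaches 4T, matching the deterministic ratio discussed above. The function name and usage values are illustrative.

        def doubling_bids(threshold):
            # Submit doubling bids against an unknown threshold; return (bids, total paid).
            bids, bid = [], 1
            while True:
                bids.append(bid)
                if bid >= threshold:
                    return bids, sum(bids)
                bid *= 2

        bids, paid = doubling_bids(threshold=37)   # bids 1, 2, ..., 64; total paid 127 < 4 * 37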

    Online k-Median with Consistent Clusters

    We consider the online k-median clustering problem in which n points arrive online and must be irrevocably assigned to a cluster on arrival. As there are lower bound instances that show that an online algorithm cannot achieve a competitive ratio that is a function of n and k, we consider a beyond worst-case analysis model in which the algorithm is provided a priori with a predicted budget B that upper bounds the optimal objective value. We give an algorithm that achieves a competitive ratio that is exponential in the number k of clusters, and show that the competitive ratio of every algorithm must be linear in k. To the best of our knowledge, this is the first investigation in the literature that considers cluster consistency using competitive analysis. Comment: 28 pages, 7 figures
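    To make the model concrete, here is a purely illustrative sketch of the setting, not the paper's algorithm: points arrive one by one, each must be irrevocably assigned on arrival, and the algorithm knows k and a predicted budget B on the optimal objective. The simple rule below opens a new center only when a point is far from every existing center relative to B; the threshold B/k and all names are arbitrary choices of ours.

        def assign_online(stream, k, B, dist):
            # Irrevocably assign each arriving point; return (centers, assignments, total cost).
            centers, assignment, cost = [], [], 0.0
            for p in stream:
                if centers:
                    d, j = min((dist(p, c), j) for j, c in enumerate(centers))
                else:
                    d, j = float("inf"), -1
                if len(centers) < k and d > B / k:   # heuristic opening threshold
                    centers.append(p)
                    j, d = len(centers) - 1, 0.0
                assignment.append(j)                 # the cluster choice is never revised
                cost += d
            return centers, assignment, cost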

    An Efficient Bandit Algorithm for Realtime Multivariate Optimization

    Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions. Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com. Comment: KDD'17 Audience Appreciation Award
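    A minimal sketch in the spirit of the abstract: a layout is one value per component, its estimated reward is a sum of per-component effects plus pairwise interaction terms, and coordinate-wise hill-climbing picks content without enumerating the exponential layout space. The exploration step here is simple epsilon-greedy rather than the paper's bandit machinery, and all names and values are illustrative; in practice the effect estimates would be updated from observed conversions.

        import random
        from collections import defaultdict

        def score(layout, main, pair):
            # Estimated reward: main effects plus pairwise interaction terms.
            s = sum(main[(i, v)] for i, v in enumerate(layout))
            s += sum(pair[((i, layout[i]), (j, layout[j]))]
                     for i in range(len(layout)) for j in range(i + 1, len(layout)))
            return s

        def hill_climb(values_per_component, main, pair, sweeps=3):
            # Greedy coordinate ascent: optimize one component at a time, holding the rest fixed.
            layout = [random.choice(vals) for vals in values_per_component]
            for _ in range(sweeps):
                for i, vals in enumerate(values_per_component):
                    layout[i] = max(vals, key=lambda v: score(layout[:i] + [v] + layout[i + 1:],
                                                              main, pair))
            return layout

        # Toy usage: three components (image, wording, background) with a few options each.
        values = [["img_a", "img_b"], ["short", "long"], ["light", "dark"]]
        main, pair = defaultdict(float), defaultdict(float)   # effect estimates, all zero here
        chosen = hill_climb(values, main, pair)
        if random.random() < 0.1:                             # epsilon-greedy exploration
            chosen = [random.choice(vals) for vals in values]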

    How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?

    In numerous applicative contexts, data are too rich and too complex to be represented by numerical vectors. A general approach to extend machine learning and data mining techniques to such data is to rely on a dissimilarity or on a kernel that measures how different or similar two objects are. This approach has been used to define several variants of the Self Organizing Map (SOM). This paper reviews those variants using a common set of notations in order to outline differences and similarities between them. It discusses the advantages and drawbacks of the variants, as well as the actual relevance of the dissimilarity/kernel SOM for practical applications.
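    For flavor, here is a minimal sketch of one such variant, a batch median SOM, in which prototypes are constrained to be observations and only a pairwise dissimilarity matrix is used. The Gaussian neighborhood, fixed epoch count, and initialization below are simplifications of our own, not a specific variant's exact formulation.

        import math

        def median_som(D, grid, epochs=10, sigma=1.0):
            # D: n x n dissimilarity matrix; grid: list of m 2-D map coordinates (assumes m <= n).
            n, m = len(D), len(grid)
            prototypes = list(range(m))            # each prototype is the index of a data point

            def h(r, s):
                # Gaussian neighborhood kernel on the map grid.
                dx, dy = grid[r][0] - grid[s][0], grid[r][1] - grid[s][1]
                return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))

            for _ in range(epochs):
                # Assignment step: best matching unit of each observation under D.
                bmu = [min(range(m), key=lambda r: D[i][prototypes[r]]) for i in range(n)]
                # Representation step: each prototype becomes the generalized median of the
                # data, weighted by the neighborhood of its map unit.
                for r in range(m):
                    prototypes[r] = min(range(n),
                                        key=lambda c: sum(h(r, bmu[i]) * D[i][c] for i in range(n)))
            return prototypes, bmu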