
    RRR: Rank-Regret Representative

    Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the dataset, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score incurred by limiting consideration to the representative instead of the full dataset, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the dataset. In contrast, users do understand the notion of rank ordering. Therefore, we instead use the position of the items in the ranked list to define the regret, and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-$k$ items of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions, and use notions from combinatorial geometry to develop effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
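
    The rank-based notion of regret described above can be made concrete with a small sketch. It is not the authors' algorithm: it assumes linear ranking functions with non-negative weights and only samples them at random, so `estimated_rank_regret` is a lower bound on the true rank-regret, which quantifies over all ranking functions. The function names and the sampling scheme are illustrative assumptions.

```python
import random

def rank_regret(items, subset_idx, weights):
    """Best rank (1 = top) reached by any subset item under one linear scoring function."""
    scores = [sum(w * x for w, x in zip(weights, item)) for item in items]
    best_in_subset = max(scores[i] for i in subset_idx)
    # Rank = 1 + number of items in the full data that score strictly higher.
    return 1 + sum(1 for s in scores if s > best_in_subset)

def estimated_rank_regret(items, subset_idx, num_functions=1000, seed=0):
    """Worst rank over randomly sampled weight vectors (assumption: sampling, not exact)."""
    rng = random.Random(seed)
    dims = len(items[0])
    worst = 1
    for _ in range(num_functions):
        weights = [rng.random() for _ in range(dims)]
        worst = max(worst, rank_regret(items, subset_idx, weights))
    return worst

if __name__ == "__main__":
    data = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.1, 0.1)]
    print(estimated_rank_regret(data, [0, 1]))
    print(estimated_rank_regret(data, [0, 1, 2]))
```

    A subset whose estimated rank-regret is at most $k$ contains a top-$k$ item for every sampled function; certifying this for all functions requires the geometric arguments mentioned in the abstract.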

    Stackelberg Network Pricing Games

    We study a multi-player one-round game termed Stackelberg Network Pricing Game, in which a leader can set prices for a subset of $m$ priceable edges in a graph. The other edges have a fixed cost. Based on the leader's decision one or more followers optimize a polynomial-time solvable combinatorial minimization problem and choose a minimum cost solution satisfying their requirements based on the fixed costs and the leader's prices. The leader receives as revenue the total amount of prices paid by the followers for priceable edges in their solutions, and the problem is to find revenue maximizing prices. Our model extends several known pricing problems, including single-minded and unit-demand pricing, as well as Stackelberg pricing for certain follower problems like shortest path or minimum spanning tree. Our first main result is a tight analysis of a single-price algorithm for the single follower game, which provides a $(1+\epsilon)\log m$-approximation for any $\epsilon > 0$. This can be extended to provide a $(1+\epsilon)(\log k + \log m)$-approximation for the general problem and $k$ followers. The latter result is essentially best possible, as the problem is shown to be hard to approximate within $\mathcal{O}(\log^\epsilon k + \log^\epsilon m)$. If followers have demands, the single-price algorithm provides a $(1+\epsilon)m^2$-approximation, and the problem is hard to approximate within $\mathcal{O}(m^\epsilon)$ for some $\epsilon > 0$. Our second main result is a polynomial time algorithm for revenue maximization in the special case of Stackelberg bipartite vertex cover, which is based on non-trivial max-flow and LP-duality techniques. Our results can be extended to provide constant-factor approximations for any constant number of followers.
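
    The idea of charging one uniform price on every priceable edge can be illustrated with a toy sketch. It is not the algorithm analysed in the paper. It assumes a single follower whose feasible solutions are listed explicitly as hypothetical `(fixed_cost, num_priceable_edges)` pairs with integer entries, that ties are broken in the leader's favour, and that the only prices worth trying are the breakpoints at which two solutions cost the follower the same.

```python
from fractions import Fraction
from itertools import combinations

def follower_choice(solutions, price):
    """Follower picks a cheapest solution; ties favour the leader (assumption)."""
    return min(solutions, key=lambda s: (s[0] + s[1] * price, -s[1]))

def best_single_price(solutions):
    """Try every breakpoint price and return (price, revenue) with maximum revenue."""
    candidates = set()
    for (f1, n1), (f2, n2) in combinations(solutions, 2):
        if n1 != n2:
            # Price at which both solutions cost the same: f1 + n1*p == f2 + n2*p.
            p = Fraction(f2 - f1, n1 - n2)
            if p > 0:
                candidates.add(p)
    best_price, best_revenue = Fraction(0), Fraction(0)
    for p in sorted(candidates):
        _, priced_edges = follower_choice(solutions, p)
        revenue = p * priced_edges
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price, best_revenue

if __name__ == "__main__":
    # Hypothetical follower options: (fixed cost, number of priceable edges used).
    print(best_single_price([(10, 0), (4, 2), (0, 5)]))
```

    Exact rational arithmetic is used so that the tie-breaking at a breakpoint is not disturbed by floating-point rounding.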

    The Geometry of Scheduling

    We consider the following general scheduling problem: The input consists of n jobs, each with an arbitrary release time, size, and a monotone function specifying the cost incurred when the job is completed at a particular time. The objective is to find a preemptive schedule of minimum aggregate cost. This problem formulation is general enough to include many natural scheduling objectives, such as weighted flow, weighted tardiness, and sum of flow squared. Our main result is a randomized polynomial-time algorithm with an approximation ratio O(log log nP), where P is the maximum job size. We also give an O(1) approximation in the special case when all jobs have identical release times. The main idea is to reduce this scheduling problem to a particular geometric set-cover problem which is then solved using the local ratio technique and Varadarajan's quasi-uniform sampling technique. This general algorithmic approach improves the best known approximation ratios by at least an exponential factor (and much more in some cases) for essentially all of the nontrivial common special cases of this problem. Our geometric interpretation of scheduling may be of independent interest. Comment: Conference version in FOCS 201

    A Constant-Factor Approximation for Multi-Covering with Disks

    We consider variants of the following multi-covering problem with disks. We are given two point sets $Y$ (servers) and $X$ (clients) in the plane, a coverage function $\kappa : X \rightarrow \mathcal{N}$, and a constant $\alpha \geq 1$. Centered at each server is a single disk whose radius we are free to set. The requirement is that each client $x \in X$ be covered by at least $\kappa(x)$ of the server disks. The objective function we wish to minimize is the sum of the $\alpha$-th powers of the disk radii. We present a polynomial time algorithm for this problem achieving an $O(1)$ approximation.
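
    The problem statement above can be checked on small instances with a minimal sketch. It only evaluates a given assignment of radii (feasibility and objective value); it does not implement the approximation algorithm. The function names and the list-based representation of $Y$, $X$ and $\kappa$ are assumptions made for illustration.

```python
import math

def cover_cost(radii, alpha):
    """Objective: sum of the alpha-th powers of the disk radii."""
    return sum(r ** alpha for r in radii)

def is_feasible(servers, radii, clients, kappa):
    """True if every client x lies in at least kappa(x) server disks."""
    for client, demand in zip(clients, kappa):
        covering = sum(
            1 for server, r in zip(servers, radii)
            if math.dist(client, server) <= r
        )
        if covering < demand:
            return False
    return True

if __name__ == "__main__":
    servers = [(0.0, 0.0), (4.0, 0.0)]   # point set Y
    clients = [(1.0, 0.0), (3.0, 0.0)]   # point set X
    kappa = [1, 2]                        # coverage demand per client
    for radii in ([1.0, 3.0], [3.0, 3.0]):
        print(radii, is_feasible(servers, radii, clients, kappa), cover_cost(radii, 2))
```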

    Message and time efficient multi-broadcast schemes

    We consider message and time efficient broadcasting and multi-broadcasting in wireless ad-hoc networks, where a subset of nodes, each with a unique rumor, wish to broadcast their rumors to all destinations while minimizing the total number of transmissions and the total time until all rumors arrive at their destinations. Under centralized settings, we introduce a novel approximation algorithm that provides almost optimal results with respect to the number of transmissions and total time, separately. We then show how to efficiently implement this algorithm under distributed settings, where the nodes have only local information about their surroundings. In addition, we show multiple approximation techniques based on the network collision detection capabilities and explain how to calibrate the algorithms' parameters to produce optimal results for time and messages. Comment: In Proceedings FOMC 2013, arXiv:1310.459