600 research outputs found

    Optimization results for a generalized coupon collector problem

    We study in this paper a generalized coupon collector problem, which consists in analyzing the time needed to collect a given number of distinct coupons that are drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon called the null coupon can be drawn but never belongs to any collection. In this context, we prove that the almost uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that stochastically minimizes the time needed to collect a fixed number of distinct coupons. Moreover, we show that in a given closed subset of probability distributions, the distribution with all its entries but one equal to the smallest possible value is the one that stochastically maximizes the time needed to collect a fixed number of distinct coupons. A computer science application shows the utility of these results. Comment: arXiv admin note: text overlap with arXiv:1402.524

    New results on a generalized coupon collector problem using Markov chains

    We study in this paper a generalized coupon collector problem, which consists in determining the distribution and the moments of the time needed to collect a given number of distinct coupons that are drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon called the null coupon can be drawn but never belongs to any collection. In this context, we obtain expressions for the distribution and the moments of this time. We also prove that the almost-uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that minimizes the expected time to get a fixed subset of distinct coupons. This optimization result is extended to the complementary distribution of that time when the full collection is considered, thereby proving this well-known conjecture. Finally, we propose a new conjecture stating that the almost-uniform distribution should minimize the complementary distribution of the time needed to get any fixed number of distinct coupons. Comment: 14 pages
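
As a rough illustration of the expected-time result above, the sketch below computes the exact expected number of draws needed to collect `target` distinct non-null coupons by solving the absorbing Markov chain over sets of already collected coupons. The function name `expected_collection_time` and the example distributions are assumptions made for illustration and are not taken from the paper. The state space grows exponentially with the number of coupons, so this is only practical for small instances.

```python
from functools import lru_cache


def expected_collection_time(probs, target):
    """Expected draws to collect `target` distinct non-null coupons.

    `probs[i]` is the drawing probability of non-null coupon i; the
    leftover mass 1 - sum(probs) belongs to the null coupon
    (illustrative setup, not the paper's notation).
    """
    n = len(probs)
    if not 0 <= target <= n:
        raise ValueError("target must lie between 0 and len(probs)")
    if any(p <= 0 for p in probs) or sum(probs) > 1 + 1e-12:
        raise ValueError("probs must be positive and sum to at most 1")

    @lru_cache(maxsize=None)
    def remaining(mask):
        # `mask` is a bitmask of the coupons already collected.
        if bin(mask).count("1") >= target:
            return 0.0
        missing = [i for i in range(n) if not (mask >> i) & 1]
        p_new = sum(probs[i] for i in missing)
        # Drawing the null coupon or a coupon already held leaves the
        # state unchanged, so E[S] = (1 + sum_i p_i * E[S + {i}]) / p_new.
        step = sum(probs[i] * remaining(mask | (1 << i)) for i in missing)
        return (1.0 + step) / p_new

    return remaining(0)


# Same null-coupon mass (0.1) in both cases; only the spread differs.
almost_uniform = expected_collection_time([0.3, 0.3, 0.3], target=2)
skewed = expected_collection_time([0.6, 0.2, 0.1], target=2)
```

On this small example the almost-uniform distribution gives the smaller expected time, which is consistent with the minimization result stated in the abstract.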

    Near-Optimal Straggler Mitigation for Distributed Gradient Methods

    Modern learning algorithms use gradient descent updates to train inferential models that best explain data. Scaling these approaches to massive data sizes requires proper distributed gradient descent schemes where distributed worker nodes compute partial gradients based on their partial and local data sets, and send the results to a master node where all the computations are aggregated into a full gradient and the learning model is updated. However, a major performance bottleneck is that some of the worker nodes may run slowly. These nodes, a.k.a. stragglers, can significantly slow down computation, as the slowest node may dictate the overall computational time. We propose a distributed computing scheme, called Batched Coupon's Collector (BCC), to alleviate the effect of stragglers in gradient methods. We prove that our BCC scheme is robust to a near-optimal number of random stragglers. We also empirically demonstrate that our proposed BCC scheme reduces the run-time by up to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation strategies. We also generalize the proposed BCC scheme to minimize the completion time when implementing gradient descent-based algorithms over heterogeneous worker nodes.
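
The abstract does not describe the mechanics of the scheme, so the following is only a sketch under a stated assumption: the training data is split into `num_batches` batches, each worker processes one batch chosen uniformly at random, and the master can form the full gradient once it holds a result for every batch. Under that assumption, the number of worker results the master needs is the classical coupon collector quantity. All function names and parameters here are hypothetical.

```python
import random


def expected_results_needed(num_batches):
    """Classical coupon collector mean: B * (1 + 1/2 + ... + 1/B)."""
    return num_batches * sum(1.0 / k for k in range(1, num_batches + 1))


def simulate_results_needed(num_batches, rng):
    """Count worker results until every batch is covered at least once.

    Assumption: each arriving result covers a batch drawn uniformly
    at random, independently of all other results.
    """
    covered = set()
    results = 0
    while len(covered) < num_batches:
        covered.add(rng.randrange(num_batches))
        results += 1
    return results


rng = random.Random(0)  # fixed seed so the run is reproducible
trials = [simulate_results_needed(10, rng) for _ in range(2000)]
empirical_mean = sum(trials) / len(trials)
```

Comparing `empirical_mean` with `expected_results_needed(10)` gives a quick check that the simulation matches the closed-form mean under the uniform-choice assumption.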

    Packing Returning Secretaries

    We study online secretary problems with returns in combinatorial packing domains with $n$ candidates that arrive sequentially over time in random order. The goal is to accept a feasible packing of candidates of maximum total value. In the first variant, each candidate arrives exactly twice. All $2n$ arrivals occur in random order. We propose a simple 0.5-competitive algorithm that can be combined with arbitrary approximation algorithms for the packing domain, even when the total value of candidates is a subadditive function. For bipartite matching, we obtain an algorithm with competitive ratio at least $0.5721 - o(1)$ for growing $n$, and an algorithm with ratio at least $0.5459$ for all $n \ge 1$. We extend all algorithms and ratios to $k \ge 2$ arrivals per candidate. In the second variant, there is a pool of undecided candidates. In each round, a random candidate from the pool arrives. Upon arrival a candidate can be either decided (accept/reject) or postponed (returned into the pool). We mainly focus on minimizing the expected number of postponements when computing an optimal solution. An expected number of $\Theta(n \log n)$ is always sufficient. For matroids, we show that the expected number can be reduced to $O(r \log (n/r))$, where $r \le n/2$ is the minimum of the ranks of matroid and dual matroid. For bipartite matching, we show a bound of $O(r \log n)$, where $r$ is the size of the optimum matching. For general packing, we show a lower bound of $\Omega(n \log \log n)$, even when the size of the optimum is $r = \Theta(\log n)$. Comment: 23 pages, 5 figures

    Pioneers of Influence Propagation in Social Networks

    With the growing importance of corporate viral marketing campaigns on online social networks, the interest in studies of influence propagation through networks is higher than ever. In a viral marketing campaign, a firm initially targets a small set of pioneers and hopes that they will influence a sizeable fraction of the population by diffusion of influence through the network. In general, any marketing campaign might fail to go viral on the first try. As such, it would be useful to have some guide to evaluate the effectiveness of the campaign and judge whether it is worthy of further resources, and, in case the campaign has potential, how to hit upon a good pioneer who can make the campaign go viral. In this paper, we present a diffusion model developed by enriching the generalized random graph (a.k.a. configuration model) to provide insight into these questions. We offer the intuition behind the results on this model, rigorously proved in Blaszczyszyn & Gaurav (2013), and illustrate them here by taking examples of random networks having prototypical degree distributions: the Poisson degree distribution, which is commonly used as a kind of benchmark, and the power-law degree distribution, which is normally used to approximate real-world networks. On these networks, the members are assumed to have varying attitudes towards propagating the information. We analyze three cases in particular: (1) Bernoulli transmissions, when a member influences each of its friends with probability p; (2) Node percolation, when a member influences all its friends with probability p and none with probability 1-p; (3) Coupon-collector transmissions, when a member randomly selects one of its friends K times with replacement. We assume that the configuration model is the closest approximation of a large online social network when the information available about the network is very limited.
    The key insight offered by this study from a firm's perspective concerns how to evaluate the effectiveness of a marketing campaign and do cost-benefit analysis by collecting relevant statistical data from the pioneers it selects. The campaign evaluation criterion is informed by the observation that if the parameters of the underlying network and the campaign effectiveness are such that the campaign can indeed reach a significant fraction of the population, then the set of good pioneers also forms a significant fraction of the population. Therefore, in such a case, the firm can even adopt the naive strategy of repeatedly picking and targeting some number of pioneers at random from the population. With this strategy, the probability of picking a good pioneer will increase geometrically fast with the number of tries.
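
The three transmission cases listed in the abstract can be sketched as small sampling functions. This is a minimal illustration of the verbal definitions only, not the paper's model code: the function names, the `friends` list format, and the use of a seeded `random.Random` are assumptions. The last function expresses the "geometrically fast" remark under the added assumption that tries are independent and each succeeds with the same probability `q`.

```python
import random


def bernoulli_transmission(friends, p, rng):
    """Case (1): each friend is influenced independently with probability p."""
    return {f for f in friends if rng.random() < p}


def node_percolation(friends, p, rng):
    """Case (2): all friends are influenced with probability p, else none."""
    return set(friends) if rng.random() < p else set()


def coupon_collector_transmission(friends, k, rng):
    """Case (3): pick one friend uniformly at random, k times with replacement."""
    if not friends:
        return set()
    return {rng.choice(friends) for _ in range(k)}


def prob_at_least_one_good(q, tries):
    """Chance that at least one of `tries` independent picks is a good pioneer.

    Assumes each pick is good with probability q, so the failure
    probability (1 - q) ** tries shrinks geometrically in `tries`.
    """
    return 1.0 - (1.0 - q) ** tries


rng = random.Random(0)
friends = list(range(5))  # hypothetical friend ids
influenced = coupon_collector_transmission(friends, 3, rng)
```

In case (3) the number of distinct friends reached can be smaller than K because the same friend may be selected more than once, which is where the coupon collector analogy comes from.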

    ALOHA Random Access that Operates as a Rateless Code

    Various applications of wireless Machine-to-Machine (M2M) communications have rekindled the research interest in random access protocols suitable to support a large number of connected devices. Slotted ALOHA and its derivatives represent a simple solution for distributed random access in wireless networks. Recently, a framed version of slotted ALOHA gained renewed interest due to the incorporation of successive interference cancellation (SIC) in the scheme, which resulted in substantially higher throughputs. Based on similar principles and inspired by the rateless coding paradigm, a frameless approach for distributed random access in the slotted ALOHA framework is described in this paper. The proposed approach shares an operational analogy with rateless coding, expressed both through the user access strategy and the adaptive length of the contention period, with the objective to end the contention when the instantaneous throughput is maximized. The paper presents the related analysis, providing heuristic criteria for terminating the contention period and showing that very high throughputs can be achieved, even for a low number of contending users. The demonstrated results potentially have more direct practical implications compared to the approaches for coded random access that lead to high throughputs only asymptotically. Comment: Revised version submitted to IEEE Transactions on Communications
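
To make the role of successive interference cancellation concrete, here is a minimal sketch of an idealized SIC decoder together with a per-slot throughput measure. It is not the paper's algorithm and does not reproduce its termination heuristics: the input format (one set of user ids per slot), the function names, and the assumption of perfect cancellation with no noise are all illustrative.

```python
def sic_decode(slots):
    """Resolve users from singleton slots and cancel them everywhere else.

    `slots` is a list of sets; each set holds the ids of the users that
    transmitted in that slot (hypothetical input format). Cancellation
    is assumed to be perfect.
    """
    pending = [set(s) for s in slots]  # work on copies
    resolved = set()
    progress = True
    while progress:
        progress = False
        for slot in pending:
            if len(slot) == 1:
                user = next(iter(slot))
                resolved.add(user)
                # Remove the resolved user's signal from every slot.
                for other in pending:
                    other.discard(user)
                progress = True
    return resolved


def throughput_after_each_slot(slots):
    """Resolved users divided by elapsed slots, for each prefix of `slots`."""
    return [len(sic_decode(slots[:m])) / m for m in range(1, len(slots) + 1)]


example_slots = [{0, 1}, {1}, {1, 2}]
throughputs = throughput_after_each_slot(example_slots)
```

A frameless scheme in the spirit of the abstract would monitor a quantity like `throughputs` and stop the contention period near its peak. How the paper estimates that point is not stated in the abstract and is left out here.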