1,201 research outputs found

A Generalized Coupon Collector Problem

Author: A. Kevin Tang
Feller
Weiyu Xu
Publication venue: 'Applied Probability Trust'
Publication date: 18/11/2010

This paper analyzes a generalized version of the coupon collector problem, in which the collector receives $d$ distinct coupons each run and keeps the one she has the fewest of so far. In the asymptotic case, when the number of coupons $n$ goes to infinity, we show that on average $\frac{n\log n}{d} + \frac{n}{d}(m-1)\log\log n + O(mn)$ runs are needed to collect $m$ sets of coupons. An efficient exact algorithm is also developed to compute the average number of runs needed in any finite case. Numerical examples are provided to verify our theoretical predictions. Comment: 20 pages, 6 figures, preprin
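
The process described in this abstract can be illustrated with a small Monte Carlo sketch. It is not the paper's exact algorithm: the function names `runs_to_collect` and `leading_terms` are hypothetical, coupons are assumed to be offered uniformly at random, and ties are broken by whichever least-held coupon appears first in the offer. The comparison value keeps only the two leading terms of the asymptotic formula and ignores the $O(mn)$ term, so close agreement should not be expected for small $n$.

```python
import math
import random


def runs_to_collect(n, d, m, rng):
    """Simulate one collection: each run offers d distinct coupons out of n
    (uniformly at random, an assumption) and the collector keeps the one she
    currently holds the fewest copies of. Returns the number of runs until
    every coupon type is held at least m times."""
    counts = [0] * n
    complete = 0  # number of coupon types held at least m times
    runs = 0
    while complete < n:
        runs += 1
        offered = rng.sample(range(n), d)
        pick = min(offered, key=counts.__getitem__)
        counts[pick] += 1
        if counts[pick] == m:
            complete += 1
    return runs


def leading_terms(n, d, m):
    """Two leading terms of the asymptotic average; the O(mn) term is omitted."""
    return n * math.log(n) / d + (n / d) * (m - 1) * math.log(math.log(n))


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, m, trials = 200, 2, 3, 200
    mean_runs = sum(runs_to_collect(n, d, m, rng) for _ in range(trials)) / trials
    print("simulated mean runs:", mean_runs)
    print("leading terms only :", leading_terms(n, d, m))
```

Since exactly one coupon is kept per run, any run count is at least $nm$; when $d = n$ the collector always sees every coupon, and the simulation finishes in exactly $nm$ runs.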

arXiv.org e-Print Archive

Crossref

Optimization results for a generalized coupon collector problem

Author: Anceaume Emmanuelle
Busnel Yann
Schulte-Geers Ernst
Sericola Bruno
Publication date: 01/01/2015

In this paper we study a generalized coupon collector problem, which consists in analyzing the time needed to collect a given number of distinct coupons that are drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon, called the null coupon, can be drawn but never belongs to any collection. In this context, we prove that the almost uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that stochastically minimizes the time needed to collect a fixed number of distinct coupons. Moreover, we show that in a given closed subset of probability distributions, the distribution with all its entries but one equal to the smallest possible value is the one that stochastically maximizes the time needed to collect a fixed number of distinct coupons. A computer science application shows the utility of these results. Comment: arXiv admin note: text overlap with arXiv:1402.524

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

New results on a generalized coupon collector problem using Markov chains

Author: Anceaume Emmanuelle
Busnel Yann
Sericola Bruno
Publication date: 21/02/2014

In this paper we study a generalized coupon collector problem, which consists in determining the distribution and the moments of the time needed to collect a given number of distinct coupons that are drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon, called the null coupon, can be drawn but never belongs to any collection. In this context, we obtain expressions for the distribution and the moments of this time. We also prove that the almost-uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that minimizes the expected time to get a fixed subset of distinct coupons. This optimization result is extended to the complementary distribution of that time when the full collection is considered, thereby proving this well-known conjecture. Finally, we propose a new conjecture stating that the almost-uniform distribution should minimize the complementary distribution of the time needed to get any fixed number of distinct coupons. Comment: 14 page
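
To make the setting concrete, here is a minimal sketch that computes the expected collection time exactly for a small number of coupons. It is not taken from the paper: it uses a plain first-step recursion over the set of coupons already collected, the function name `expected_time` is hypothetical, and every non-null probability is assumed to be strictly positive. The null coupon is represented implicitly as the leftover probability mass $1 - \sum_j p_j$. The cost grows exponentially in the number of coupons, so this only suits toy examples.

```python
from functools import lru_cache


def expected_time(p, c):
    """Expected number of draws needed to hold c distinct non-null coupons.

    p: drawing probabilities of the non-null coupons (all > 0, sum <= 1);
       the remaining mass 1 - sum(p) belongs to the null coupon.
    c: number of distinct coupons to collect (c <= len(p)).
    """
    n = len(p)

    @lru_cache(maxsize=None)
    def e(mask):
        # mask is a bitmask of the coupons collected so far.
        if bin(mask).count("1") >= c:
            return 0.0
        missing = [j for j in range(n) if not (mask >> j) & 1]
        leave = sum(p[j] for j in missing)  # probability of drawing a new coupon
        # First-step analysis: E[S] = (1 + sum_j p_j E[S + j]) / leave
        return (1.0 + sum(p[j] * e(mask | (1 << j)) for j in missing)) / leave

    return e(0)


if __name__ == "__main__":
    # Null coupon probability 0.5 in both cases; two non-null coupons.
    print(expected_time([0.25, 0.25], 2))  # almost-uniform distribution
    print(expected_time([0.40, 0.10], 2))  # skewed distribution
```

In this toy case the almost-uniform distribution gives the smaller expected time, which is consistent with the minimization result stated in the abstract for the expectation.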

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

Near-Optimal Straggler Mitigation for Distributed Gradient Methods

Author: Avestimehr A. Salman
Kalan Seyed Mohammadreza Mousavi
Li Songze
Soltanolkotabi Mahdi
Publication date: 27/10/2017

Modern learning algorithms use gradient descent updates to train inferential models that best explain data. Scaling these approaches to massive data sizes requires proper distributed gradient descent schemes, in which distributed worker nodes compute partial gradients based on their partial and local data sets and send the results to a master node, where all the computations are aggregated into a full gradient and the learning model is updated. However, a major performance bottleneck arises when some of the worker nodes run slowly. These nodes, a.k.a. stragglers, can significantly slow down computation, as the slowest node may dictate the overall computational time. We propose a distributed computing scheme, called Batched Coupon's Collector (BCC), to alleviate the effect of stragglers in gradient methods. We prove that our BCC scheme is robust to a near-optimal number of random stragglers. We also empirically demonstrate that our proposed BCC scheme reduces the run-time by up to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation strategies. We also generalize the proposed BCC scheme to minimize the completion time when implementing gradient descent-based algorithms over heterogeneous worker nodes.
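
The baseline setup described in this abstract, where workers compute partial gradients and a master sums them, can be sketched as follows. This shows only the plain aggregation step, not the BCC scheme itself, whose details are not given here. The squared-error loss, the round-robin sharding, and the names `partial_gradient`, `aggregate`, and `distributed_gradient` are all assumptions made for illustration.

```python
def partial_gradient(w, shard):
    """Gradient of 0.5 * sum((w . x - y)^2) over one worker's shard.
    The squared-error loss is an assumption chosen for illustration."""
    g = [0.0] * len(w)
    for x, y in shard:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for k, xk in enumerate(x):
            g[k] += residual * xk
    return g


def aggregate(partials):
    """Master step: sum the workers' partial gradients coordinate-wise."""
    return [sum(col) for col in zip(*partials)]


def distributed_gradient(w, data, num_workers):
    """Split the data round-robin across workers, then aggregate."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    return aggregate(partial_gradient(w, s) for s in shards)


if __name__ == "__main__":
    data = [([1.0, 2.0], 3.0), ([0.0, 1.0], 1.0), ([2.0, 0.0], -1.0)]
    w = [0.5, -1.0]
    print(partial_gradient(w, data))         # full gradient on one node
    print(distributed_gradient(w, data, 2))  # same value from two workers
```

Because the loss is a sum over samples, the aggregated result matches the single-node gradient. In this plain version the master needs every shard's result, which is why a single slow worker can delay the whole update.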

arXiv.org e-Print Archive

Crossref