A Generalized Coupon Collector Problem
This paper analyzes a generalized version of the coupon collector
problem, in which the collector gets distinct coupons each run and she
chooses the one that she has the least of so far. In the asymptotic case when the
number of coupons goes to infinity, we show that on average runs are needed to collect sets
of coupons. An efficient exact algorithm is also developed for any finite case
to compute the average needed runs exactly. Numerical examples are provided to
verify our theoretical predictions.
Comment: 20 pages, 6 figures, preprin
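The selection rule described in this abstract can be illustrated with a small Monte Carlo sketch. The extracted text has lost the paper's symbols, so the names `n` (number of coupon types), `d` (distinct coupons offered per run), and `m` (number of complete sets wanted) are assumptions made here for illustration, as is breaking ties in favor of the first offered coupon. This is a simulation only, not the exact algorithm the paper develops.

```python
import random

def runs_to_collect(n, d, m, rng):
    """Simulate one collection and return the number of runs used.

    Assumed model: each run offers d distinct coupon types drawn uniformly
    from n types, and the collector keeps the offered type she currently
    holds the fewest copies of. She stops once every type is held m times.
    """
    counts = [0] * n
    complete = 0  # number of types held at least m times
    runs = 0
    while complete < n:
        offered = rng.sample(range(n), d)
        pick = min(offered, key=lambda c: counts[c])  # least-held offered type
        counts[pick] += 1
        if counts[pick] == m:
            complete += 1
        runs += 1
    return runs

def average_runs(n, d, m, trials=1000, seed=0):
    """Monte Carlo estimate of the mean number of runs."""
    rng = random.Random(seed)
    return sum(runs_to_collect(n, d, m, rng) for _ in range(trials)) / trials
```

For example, `average_runs(50, 1, 1)` estimates the classical single-coupon case, and raising `d` shows how much the freedom to choose shortens the collection.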
Optimization results for a generalized coupon collector problem
We study in this paper a generalized coupon collector problem, which consists
in analyzing the time needed to collect a given number of distinct coupons that
are drawn from a set of coupons with an arbitrary probability distribution. We
suppose that a special coupon called the null coupon can be drawn but never
belongs to any collection. In this context, we prove that the almost uniform
distribution, for which all the non-null coupons have the same drawing
probability, is the distribution which stochastically minimizes the time needed
to collect a fixed number of distinct coupons. Moreover, we show that in a
given closed subset of probability distributions, the distribution with all its
entries, but one, equal to the smallest possible value is the one, which
stochastically maximizes the time needed to collect a fixed number of distinct
coupons. A computer science application shows the utility of these results.
Comment: arXiv admin note: text overlap with arXiv:1402.524
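The setting in this abstract can be explored numerically. The sketch below is an illustration rather than the authors' analysis: it assumes index 0 of `probs` is the null coupon, and the function names are hypothetical. The closed form uses the standard fact that, under the almost uniform distribution, the wait for each new coupon is geometric.

```python
import random

def expected_time_almost_uniform(n, c, p_null):
    """Exact mean number of draws to see c distinct non-null coupons when
    each of the n non-null coupons has probability (1 - p_null) / n.

    With i coupons already held, a new one appears with probability
    (1 - p_null) * (n - i) / n, so the wait is geometric with that rate.
    """
    return sum(n / ((1.0 - p_null) * (n - i)) for i in range(c))

def simulate_time(probs, c, rng):
    """One simulated collection; probs[0] is the null coupon (assumption)."""
    seen = set()
    draws = 0
    labels = range(len(probs))
    while len(seen) < c:
        k = rng.choices(labels, weights=probs)[0]
        draws += 1
        if k != 0:  # the null coupon never joins the collection
            seen.add(k)
    return draws

def mean_time(probs, c, trials=20000, seed=0):
    """Monte Carlo estimate of the mean collection time."""
    rng = random.Random(seed)
    return sum(simulate_time(probs, c, rng) for _ in range(trials)) / trials
```

Comparing `mean_time([0.2, 0.2, 0.2, 0.2, 0.2], 4)` with `mean_time([0.2, 0.5, 0.1, 0.1, 0.1], 4)` gives a quick empirical look at the mean only; the abstract's claim concerns stochastic ordering of the whole distribution, which is stronger.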
New results on a generalized coupon collector problem using Markov chains
We study in this paper a generalized coupon collector problem, which consists
in determining the distribution and the moments of the time needed to collect a
given number of distinct coupons that are drawn from a set of coupons with an
arbitrary probability distribution. We suppose that a special coupon called the
null coupon can be drawn but never belongs to any collection. In this context,
we obtain expressions of the distribution and the moments of this time. We also
prove that the almost-uniform distribution, for which all the non-null coupons
have the same drawing probability, is the distribution which minimizes the
expected time to get a fixed subset of distinct coupons. This optimization
result is extended to the complementary distribution of that time when the full
collection is considered, thereby proving this well-known conjecture.
Finally, we propose a new conjecture which expresses the fact that the
almost-uniform distribution should minimize the complementary distribution of
the time needed to get any fixed number of distinct coupons.
Comment: 14 page
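For small instances, the expected collection time under an arbitrary distribution can be computed exactly by first-step analysis on the Markov chain whose state is the set of coupons held so far. The sketch below is a generic illustration of that idea, not the expressions derived in the paper. It assumes `probs[0]` is the null coupon and that every non-null probability is positive; the function name is hypothetical.

```python
from functools import lru_cache

def expected_collection_time(probs, c):
    """Exact mean number of draws to hold c distinct non-null coupons.

    probs[0] is the null coupon (assumption); probs[1:] are the n
    collectable coupons, each assumed to have positive probability.
    States are bitmasks of held coupons, so the cost grows as 2**n.
    """
    n = len(probs) - 1
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and the number of non-null coupons")

    @lru_cache(maxsize=None)
    def remaining(mask):
        if bin(mask).count("1") >= c:
            return 0.0
        missing = [(j, probs[j + 1]) for j in range(n) if not (mask >> j) & 1]
        p_move = sum(p for _, p in missing)
        # First-step analysis: E[S] = 1 + (1 - p_move) E[S] + sum_j p_j E[S + j],
        # which rearranges to the expression below.
        return (1.0 + sum(p * remaining(mask | (1 << j)) for j, p in missing)) / p_move

    return remaining(0)
```

For instance, `expected_collection_time([0.0, 0.5, 0.5], 2)` evaluates to `3.0`: one draw for the first coupon, then a geometric wait with mean 2 for the second.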
Near-Optimal Straggler Mitigation for Distributed Gradient Methods
Modern learning algorithms use gradient descent updates to train inferential
models that best explain data. Scaling these approaches to massive data sizes
requires proper distributed gradient descent schemes where distributed worker
nodes compute partial gradients based on their partial and local data sets, and
send the results to a master node where all the computations are aggregated
into a full gradient and the learning model is updated. However, a major
performance bottleneck that arises is that some of the worker nodes may run
slow. These nodes, a.k.a. stragglers, can significantly slow down computation as
the slowest node may dictate the overall computational time. We propose a
distributed computing scheme, called Batched Coupon's Collector (BCC), to
alleviate the effect of stragglers in gradient methods. We prove that our BCC
scheme is robust to a near optimal number of random stragglers. We also
empirically demonstrate that our proposed BCC scheme reduces the run-time by up
to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation
strategies. We also generalize the proposed BCC scheme to minimize the
completion time when implementing gradient descent-based algorithms over
heterogeneous worker nodes.