Optimization results for a generalized coupon collector problem
We study in this paper a generalized coupon collector problem, which consists
in analyzing the time needed to collect a given number of distinct coupons that
are drawn from a set of coupons with an arbitrary probability distribution. We
suppose that a special coupon called the null coupon can be drawn but never
belongs to any collection. In this context, we prove that the almost uniform
distribution, for which all the non-null coupons have the same drawing
probability, is the distribution which stochastically minimizes the time needed
to collect a fixed number of distinct coupons. Moreover, we show that in a
given closed subset of probability distributions, the distribution with all its
entries but one equal to the smallest possible value is the one that
stochastically maximizes the time needed to collect a fixed number of distinct
coupons. A computer science application shows the utility of these results. Comment: arXiv admin note: text overlap with arXiv:1402.524
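As a rough illustration of the setting described in this abstract, the sketch below computes the exact expected number of draws needed to hold c distinct non-null coupons, for a small coupon set. It is not the paper's method: the function name `expected_collection_time` and the bitmask recursion are assumptions made for this example, the null coupon is assumed to carry the leftover probability 1 - sum(p), and every non-null probability is assumed strictly positive.

```python
from functools import lru_cache


def expected_collection_time(p, c):
    """Expected number of draws until c distinct non-null coupons are held.

    p[i] is the drawing probability of non-null coupon i (assumed > 0).
    The null coupon implicitly has probability 1 - sum(p) and is never kept.
    """
    n = len(p)
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and len(p)")

    @lru_cache(maxsize=None)
    def remaining(mask):
        # mask has bit i set when coupon i is already in the collection.
        if bin(mask).count("1") >= c:
            return 0.0
        missing = [i for i in range(n) if not (mask >> i) & 1]
        q = sum(p[i] for i in missing)  # probability that a draw adds a coupon
        # One-step analysis: E(S) = 1 + (1 - q) E(S) + sum_i p_i E(S + {i}).
        return (1.0 + sum(p[i] * remaining(mask | (1 << i)) for i in missing)) / q

    return remaining(0)


# Hypothetical example: four non-null coupons, null coupon drawn with probability 0.2.
almost_uniform = [0.2, 0.2, 0.2, 0.2]
skewed = [0.5, 0.1, 0.1, 0.1]
print(expected_collection_time(almost_uniform, 4))
print(expected_collection_time(skewed, 4))
```

For the almost uniform case the result is 125/12 draws, and the skewed distribution gives a larger value, which agrees with the minimization result stated above (stochastic ordering implies ordering of the expectations). The recursion visits every subset, so it is only practical for a handful of coupons.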
New results on a generalized coupon collector problem using Markov chains
We study in this paper a generalized coupon collector problem, which consists
in determining the distribution and the moments of the time needed to collect a
given number of distinct coupons that are drawn from a set of coupons with an
arbitrary probability distribution. We suppose that a special coupon called the
null coupon can be drawn but never belongs to any collection. In this context,
we obtain expressions of the distribution and the moments of this time. We also
prove that the almost-uniform distribution, for which all the non-null coupons
have the same drawing probability, is the distribution which minimizes the
expected time to get a fixed subset of distinct coupons. This optimization
result is extended to the complementary distribution of that time when the full
collection is considered, thereby proving this well-known conjecture.
Finally, we propose a new conjecture which expresses the fact that the
almost-uniform distribution should minimize the complementary distribution of
the time needed to get any fixed number of distinct coupons. Comment: 14 page
Near-Optimal Straggler Mitigation for Distributed Gradient Methods
Modern learning algorithms use gradient descent updates to train inferential
models that best explain data. Scaling these approaches to massive data sizes
requires proper distributed gradient descent schemes where distributed worker
nodes compute partial gradients based on their partial and local data sets, and
send the results to a master node where all the computations are aggregated
into a full gradient and the learning model is updated. However, a major
performance bottleneck that arises is that some of the worker nodes may run
slow. These nodes, a.k.a. stragglers, can significantly slow down computation as
the slowest node may dictate the overall computational time. We propose a
distributed computing scheme, called Batched Coupon's Collector (BCC), to
alleviate the effect of stragglers in gradient methods. We prove that our BCC
scheme is robust to a near optimal number of random stragglers. We also
empirically demonstrate that our proposed BCC scheme reduces the run-time by up
to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation
strategies. We also generalize the proposed BCC scheme to minimize the
completion time when implementing gradient descent-based algorithms over
heterogeneous worker nodes.
Packing Returning Secretaries
We study online secretary problems with returns in combinatorial packing
domains with candidates that arrive sequentially over time in random order.
The goal is to accept a feasible packing of candidates of maximum total value.
In the first variant, each candidate arrives exactly twice. All arrivals
occur in random order. We propose a simple 0.5-competitive algorithm that can
be combined with arbitrary approximation algorithms for the packing domain,
even when the total value of candidates is a subadditive function. For
bipartite matching, we obtain an algorithm with competitive ratio at least
for growing , and an algorithm with ratio at least
for all . We extend all algorithms and ratios to arrivals
per candidate.
In the second variant, there is a pool of undecided candidates. In each
round, a random candidate from the pool arrives. Upon arrival a candidate can
be either decided (accept/reject) or postponed (returned into the pool). We
mainly focus on minimizing the expected number of postponements when computing
an optimal solution. An expected number of is always
sufficient. For matroids, we show that the expected number can be reduced to
, where is the minimum of the ranks of matroid and
dual matroid. For bipartite matching, we show a bound of , where
is the size of the optimum matching. For general packing, we show a lower
bound of , even when the size of the optimum is . Comment: 23 pages, 5 figure
Pioneers of Influence Propagation in Social Networks
With the growing importance of corporate viral marketing campaigns on online
social networks, the interest in studies of influence propagation through
networks is higher than ever. In a viral marketing campaign, a firm initially
targets a small set of pioneers and hopes that they would influence a sizeable
fraction of the population by diffusion of influence through the network. In
general, any marketing campaign might fail to go viral in the first try. As
such, it would be useful to have some guide to evaluate the effectiveness of
the campaign and judge whether it is worthy of further resources, and in case
the campaign has potential, how to hit upon a good pioneer who can make the
campaign go viral. In this paper, we present a diffusion model developed by
enriching the generalized random graph (a.k.a. configuration model) to provide
insight into these questions. We offer the intuition behind the results on this
model, rigorously proved in Blaszczyszyn & Gaurav (2013), and illustrate them
here by taking examples of random networks having prototypical degree
distributions: the Poisson degree distribution, which is commonly used as a kind
of benchmark, and the power-law degree distribution, which is normally used to
approximate real-world networks. On these networks, the members are assumed
to have varying attitudes towards propagating the information. We analyze three
cases in particular: (1) Bernoulli transmissions, when a member influences
each of its friends with probability p; (2) node percolation, when a member
influences all its friends with probability p and none with probability 1-p;
(3) coupon-collector transmissions, when a member randomly selects one of its
friends K times with replacement. We assume that the configuration model is the
closest approximation of a large online social network, when the information
available about the network is very limited. The key insight offered by this
study from a firm's perspective is regarding how to evaluate the effectiveness
of a marketing campaign and do cost-benefit analysis by collecting relevant
statistical data from the pioneers it selects. The campaign evaluation
criterion is informed by the observation that if the parameters of the
underlying network and the campaign effectiveness are such that the campaign
can indeed reach a significant fraction of the population, then the set of good
pioneers also forms a significant fraction of the population. Therefore, in
such a case, the firms can even adopt the naive strategy of repeatedly picking
and targeting some number of pioneers at random from the population. With this
strategy, the probability of them picking a good pioneer will increase
geometrically fast with the number of tries.
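The three transmission cases listed in this abstract can be sketched for a single member as follows. This is a minimal illustration and not the model from the paper: the function names, the `friends` list, and the explicit `rng` argument are hypothetical, and the last helper assumes that each try independently hits a good pioneer with the same probability q.

```python
import random


def bernoulli_transmission(friends, p, rng):
    """Each friend is influenced independently with probability p."""
    return {f for f in friends if rng.random() < p}


def node_percolation(friends, p, rng):
    """All friends are influenced with probability p, none otherwise."""
    return set(friends) if rng.random() < p else set()


def coupon_collector_transmission(friends, k, rng):
    """The member picks a friend k times with replacement; repeats collapse."""
    if not friends:
        return set()
    return {rng.choice(friends) for _ in range(k)}


def prob_at_least_one_good_pioneer(q, tries):
    """Chance that independent random tries hit a good pioneer at least once.

    Assumes each try succeeds with probability q, so the failure probability
    (1 - q) ** tries shrinks geometrically with the number of tries.
    """
    return 1.0 - (1.0 - q) ** tries


rng = random.Random(0)  # fixed seed so the run is repeatable
friends = ["a", "b", "c", "d", "e"]
print(bernoulli_transmission(friends, 0.4, rng))
print(node_percolation(friends, 0.4, rng))
print(coupon_collector_transmission(friends, 3, rng))
print(prob_at_least_one_good_pioneer(0.5, 3))
```

Under coupon-collector transmission the number of influenced friends never exceeds min(k, number of friends), because repeated selections of the same friend count once.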
ALOHA Random Access that Operates as a Rateless Code
Various applications of wireless Machine-to-Machine (M2M) communications have
rekindled the research interest in random access protocols, suitable to support
a large number of connected devices. Slotted ALOHA and its derivatives
represent a simple solution for distributed random access in wireless networks.
Recently, a framed version of slotted ALOHA gained renewed interest due to the
incorporation of successive interference cancellation (SIC) in the scheme,
which resulted in substantially higher throughputs. Based on similar principles
and inspired by the rateless coding paradigm, a frameless approach for
distributed random access in the slotted ALOHA framework is described in this
paper. The proposed approach shares an operational analogy with rateless
coding, expressed both through the user access strategy and the adaptive length
of the contention period, with the objective to end the contention when the
instantaneous throughput is maximized. The paper presents the related analysis,
providing heuristic criteria for terminating the contention period and showing
that very high throughputs can be achieved, even for a low number of
contending users. The demonstrated results potentially have more direct
practical implications compared to the approaches for coded random access that
lead to high throughputs only asymptotically. Comment: Revised version submitted to IEEE Transactions on Communication