Unimodal Thompson Sampling for Graph-Structured Arms
We study, to the best of our knowledge, the first Bayesian algorithm for
unimodal Multi-Armed Bandit (MAB) problems with graph structure. In this
setting, each arm corresponds to a node of a graph and each edge provides a
relationship, unknown to the learner, between two nodes in terms of expected
reward. Furthermore, for any node of the graph there is a path leading to the
unique node providing the maximum expected reward, along which the expected
reward is monotonically increasing. Previous results on this setting describe
the behavior of frequentist MAB algorithms. In our paper, we design a Thompson
Sampling-based algorithm whose asymptotic pseudo-regret matches the lower bound
for the considered setting. We show that, as happens in a wide range of
scenarios, Bayesian MAB algorithms dramatically outperform frequentist ones. In
particular, we provide a thorough experimental evaluation of the performance of
our algorithm and of state-of-the-art algorithms as the properties of the graph vary.
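To make the setting concrete, the following is a minimal sketch of Thompson Sampling restricted to the graph neighborhood of the current empirical leader. It is an illustration under stated assumptions, not necessarily the algorithm of the paper: rewards are assumed Bernoulli with uniform Beta(1, 1) priors, the function name `unimodal_thompson_sampling`, the `neighbors` adjacency format, and the rule for periodically replaying the leader are all choices made here for the example.

```python
import random


def unimodal_thompson_sampling(neighbors, true_means, horizon, seed=0):
    """Neighborhood-restricted Thompson Sampling on Bernoulli arms (sketch).

    neighbors:  dict mapping each arm to a list of adjacent arms
                (hypothetical input format, assumed for this example).
    true_means: dict arm -> success probability, used only to simulate rewards.
    Returns a dict arm -> number of times the arm was pulled.
    """
    rng = random.Random(seed)
    arms = list(neighbors)
    successes = {a: 0 for a in arms}
    failures = {a: 0 for a in arms}
    leader_count = {a: 0 for a in arms}

    def empirical_mean(a):
        n = successes[a] + failures[a]
        return successes[a] / n if n else 0.0

    for _ in range(horizon):
        # The leader is the arm with the highest empirical mean so far.
        leader = max(arms, key=empirical_mean)
        leader_count[leader] += 1
        candidates = [leader] + list(neighbors[leader])

        if leader_count[leader] % len(candidates) == 0:
            # Assumed rule: replay the leader periodically so its estimate
            # keeps improving.
            arm = leader
        else:
            # Draw one posterior sample per candidate and play the largest.
            arm = max(
                candidates,
                key=lambda a: rng.betavariate(successes[a] + 1, failures[a] + 1),
            )

        # Simulate a Bernoulli reward and update the Beta posterior counts.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return {a: successes[a] + failures[a] for a in arms}


if __name__ == "__main__":
    # Line graph 0 - 1 - 2 - 3 with expected reward increasing toward arm 3.
    line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    means = {0: 0.2, 1: 0.4, 2: 0.6, 3: 0.8}
    print(unimodal_thompson_sampling(line_graph, means, horizon=2000))
```

Because sampling is limited to the leader and its neighbors, each round compares only a handful of arms, which is where the unimodal structure is exploited in this sketch.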
Evaluating Overfit and Underfit in Models of Network Community Structure
A common data mining task on networks is community detection, which seeks an
unsupervised decomposition of a network into structural groups based on
statistical regularities in the network's connectivity. Although many methods
exist, the No Free Lunch theorem for community detection implies that each
makes some kind of tradeoff, and no algorithm can be optimal on all inputs.
Thus, different algorithms will overfit or underfit on different inputs, finding
more, fewer, or just different communities than is optimal, and evaluation
methods that use a metadata partition as a ground truth will produce misleading
conclusions about general accuracy. Here, we present a broad evaluation of
overfitting and underfitting in community detection, comparing the behavior of 16
state-of-the-art community detection algorithms on a novel and structurally
diverse corpus of 406 real-world networks. We find that (i) algorithms vary
widely both in the number of communities they find and in their corresponding
composition, given the same input, (ii) algorithms can be clustered into
distinct high-level groups based on similarities of their outputs on real-world
networks, and (iii) these differences induce wide variation in accuracy on link
prediction and link description tasks. We introduce a new diagnostic for
evaluating overfitting and underfitting in practice, and use it to roughly
divide community detection methods into general and specialized learning
algorithms. Across methods and inputs, Bayesian techniques based on the
stochastic block model and a minimum description length approach to
regularization represent the best general learning approach, but can be
outperformed under specific circumstances. These results introduce both a
theoretically principled approach to evaluate overfitting and underfitting in models
of network community structure and a realistic benchmark by which new methods
may be evaluated and compared.
Comment: 22 pages, 13 figures, 3 tables
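As an illustration of how the outputs of two community detection algorithms on the same network might be compared, the sketch below computes the number of communities in each partition and their normalized mutual information (NMI). NMI is one common similarity measure for partitions and is used here as an assumption; it is not necessarily the measure used in the paper. The function names and the label-list input format are hypothetical.

```python
from collections import Counter
from math import log


def num_communities(labels):
    """Number of distinct communities in a partition given as a label list."""
    return len(set(labels))


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two partitions of the same node set (sketch).

    labels_a[i] and labels_b[i] are the community labels that two algorithms
    assign to node i. Returns a value in [0, 1]; 1 means the partitions are
    identical up to relabeling.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("partitions must be non-empty and cover the same nodes")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Entropy of each partition.
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())

    # Mutual information between the two label assignments.
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )

    if h_a == 0 and h_b == 0:
        # Both partitions place every node in a single community.
        return 1.0
    return 2 * mi / (h_a + h_b)


if __name__ == "__main__":
    partition_1 = [0, 0, 0, 1, 1, 1]
    partition_2 = [0, 0, 1, 1, 2, 2]
    print(num_communities(partition_1), num_communities(partition_2))
    print(normalized_mutual_information(partition_1, partition_2))
```

A pairwise matrix of such similarities across algorithms is one possible starting point for grouping algorithms by the similarity of their outputs.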