Distributed Learning from Interactions in Social Networks
We consider a network scenario in which agents can evaluate each other
according to a score graph that models some interactions. The goal is to design
a distributed protocol, run by the agents, that allows them to learn their
unknown state among a finite set of possible values. We propose a Bayesian
framework in which scores and states are associated with probabilistic events
with unknown parameters and hyperparameters, respectively. We show that each
agent can learn its state by means of a local Bayesian classifier and a
(centralized) Maximum-Likelihood (ML) estimator of parameter-hyperparameter
that combines plain ML and Empirical Bayes approaches. By using tools from
graphical models, which allow us to gain insight into conditional dependencies of
scores and states, we provide a relaxed probabilistic model that ultimately
leads to a parameter-hyperparameter estimator amenable to distributed
computation. To highlight the appropriateness of the proposed relaxation, we
demonstrate the distributed estimators on a social interaction set-up for user
profiling.
Comment: This submission is a shorter version (for conference publication) of a
more comprehensive paper, already submitted as arXiv:1706.04081 (under review
for journal publication). In this short submission only one social set-up is
considered and only one of the relaxed estimators is proposed. Moreover, the
exhaustive analysis carried out in the longer manuscript is omitted
from this version.
Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach
Modern neural networks are highly overparameterized, with capacity to
substantially overfit to training data. Nevertheless, these networks often
generalize well in practice. It has also been observed that trained networks
can often be "compressed" to much smaller representations. The purpose of this
paper is to connect these two empirical observations. Our main technical result
is a generalization bound for compressed networks based on the compressed size.
Combined with off-the-shelf compression algorithms, the bound leads to
state-of-the-art generalization guarantees; in particular, we provide the first
non-vacuous generalization guarantees for realistic architectures applied to
the ImageNet classification problem. As additional evidence connecting
compression and generalization, we show that compressibility of models that
tend to overfit is limited: We establish an absolute limit on expected
compressibility as a function of expected generalization error, where the
expectations are over the random choice of training examples. The bounds are
complemented by empirical results that show an increase in overfitting implies
an increase in the number of bits required to describe a trained network.
Comment: 16 pages, 1 figure. Accepted at ICLR 201
A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into what we refer to as sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around mapping, projecting,
and representing features such that a source classifier performs well on the
target domain. Inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.
Comment: 20 pages, 5 figures
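The sample-based idea described in this abstract (weighting source observations by their importance to the target domain) can be illustrated with a minimal sketch. It is not taken from the review itself: it assumes, purely for illustration, that the source and target input densities are known one-dimensional Gaussians, which is rarely true in practice, and all function and variable names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, source, target):
    """Weight each source sample by p_target(x) / p_source(x).

    `source` and `target` are (mean, std) tuples; assuming both densities
    are known is an illustrative simplification.
    """
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_mean(values, weights):
    """Self-normalized weighted average."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


random.seed(0)
source = (0.0, 1.0)  # assumed source input distribution N(0, 1)
target = (1.0, 1.0)  # assumed target input distribution N(1, 1)

# Samples come from the source domain only.
xs = [random.gauss(*source) for _ in range(50000)]

# Stand-in for a per-sample loss; here simply x^2.
losses = [x * x for x in xs]

weights = importance_weights(xs, source, target)
source_risk = sum(losses) / len(losses)
target_risk_estimate = weighted_mean(losses, weights)

print(f"unweighted (source) risk:   {source_risk:.3f}")
print(f"reweighted (target) estimate: {target_risk_estimate:.3f}")
```

With these assumed distributions, the expected loss is 1 under the source and 2 under the target, so the reweighted average should land near 2 even though no target samples were drawn. In realistic settings the density ratio has to be estimated, and the weights can have high variance when the two domains overlap poorly.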
Mining gold from implicit models to improve likelihood-free inference
Simulators often provide the best description of real-world phenomena.
However, they also lead to challenging inverse problems because the density
they implicitly define is often intractable. We present a new suite of
simulation-based inference techniques that go beyond the traditional
Approximate Bayesian Computation approach, which struggles in a
high-dimensional setting, and extend methods that use surrogate models based on
neural networks. We show that additional information, such as the joint
likelihood ratio and the joint score, can often be extracted from simulators
and used to augment the training data for these surrogate models. Finally, we
demonstrate that these new techniques are more sample efficient and provide
higher-fidelity inference than traditional methods.
Comment: Code available at
https://github.com/johannbrehmer/simulator-mining-example . v2: Fixed typos.
v3: Expanded discussion, added Lotka-Volterra example. v4: Improved clarity