Optimal Belief Approximation
In Bayesian statistics probability distributions express beliefs. However,
for many problems the beliefs cannot be computed analytically and
approximations of beliefs are needed. We seek a loss function that quantifies
how "embarrassing" it is to communicate a given approximation. We reproduce and
discuss an old proof showing that there is only one ranking under the
requirements that (1) the best-ranked approximation is the non-approximated
belief and (2) the ranking judges approximations only by their predictions
for actual outcomes. The loss function that is obtained in the derivation is
equal to the Kullback-Leibler divergence when normalized. This loss function is
frequently used in the literature. However, there seems to be confusion about
the correct order in which its functional arguments, the approximated and
non-approximated beliefs, should be used. The correct order ensures that the
recipient of a communication is only deprived of the minimal amount of
information. We hope that the elementary derivation settles the apparent
confusion. For example, when approximating beliefs with Gaussian distributions,
the optimal approximation is given by moment matching. This is in contrast to
many suggested computational schemes.
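The moment-matching claim can be checked numerically. The following is a minimal sketch, not the paper's derivation: it assumes a hypothetical bimodal belief (a two-component Gaussian mixture chosen only for illustration), discretizes it on a grid, and compares the Kullback-Leibler divergence from the belief to several Gaussian approximations. The names `belief`, `kl_to_gaussian`, and the grid bounds are all assumptions introduced here.

```python
import math

def log_gauss(x, mean, std):
    """Log-density of a normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def belief(x):
    """Hypothetical non-Gaussian belief: a mixture of two Gaussians."""
    return (0.7 * math.exp(log_gauss(x, -1.0, 0.5))
            + 0.3 * math.exp(log_gauss(x, 2.0, 1.0)))

# Discretize the belief on a grid and normalize the probability weights.
STEP = 0.01
GRID = [-10.0 + STEP * i for i in range(2001)]
_raw = [belief(x) * STEP for x in GRID]
_norm = sum(_raw)
WEIGHTS = [w / _norm for w in _raw]

def kl_to_gaussian(mean, std):
    """Grid estimate of KL(belief || Gaussian): the belief is the first argument."""
    return sum(w * (math.log(belief(x)) - log_gauss(x, mean, std))
               for x, w in zip(GRID, WEIGHTS))

# Moment matching: copy the mean and variance of the belief.
matched_mean = sum(w * x for x, w in zip(GRID, WEIGHTS))
matched_var = sum(w * (x - matched_mean) ** 2 for x, w in zip(GRID, WEIGHTS))
matched_std = math.sqrt(matched_var)

kl_matched = kl_to_gaussian(matched_mean, matched_std)
kl_shifted = kl_to_gaussian(matched_mean + 0.3, matched_std)
kl_narrow = kl_to_gaussian(matched_mean, 0.7 * matched_std)
kl_one_mode = kl_to_gaussian(-1.0, 0.5)  # Gaussian fitted to the dominant mode only
```

With the belief as the first argument of the divergence, the moment-matched Gaussian scores lower than the shifted, narrowed, and single-mode alternatives in this toy setting. Swapping the argument order defines a different objective, which this sketch does not evaluate.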
Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns
As machine learning is increasingly used to make real-world decisions, recent
research efforts aim to define and ensure fairness in algorithmic decision
making. Existing methods often assume a fixed set of observable features to
define individuals, but lack a discussion of certain features not being
observed at test time. In this paper, we study fairness of naive Bayes
classifiers, which allow partial observations. In particular, we introduce the
notion of a discrimination pattern, which refers to an individual receiving
different classifications depending on whether some sensitive attributes were
observed. Then a model is considered fair if it has no such pattern. We propose
an algorithm to discover and mine for discrimination patterns in a naive Bayes
classifier, and show how to learn maximum likelihood parameters subject to
these fairness constraints. Our approach iteratively discovers and eliminates
discrimination patterns until a fair model is learned. An empirical evaluation
on three real-world datasets demonstrates that we can remove exponentially many
discrimination patterns by only adding a small fraction of them as constraints.
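To make the idea of a discrimination pattern concrete, here is a small sketch under stated assumptions. It uses a binary-class naive Bayes model in which unobserved features are simply left out of the product, and it measures how much the posterior changes when a sensitive attribute is additionally observed. The parameter values, the feature names, and the threshold `delta` are hypothetical; the abstract does not give the paper's exact definition or algorithm.

```python
def posterior(prior, likelihoods, evidence):
    """P(D=1 | evidence) in a naive Bayes model with binary features.

    likelihoods[name] = (P(feature=1 | D=1), P(feature=1 | D=0)).
    Features missing from `evidence` are unobserved and marginalize out,
    which for naive Bayes means they are skipped in the product.
    """
    joint_pos = prior
    joint_neg = 1.0 - prior
    for name, value in evidence.items():
        p_pos, p_neg = likelihoods[name]
        joint_pos *= p_pos if value else 1.0 - p_pos
        joint_neg *= p_neg if value else 1.0 - p_neg
    return joint_pos / (joint_pos + joint_neg)

def discrimination_degree(prior, likelihoods, sensitive, nonsensitive):
    """Change in the posterior caused by also observing the sensitive attributes."""
    with_sensitive = posterior(prior, likelihoods, {**nonsensitive, **sensitive})
    without_sensitive = posterior(prior, likelihoods, nonsensitive)
    return with_sensitive - without_sensitive

# Hypothetical model parameters, for illustration only.
prior = 0.5
likelihoods = {"sensitive": (0.8, 0.3), "income": (0.7, 0.4)}
delta = 0.1  # assumed tolerance on the posterior gap

degree = discrimination_degree(prior, likelihoods, {"sensitive": 1}, {"income": 1})
is_pattern = abs(degree) > delta
```

In this toy model, observing the sensitive attribute moves the posterior from about 0.64 to about 0.82, so the pair of observations would count as a pattern under the assumed threshold.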
An Analysis of the Value of Information when Exploring Stochastic, Discrete Multi-Armed Bandits
In this paper, we propose an information-theoretic exploration strategy for
stochastic, discrete multi-armed bandits that achieves optimal regret. Our
strategy is based on the value of information criterion. This criterion
measures the trade-off between policy information and obtainable rewards. High
amounts of policy information are associated with exploration-dominant searches
of the space and yield high rewards. Low amounts of policy information favor
the exploitation of existing knowledge. Information, in this criterion, is
quantified by a parameter that can be varied during search. We demonstrate that
a simulated-annealing-like update of this parameter, with a sufficiently fast
cooling schedule, leads to an optimal regret that is logarithmic with respect
to the number of episodes.
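The role of an annealed exploration parameter can be illustrated with a generic stand-in. The sketch below is not the value-of-information update from the paper, which the abstract does not specify. It is ordinary Boltzmann (soft-max) exploration on Bernoulli arms, with a temperature that cools as episodes pass. The schedule `1 / t`, the arm means, and all function names are assumptions.

```python
import math
import random

def softmax_policy(estimates, temperature):
    """Arm-selection probabilities; lower temperature favors exploitation."""
    top = max(estimates)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in estimates]
    total = sum(weights)
    return [w / total for w in weights]

def run_bandit(arm_means, episodes, seed=0):
    """Play Bernoulli arms with a cooling soft-max policy."""
    rng = random.Random(seed)
    estimates = [0.0] * len(arm_means)
    pulls = [0] * len(arm_means)
    for t in range(1, episodes + 1):
        temperature = 1.0 / t  # hypothetical cooling schedule
        probs = softmax_policy(estimates, temperature)
        arm = rng.choices(range(len(arm_means)), weights=probs)[0]
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        # Incremental sample-mean update of the pulled arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls

estimates, pulls = run_bandit([0.2, 0.5, 0.8], episodes=500)
```

A high temperature spreads probability across arms (exploration), while a low one concentrates it on the arm with the best current estimate (exploitation). No regret guarantee is claimed for this particular schedule.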