Qualitative Robustness in Bayesian Inference
The practical implementation of Bayesian inference requires numerical
approximation when closed-form expressions are not available. What types of
accuracy (convergence) of the numerical approximations guarantee robustness and
what types do not? In particular, is the recursive application of Bayes' rule
robust when subsequent data or posteriors are approximated? When the prior is
the push forward of a distribution by the map induced by the solution of a PDE,
in which norm should that solution be approximated? Motivated by such
questions, we investigate the sensitivity of the distribution of posterior
distributions (i.e. posterior distribution-valued random variables, randomized
through the data) with respect to perturbations of the prior and data
generating distributions in the limit when the number of data points grows
towards infinity.
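The recursive application of Bayes' rule that the abstract asks about can be illustrated with a toy finite-dimensional sketch (this example is not from the paper; the grid discretization and Bernoulli model are illustrative assumptions). It conditions on data one point at a time, renormalizing after each step as a numerical approximation would, and checks the consistency with a single batch update that robustness questions about approximate recursive conditioning start from.

```python
import numpy as np

# Toy illustration (not from the paper): recursive Bayes' rule on a
# discretized parameter space. We estimate a Bernoulli success
# probability theta on a grid, updating the approximate posterior one
# observation at a time.

grid = np.linspace(0.001, 0.999, 1000)   # discretization of theta in (0, 1)
prior = np.ones_like(grid) / grid.size   # uniform prior on the grid

data = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # observed Bernoulli samples

# Recursive updates: condition on one data point at a time, renormalizing
# (a numerical approximation step) after each update.
post_recursive = prior.copy()
for x in data:
    likelihood = grid if x == 1 else 1.0 - grid
    post_recursive = post_recursive * likelihood
    post_recursive /= post_recursive.sum()

# Batch update: condition on all the data at once.
k, n = data.sum(), data.size
post_batch = prior * grid**k * (1.0 - grid)**(n - k)
post_batch /= post_batch.sum()

print(np.allclose(post_recursive, post_batch))  # exact arithmetic would give equality
```

In this well-specified toy setting the recursive and batch posteriors agree up to rounding; the paper's question is precisely what happens to such agreement when each step is only approximated and the number of data points grows.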
Brittleness of Bayesian inference and new Selberg formulas
The incorporation of priors in the Optimal Uncertainty Quantification (OUQ)
framework \cite{OSSMO:2011} reveals brittleness in Bayesian inference; a model
may share an arbitrarily large number of finite-dimensional marginals with, or
be arbitrarily close (in Prokhorov or total variation metrics) to, the
data-generating distribution and still make the largest possible prediction
error after conditioning on an arbitrarily large number of samples. The initial
purpose of this paper is to unwrap this brittleness mechanism by providing (i)
a quantitative version of the Brittleness Theorem of \cite{BayesOUQ} and (ii) a
detailed and comprehensive analysis of its application to the revealing example
of estimating the mean of a random variable on the unit interval using
priors that exactly capture the distribution of an arbitrarily large number of
Hausdorff moments.
However, in doing so, we discovered that the free parameter associated with
Markov and Kre\u{\i}n's canonical representations of truncated Hausdorff
moments generates reproducing kernel identities corresponding to reproducing
kernel Hilbert spaces of polynomials.
Furthermore, these reproducing identities lead to biorthogonal systems of
Selberg integral formulas.
This process of discovery appears to be generic: whereas Karlin and Shapley
used Selberg's integral formula to first compute the volume of the Hausdorff
moment space (the polytope defined by the first $n$ moments of a probability
measure on the interval $[0,1]$), we observe that the computation of that
volume, along with higher-order moments of the uniform measure on the moment
space, using different finite-dimensional representations of subsets of the
infinite-dimensional set of probability measures on $[0,1]$ representing the
first $n$ moments, leads to families of equalities corresponding to classical
and new Selberg identities.
Comment: 73 pages. Keywords: Bayesian inference, misspecification, robustness,
uncertainty quantification, optimal uncertainty quantification, reproducing
kernel Hilbert spaces (RKHS), Selberg integral formula.
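The Hausdorff moment space mentioned above can be probed numerically in low dimension. The sketch below (not code from the paper; the Monte Carlo setup is an illustrative assumption) uses the $n=2$ case, where membership reduces to $m_1^2 \le m_2 \le m_1$, and compares against the classical Karlin--Shapley value $\prod_{i\le n} B(i,i)$, which for $n=2$ gives $B(1,1)\,B(2,2) = 1/6$.

```python
import numpy as np

# Sketch (not from the paper): the truncated Hausdorff moment space
# M_n = {(m_1, ..., m_n) : m_k = int x^k dmu(x), mu a probability
# measure on [0, 1]}. For n = 2 the constraint is m_1^2 <= m_2 <= m_1,
# so the volume is int_0^1 (m_1 - m_1^2) dm_1 = 1/6, matching the
# Karlin--Shapley product B(1,1) * B(2,2).

rng = np.random.default_rng(0)
samples = rng.random((200_000, 2))       # uniform points (m1, m2) in [0,1]^2
m1, m2 = samples[:, 0], samples[:, 1]
inside = (m1**2 <= m2) & (m2 <= m1)      # membership in the n = 2 moment space
volume_mc = inside.mean()                # Monte Carlo volume estimate

exact = 1.0 / 6.0
print(volume_mc, exact)
```

For larger $n$ the membership test would be phrased via positive semidefiniteness of Hankel matrices rather than explicit inequalities; the two-moment case is kept here because it is checkable by hand.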
Fast rates for support vector machines using Gaussian kernels
For binary classification we establish learning rates up to the order of
$n^{-1}$ for support vector machines (SVMs) with hinge loss and Gaussian RBF
kernels. These rates are in terms of two assumptions on the considered
distributions: Tsybakov's noise assumption to establish a small estimation
error, and a new geometric noise condition which is used to bound the
approximation error. Unlike previously proposed concepts for bounding the
approximation error, the geometric noise assumption does not employ any
smoothness assumption.
Comment: Published at http://dx.doi.org/10.1214/009053606000001226 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
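The object the abstract analyzes — an SVM with hinge loss and a Gaussian RBF kernel — can be sketched with a minimal trainer. This is an illustrative kernelized Pegasos-style subgradient method on synthetic data (the data, the hyperparameters, and the training scheme are all assumptions of this sketch, not the paper's analysis).

```python
import numpy as np

# Minimal sketch: hinge-loss SVM with a Gaussian RBF kernel, trained by
# a kernelized Pegasos-style stochastic subgradient method on two
# well-separated synthetic blobs.

rng = np.random.default_rng(1)
n = 40
X = np.vstack([rng.normal(-2, 0.5, (n // 2, 2)),
               rng.normal(+2, 0.5, (n // 2, 2))])
y = np.hstack([-np.ones(n // 2), np.ones(n // 2)])

gamma, lam, T = 0.5, 0.01, 2000

# Gaussian kernel matrix K[i, j] = exp(-gamma * |x_i - x_j|^2).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)

# alpha[i] counts the hinge-loss margin violations of point i; the
# decision function is f(x) = (1 / (lam * t)) * sum_j alpha_j y_j K(x_j, x).
alpha = np.zeros(n)
for t in range(1, T + 1):
    i = rng.integers(n)
    margin = y[i] * (K[i] @ (alpha * y)) / (lam * t)
    if margin < 1:            # hinge loss is active: take a subgradient step
        alpha[i] += 1

decision = K @ (alpha * y)    # decision values on the training set
accuracy = np.mean(np.sign(decision) == y)
print(accuracy)
```

The paper's contribution is not the training algorithm but the learning rates for this estimator class, phrased through Tsybakov's noise assumption and the geometric noise condition.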
Extreme points of a ball about a measure with finite support
We show that, for the space of Borel probability measures on a Borel subset
of a Polish metric space, the extreme points of the Prokhorov,
Monge-Wasserstein and Kantorovich metric balls about a measure whose support
has at most n points, consist of measures whose supports have at most n+2
points. Moreover, we use the Strassen and Kantorovich-Rubinstein duality
theorems to develop representations of supersets of the extreme points based on
linear programming, and then develop these representations towards the goal of
their efficient computation.
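The linear-programming representations alluded to above rest on a structural fact that a toy example makes visible: for the Kantorovich transport LP between two finitely supported measures, the optimum is attained at an extreme point of the transportation polytope. The sketch below (not the paper's algorithm; the two-point measures are an illustrative assumption) uses measures with two support points each, where the polytope is a line segment and both vertices can be evaluated directly.

```python
import numpy as np

# Toy sketch: Kantorovich LP between two measures with two-point support.
# Feasible couplings with marginals (a, b) are parametrized by a single
# scalar t = pi[0, 0], so the transportation polytope is a segment and
# the linear cost is minimized at one of its two endpoints (vertices).

x = np.array([0.0, 1.0]); a = np.array([0.5, 0.5])   # measure mu
yv = np.array([0.0, 1.0]); b = np.array([0.3, 0.7])  # measure nu
cost = np.abs(x[:, None] - yv[None, :])              # ground cost |x_i - y_j|

# pi = [[t, a0 - t], [b0 - t, b1 - (a0 - t)]] with t in [t_lo, t_hi].
t_lo = max(0.0, a[0] - b[1])
t_hi = min(a[0], b[0])

def transport_cost(t):
    pi = np.array([[t, a[0] - t], [b[0] - t, b[1] - (a[0] - t)]])
    return float((pi * cost).sum())

w1 = min(transport_cost(t_lo), transport_cost(t_hi))  # optimum at a vertex
print(w1)
```

Here the optimal cost is 0.2, matching the hand computation $\int_0^1 |F_\mu - F_\nu| = |0.5 - 0.3|$ for these one-dimensional measures; for larger supports one hands the same LP to a general solver, which is the computational direction the abstract describes.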
Conditioning Gaussian measure on Hilbert space
For a Gaussian measure on a separable Hilbert space with covariance operator
$C$, we show that the family of conditional measures associated with
conditioning on a closed subspace $S^{\perp}$ are Gaussian with covariance
operator the short $S(C)$ of the operator $C$ to $S$. We provide two
proofs. The first uses the theory of Gaussian Hilbert spaces and a
characterization of the shorted operator by Anderson and Trapp. The second uses
recent developments by Corach, Maestripieri and Stojanoff on the relationship
between the shorted operator and $C$-symmetric oblique projections onto
$S^{\perp}$. To obtain the assertion when such projections do not exist, we
develop an approximation result for the shorted operator by showing, for any
positive operator $A$, how to construct a sequence of approximating operators
$A_{n}$ which possess $A_{n}$-symmetric oblique projections onto $S^{\perp}$
such that the sequence of shorted operators $S(A_{n})$ converges to $S(A)$
in the weak operator topology. This result, combined with the
martingale convergence of random variables associated with the corresponding
approximations $A_{n}$, establishes the main assertion in general. Moreover, it
in turn strengthens the approximation theorem for the shorted operator when the
operator $A$ is trace class; then the sequence of shorted operators $S(A_{n})$
converges to $S(A)$ in trace norm.
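In finite dimensions the shorted operator and the conditional covariance can be written down explicitly, which gives a concrete handle on the objects above. The sketch below (an illustrative matrix example, not the paper's infinite-dimensional argument) takes a positive definite block matrix, forms the short to the first block via the Schur complement, and checks numerically the Anderson--Trapp-style ordering $0 \le S(C) \le C$.

```python
import numpy as np

# Finite-dimensional sketch: for a positive definite block matrix
#   C = [[C11, C12], [C21, C22]]
# partitioned along S (first block) and S^perp (second block), the short
# of C to S places the Schur complement C11 - C12 C22^{-1} C21 in the
# S-block and zero elsewhere. It is also the conditional covariance of a
# Gaussian with covariance C after conditioning on the S^perp coordinates.

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
C = M @ M.T                      # a positive definite covariance matrix
C11, C12 = C[:2, :2], C[:2, 2:]
C21, C22 = C[2:, :2], C[2:, 2:]

schur = C11 - C12 @ np.linalg.solve(C22, C21)
short = np.zeros_like(C)
short[:2, :2] = schur            # the shorted operator S(C)

# Anderson--Trapp characterize S(C) as the largest positive operator
# below C with range inside S; here we verify 0 <= S(C) <= C by checking
# that both S(C) and C - S(C) have nonnegative spectra.
eig_gap = np.linalg.eigvalsh(C - short).min()
eig_pos = np.linalg.eigvalsh(schur).min()
print(eig_gap >= -1e-10, eig_pos >= -1e-10)
```

The paper's work is precisely in extending this picture to Hilbert space, where the $C$-symmetric oblique projection behind the Schur-complement formula may fail to exist and an approximation argument is needed.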
Universal Scalable Robust Solvers from Computational Information Games and fast eigenspace adapted Multiresolution Analysis
We show how the discovery of robust scalable numerical solvers for arbitrary
bounded linear operators can be automated as a Game Theory problem by
reformulating the process of computing with partial information and limited
resources as that of playing underlying hierarchies of adversarial information
games. When the solution space is a Banach space $B$ endowed with a quadratic
norm $\|\cdot\|$, the optimal measure (mixed strategy) for such games (e.g. the
adversarial recovery of $u \in B$, given partial measurements $[\phi_i, u]$ with
$\phi_i \in B^{*}$, using relative error in the $\|\cdot\|$-norm as a loss) is a
centered Gaussian field $\xi$ solely determined by the norm $\|\cdot\|$, whose
conditioning (on measurements) produces optimal bets. When measurements are
hierarchical, the process of conditioning this Gaussian field produces a
hierarchy of elementary bets (gamblets). These gamblets generalize the notion
of Wavelets and Wannier functions in the sense that they are adapted to the
norm $\|\cdot\|$ and induce a multi-resolution decomposition of $B$ that is
adapted to the eigensubspaces of the operator defining the norm $\|\cdot\|$.
When the operator is localized, we show that the resulting gamblets are
localized both in space and frequency and introduce the Fast Gamblet Transform
(FGT) with rigorous accuracy and (near-linear) complexity estimates. As the FFT
can be used to solve and diagonalize arbitrary PDEs with constant coefficients,
the FGT can be used to decompose a wide range of continuous linear operators
(including arbitrary continuous linear bijections from $H^{s}_{0}$ to $H^{-s}$
or to $L^{2}$) into a sequence of independent linear systems with uniformly
bounded condition numbers, and leads to $\mathcal{O}(N\,\mathrm{polylog}\,N)$
solvers and an eigenspace-adapted Multiresolution Analysis (resulting in
near-linear complexity approximation of all eigensubspaces).
Comment: 142 pages. 14 Figures. Presented at AFOSR (Aug 2016), DARPA (Sep
2016), IPAM (Apr 3, 2017), Hausdorff (April 13, 2017) and ICERM (June 5,
2017).
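The conditioning step that produces gamblets has a transparent finite-dimensional form. In the sketch below (a discretized illustration, not the paper's Fast Gamblet Transform; the Laplacian $A$, the averaging measurements $\Phi$, and the names `psi`, `Phi` are assumptions of this example), the elementary bets are conditional expectations of the Gaussian field $\xi \sim \mathcal{N}(0, A^{-1})$ given unit measurements.

```python
import numpy as np

# Finite-dimensional sketch: for an SPD matrix A (here a 1D discrete
# Laplacian) and coarse measurement functionals Phi (local averages),
# the gamblets are the conditional expectations
#   psi = A^{-1} Phi^T (Phi A^{-1} Phi^T)^{-1},
# i.e. psi[:, i] = E[xi | Phi xi = e_i] for xi ~ N(0, A^{-1}).

n, m = 16, 4
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD discrete Laplacian

Phi = np.zeros((m, n))                                  # coarse local averages
for i in range(m):
    Phi[i, i * (n // m):(i + 1) * (n // m)] = 1.0 / (n // m)

Ainv_PhiT = np.linalg.solve(A, Phi.T)
psi = Ainv_PhiT @ np.linalg.inv(Phi @ Ainv_PhiT)        # one gamblet per column

# Biorthogonality: measuring the j-th gamblet with phi_i gives delta_ij,
# so conditioning on the measurements reproduces them exactly.
print(np.allclose(Phi @ psi, np.eye(m)))
```

Nesting such measurement levels and conditioning level by level is what produces the hierarchy of gamblets; the localization and near-linear complexity claims of the abstract concern how these columns decay and how fast the hierarchy can be computed.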