Empirical Bernstein stopping
Sampling is a popular way of scaling up machine learning algorithms to large datasets. The question is often how many samples are needed. Adaptive stopping algorithms monitor performance online and can stop early, saving valuable resources. We consider problems where probabilistic guarantees are desired and demonstrate how recently introduced empirical Bernstein bounds can be used to design efficient stopping rules. We provide upper bounds on the sample complexity of the new rules, as well as empirical results on model selection and on boosting in the filtering setting.
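The kind of stopping rule the abstract describes can be sketched as follows. This is a simplified absolute-precision variant, not the paper's exact algorithm: the function names, the per-round confidence schedule delta/(t*(t+1)), and the parameter defaults are illustrative assumptions.

```python
import math

def eb_radius(samples, delta, value_range):
    """Empirical Bernstein confidence radius for the mean of samples
    bounded in an interval of width value_range."""
    t = len(samples)
    mean = sum(samples) / t
    var = sum((x - mean) ** 2 for x in samples) / t
    return (math.sqrt(2 * var * math.log(3 / delta) / t)
            + 3 * value_range * math.log(3 / delta) / t)

def eb_stop(sample_fn, eps=0.05, delta=0.05, value_range=1.0,
            max_samples=100000):
    """Draw samples until the empirical Bernstein radius falls below eps,
    so that the empirical mean is eps-accurate with probability >= 1 - delta."""
    samples = [sample_fn(), sample_fn()]
    t = 2
    while t < max_samples:
        # Union bound over rounds: spend delta/(t*(t+1)) at round t,
        # which sums to at most delta over all rounds.
        delta_t = delta / (t * (t + 1))
        if eb_radius(samples, delta_t, value_range) <= eps:
            break
        samples.append(sample_fn())
        t += 1
    return sum(samples) / t, t
```

Because the radius shrinks with the *empirical* variance, low-variance sampling problems stop much earlier than a Hoeffding-based rule would allow.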
msBP: An R package to perform Bayesian nonparametric inference using multiscale Bernstein polynomials mixtures
msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016). The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors, such as the generation of random numbers and densities, the creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016).
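The multiscale stick-breaking weights mentioned above can be sketched in a few lines (in Python rather than R, since the listing contains no code). The Beta(1, a) stopping probabilities, Beta(b, b) right-turn probabilities, node indexing, and the name `msbp_weights` are illustrative assumptions following the construction in Canale and Dunson (2016), not the package's API.

```python
import random

def msbp_weights(max_depth, a=5.0, b=1.0, seed=0):
    """Sample truncated multiscale stick-breaking weights on a binary tree.

    Nodes are indexed (s, h) with scale s = 0..max_depth and position
    h = 1..2**s. At each node, S ~ Beta(1, a) is the probability of
    stopping there and R ~ Beta(b, b) the probability of turning right
    when descending past it.
    """
    rng = random.Random(seed)
    S, R = {}, {}
    for s in range(max_depth + 1):
        for h in range(1, 2 ** s + 1):
            S[(s, h)] = rng.betavariate(1.0, a)
            R[(s, h)] = rng.betavariate(b, b)
    weights = {}
    for s in range(max_depth + 1):
        for h in range(1, 2 ** s + 1):
            # Probability of reaching (s, h): follow the binary expansion
            # of h - 1 from the root, not stopping along the way.
            prob, path, node_h = 1.0, h - 1, 1
            for r in range(s):
                bit = (path >> (s - 1 - r)) & 1  # 0 = left, 1 = right
                prob *= (1.0 - S[(r, node_h)])
                prob *= R[(r, node_h)] if bit else (1.0 - R[(r, node_h)])
                node_h = 2 * node_h - 1 + bit
            weights[(s, h)] = prob * S[(s, h)]
    return weights
```

Truncating at `max_depth` leaves the weights summing to slightly less than one; the deficit is the prior probability of descending past the deepest retained scale.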
Convergence rates of Kernel Conjugate Gradient for random design regression
We prove statistical rates of convergence for kernel-based least squares regression from i.i.d. data using a conjugate gradient algorithm, where regularization against overfitting is obtained by early stopping. This method is related to Kernel Partial Least Squares, a regression method that combines supervised dimensionality reduction with least squares projection. Following the setting introduced in earlier related literature, we study so-called "fast convergence rates" depending on the regularity of the target regression function (measured by a source condition in terms of the kernel integral operator) and on the effective dimensionality of the data mapped into the kernel space. We obtain upper bounds, essentially matching known minimax lower bounds, for the L2 (prediction) norm as well as for the stronger Hilbert norm, if the true regression function belongs to the reproducing kernel Hilbert space. If the latter assumption is not fulfilled, we obtain similar convergence rates for appropriate norms, provided additional unlabeled data are available.
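The role of early stopping here can be illustrated with plain conjugate gradient on the kernel matrix: each iterate lives in a growing Krylov space, so capping the iteration count restricts model complexity. This is only a sketch under that simplification; it ignores the weighted inner product used in the paper's analysis, and `kernel_cg` is a hypothetical name.

```python
import numpy as np

def kernel_cg(K, y, max_iter=50, tol=1e-10):
    """Conjugate gradient iterates for K alpha = y.

    Returning an early iterate alpha_m acts as regularization: alpha_m
    lies in the Krylov space span{y, K y, ..., K**(m-1) y}.
    """
    n = len(y)
    alpha = np.zeros(n)
    r = y.copy()          # residual y - K @ alpha
    p = r.copy()          # search direction
    iterates = []
    for _ in range(max_iter):
        Kp = K @ p
        rs_old = r @ r
        denom = p @ Kp
        if denom <= tol:  # numerically rank-deficient direction
            break
        step = rs_old / denom
        alpha = alpha + step * p
        r = r - step * Kp
        iterates.append(alpha.copy())
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
    return iterates
```

In practice the stopping index m would be chosen by a holdout set or a discrepancy-type rule rather than run to convergence, which would interpolate the noise.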
Multiscale Bernstein polynomials for densities
Our focus is on constructing a multiscale nonparametric prior for densities. The Bayes density estimation literature is dominated by single-scale methods, with the exception of Pólya trees, which favor overly spiky densities even when the truth is smooth. We propose a multiscale Bernstein polynomial family of priors, which produce smooth realizations that do not rely on hard partitioning of the support. At each level in an infinitely deep binary tree, we place a beta dictionary density; within a scale the densities are equivalent to Bernstein polynomials. Using a stick-breaking characterization, stochastically decreasing weights are allocated to the finer-scale dictionary elements. A slice sampler is used for posterior computation, and properties are described. The method characterizes densities with locally varying smoothness, and can produce a sequence of coarse-to-fine density estimates. An extension for Bayesian testing of group differences is introduced and applied to DNA methylation array data.
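The within-scale equivalence to Bernstein polynomials can be made concrete with a short evaluation sketch: if node (s, h) carries the dictionary density Beta(h, 2**s - h + 1), then equal weights at a single scale s reproduce a degree-2**s Bernstein polynomial representation. The names `beta_pdf` and `msbp_density` and the (s, h) weight map are hypothetical.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), via log-gamma for stability."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x)
                    + (b - 1) * math.log(1 - x) - log_norm)

def msbp_density(x, weights):
    """Evaluate a (truncated) multiscale beta dictionary mixture at x.

    weights maps (s, h) -> pi_{s,h}; the weights are renormalized so the
    truncated mixture still integrates to one.
    """
    total = sum(weights.values())
    return sum(w * beta_pdf(x, h, 2 ** s - h + 1)
               for (s, h), w in weights.items()) / total
```

As a sanity check, equal weights 1/4 on the four scale-2 dictionary densities Beta(1,4), Beta(2,3), Beta(3,2), Beta(4,1) average exactly to the uniform density, the Bernstein representation of a flat function.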