Statistical Inference for Structured High-dimensional Models
High-dimensional statistical inference is a recently emerged direction of statistical science in the 21st century. Its importance stems from the increasing dimensionality and complexity of the models needed to process and understand modern real-world data. The main idea that makes meaningful inference about such models possible is to assume a suitable lower-dimensional underlying structure, or low-dimensional approximations for which the error can be reasonably controlled. Several types of such structures have recently been introduced, including sparse high-dimensional regression, sparse and/or low-rank matrix models, matrix completion models, dictionary learning, network models (stochastic block models, mixed membership models), and more. The workshop focused on recent developments in structured sequence and regression models, matrix and tensor estimation, robustness, statistical learning in complex settings, network data, and topic models.
Learning Theory and Approximation
Learning theory studies data structures from samples and aims at understanding the unknown functional relations behind them. This leads to interesting theoretical problems which can often be attacked with methods from Approximation Theory. This workshop, the second of this type at the MFO, concentrated on the following recent topics: learning of manifolds and the geometry of data; sparsity and dimension reduction; error analysis and algorithmic aspects, including kernel-based methods for regression and classification; and applications of multiscale aspects and of refinement algorithms to learning.
On using Reproducing Kernel Hilbert Spaces for the analysis of Replicated Spatial Point Processes
This paper focuses on the use of the theory of Reproducing Kernel Hilbert Spaces in the statistical analysis of replicated point processes. We show that spatial point processes can be observed as random variables in a Reproducing Kernel Hilbert Space and, as a result, methodological and theoretical results for statistical analysis in these spaces can be applied to them. In particular, and by way of illustration, we show how the proposed methodology can be used to identify differences between several classes of replicated point patterns using Box's M and MANOVA tests, and to classify a new observation using discriminant functions.
Funding for open access charge: CRUE-Universitat Jaume
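The embedding idea can be sketched in a few lines: each point pattern is mapped to the function t ↦ Σ_j k(t, x_j) in the RKHS of a Gaussian kernel, and a new pattern is classified by its distance to the class mean embeddings. This is a minimal illustration on synthetic 1-D data, not the paper's implementation; the Gaussian kernel, bandwidth, evaluation grid, and the nearest-mean rule (standing in for the Box's M / MANOVA machinery) are all assumptions.

```python
import numpy as np

def embed(pattern, grid, bandwidth=0.5):
    """Map a 1-D point pattern to its empirical RKHS embedding evaluated
    on a grid: mu(t) = sum_j k(t, x_j), with a Gaussian kernel k."""
    diffs = grid[:, None] - np.asarray(pattern)[None, :]
    return np.exp(-diffs**2 / (2 * bandwidth**2)).sum(axis=1)

rng = np.random.default_rng(0)
grid = np.linspace(-2, 8, 200)

# Two hypothetical classes of replicated patterns: points clustered
# near 0 versus near 5 (synthetic data, chosen for illustration).
class_a = [rng.normal(0.0, 0.5, size=20) for _ in range(10)]
class_b = [rng.normal(5.0, 0.5, size=20) for _ in range(10)]

mean_a = np.mean([embed(p, grid) for p in class_a], axis=0)
mean_b = np.mean([embed(p, grid) for p in class_b], axis=0)

# Classify a new pattern by its distance to each class mean embedding.
new_pattern = rng.normal(0.0, 0.5, size=20)
e = embed(new_pattern, grid)
label = "A" if np.linalg.norm(e - mean_a) < np.linalg.norm(e - mean_b) else "B"
print(label)
```

Because the embeddings live in a (discretised) function space, any multivariate procedure that operates on such vectors can in principle be applied to the embedded patterns, which is the point the abstract makes.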
Nonparametric Involutive Markov Chain Monte Carlo: a MCMC algorithm for universal probabilistic programming
Probabilistic programming, the idea of writing probabilistic models as computer programs, has proven to be a powerful tool for statistical analysis thanks to the computational power of built-in inference algorithms. Developing suitable inference algorithms that work for arbitrary programs in a Turing-complete probabilistic programming language (PPL) has become increasingly important. This thesis presents the Nonparametric Involutive Markov chain Monte Carlo (NP-iMCMC) framework for the construction of MCMC inference machines for nonparametric models that can be expressed in Turing-complete PPLs. Relying on the tree-representable structure of probabilistic programs, the NP-iMCMC algorithm automates the trans-dimensional moves in the sampling process and only requires the specification of proposal distributions and mappings on fixed-dimensional spaces, which are provided by inference algorithms such as the popular Hamiltonian Monte Carlo (HMC). We give a theoretical justification for the NP-iMCMC algorithm and put NP-iMCMC into action by introducing the Nonparametric HMC (NP-HMC) algorithm, a nonparametric variant of the HMC sampler. The NP-HMC sampler works out of the box and can be applied to virtually all useful probabilistic models. We further improve NP-HMC by applying the techniques specified for NP-iMCMC to construct irreversible extensions, which show significant performance improvements over other existing inference methods.
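The involutive building block underlying this framework can be illustrated on a fixed-dimensional state space (the nonparametric, trans-dimensional machinery of NP-iMCMC is not reproduced here): augment the state with auxiliary noise, apply a deterministic involution, and accept with the usual Metropolis ratio. The shift involution below, which recovers random-walk Metropolis, is a standard textbook example and an assumption of this sketch, not code from the thesis.

```python
import numpy as np

def involutive_mcmc(log_target, involution, x0, n_steps, rng):
    """Generic fixed-dimensional involutive MCMC: draw auxiliary
    v ~ N(0,1), apply a deterministic involution f to (x, v), and
    accept with the Metropolis ratio on the augmented target
    pi(x) N(v; 0, 1).  For a volume-preserving involution the
    Jacobian correction is 1 and can be omitted."""
    x = x0
    samples = []
    for _ in range(n_steps):
        v = rng.normal()
        x_new, v_new = involution(x, v)
        log_ratio = (log_target(x_new) - 0.5 * v_new**2) \
                  - (log_target(x) - 0.5 * v**2)
        if np.log(rng.uniform()) < log_ratio:
            x = x_new
        samples.append(x)
    return np.array(samples)

# The shift involution f(x, v) = (x + v, -v) is its own inverse and
# volume-preserving; plugging it in recovers random-walk Metropolis.
shift = lambda x, v: (x + v, -v)
log_std_normal = lambda x: -0.5 * x**2

rng = np.random.default_rng(1)
samples = involutive_mcmc(log_std_normal, shift, 0.0, 20000, rng)
print(samples.mean(), samples.var())
```

Other choices of involution (e.g. a leapfrog integrator with momentum flip) yield HMC-style samplers within the same template, which is why the framework only needs proposals and mappings on fixed-dimensional spaces.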
Statistical computation with kernels
Modern statistical inference has seen a tremendous increase in the size and complexity of models and datasets. As such, it has become reliant on advanced computational tools for implementation. A first canonical problem in this area is the numerical approximation of integrals of complex and expensive functions. Numerical integration is required for a variety of tasks, including prediction, model comparison and model choice. A second canonical problem is that of statistical inference for models with intractable likelihoods. These include models with intractable normalisation constants, or models which are so complex that their likelihood cannot be evaluated, but from which data can be generated. Examples include large graphical models, as well as many models in imaging or spatial statistics.
This thesis proposes to tackle these two problems using tools from the kernel methods and Bayesian non-parametrics literature. First, we analyse a well-known algorithm for numerical integration called Bayesian quadrature, and provide consistency and contraction rates. The algorithm is then assessed on a variety of statistical inference problems, and extended in several directions in order to reduce its computational requirements. We then demonstrate how the combination of reproducing kernels with Stein’s method can lead to computational tools which can be used with unnormalised densities, including numerical integration and approximation of probability measures. We conclude by studying two minimum distance estimators derived from kernel-based statistical divergences which can be used for unnormalised and generative models.
In each instance, the tractability provided by reproducing kernels and their properties allows us to provide easily-implementable algorithms whose theoretical foundations can be studied in depth.
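As a flavour of the first contribution, Bayesian quadrature with a Gaussian kernel against the standard normal measure admits a closed-form kernel mean, so the integral estimate is just a weighted sum of function evaluations. The sketch below assumes a Gaussian kernel with unit length-scale and hand-picked nodes; it illustrates the general technique, not the thesis's implementation.

```python
import numpy as np

def bq_weights(nodes, ell=1.0, jitter=1e-8):
    """Bayesian-quadrature weights w = K^{-1} z for a GP with Gaussian
    kernel k(x,y) = exp(-(x-y)^2 / (2 ell^2)), integrating against the
    standard normal measure.  The kernel mean is closed-form:
        z_i = ell / sqrt(ell^2 + 1) * exp(-x_i^2 / (2 (ell^2 + 1))).
    A small jitter on the diagonal keeps the Gram matrix well-conditioned."""
    d = nodes[:, None] - nodes[None, :]
    K = np.exp(-d**2 / (2 * ell**2)) + jitter * np.eye(len(nodes))
    z = ell / np.sqrt(ell**2 + 1) * np.exp(-nodes**2 / (2 * (ell**2 + 1)))
    return np.linalg.solve(K, z)

nodes = np.linspace(-4, 4, 17)
w = bq_weights(nodes)

# Estimate E[X^2] for X ~ N(0,1); the true value is 1.
estimate = w @ nodes**2
print(estimate)
```

The same weights serve any integrand evaluated at the same nodes, which is what makes the approach attractive when function evaluations are expensive.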
CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS
The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting was organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization whose aim is to further classification research.