A Noise-Robust Fast Sparse Bayesian Learning Model
This paper utilizes the hierarchical model structure from the Bayesian Lasso
in the Sparse Bayesian Learning process to develop a new type of probabilistic
supervised learning approach. The hierarchical model structure in this Bayesian
framework is designed such that the priors not only penalize the unnecessary
complexity of the model but are also conditioned on the variance of the
random noise in the data. The hyperparameters in the model are estimated by the
Fast Marginal Likelihood Maximization algorithm, which achieves sparsity, low
computational cost and a faster learning process. We compare our methodology with
two other popular learning models: the Relevance Vector Machine and the
Bayesian Lasso. We test our model on examples involving both simulated and
empirical data, and the results show that this approach has several performance
advantages, such as being fast, sparse and robust to the variance in
random noise. In addition, our method gives a more stable estimate of the
variance of the random error than the other methods in the study.
Comment: 15 pages
Student-t Processes as Alternatives to Gaussian Processes
We investigate the Student-t process as an alternative to the Gaussian
process as a nonparametric prior over functions. We derive closed form
expressions for the marginal likelihood and predictive distribution of a
Student-t process, by integrating away an inverse Wishart process prior over
the covariance kernel of a Gaussian process model. We show surprising
equivalences between different hierarchical Gaussian process models leading to
Student-t processes, and derive a new sampling scheme for the inverse Wishart
process, which helps elucidate these equivalences. Overall, we show that a
Student-t process can retain the attractive properties of a Gaussian process --
a nonparametric representation, analytic marginal and predictive distributions,
and easy model selection through covariance kernels -- but has enhanced
flexibility, and predictive covariances that, unlike a Gaussian process,
explicitly depend on the values of training observations. We verify empirically
that a Student-t process is especially useful in situations where there are
changes in covariance structure, or in applications like Bayesian optimization,
where accurate predictive covariances are critical for good performance. These
advantages come at no additional computational cost over Gaussian processes.
Comment: 13 pages, 6 figures, 1 table. To appear in "The Seventeenth
International Conference on Artificial Intelligence and Statistics (AISTATS),
2014".
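The dependence of the predictive covariance on the training observations can be made concrete. The block below is a sketch of the predictive distribution in the form commonly stated for this model, assuming a zero mean function, degrees of freedom \nu > 2, and a kernel matrix partitioned into training (index 1, size n_1) and test (index 2, size n_2) blocks. The symbols \nu, \beta_1 and the tilde notation are introduced here for illustration and do not appear in the abstract.

```latex
\begin{aligned}
\mathbf{y}_2 \mid \mathbf{y}_1 &\sim \mathrm{MVT}_{n_2}\!\left(\nu + n_1,\ \tilde{\boldsymbol{\phi}}_2,\ \frac{\nu + \beta_1 - 2}{\nu + n_1 - 2}\,\tilde{K}_{22}\right),\\
\tilde{\boldsymbol{\phi}}_2 &= K_{21} K_{11}^{-1} \mathbf{y}_1,\\
\beta_1 &= \mathbf{y}_1^{\top} K_{11}^{-1} \mathbf{y}_1,\\
\tilde{K}_{22} &= K_{22} - K_{21} K_{11}^{-1} K_{12}.
\end{aligned}
```

The scaling factor depends on the observed values through \beta_1, whereas the Gaussian process predictive covariance \tilde{K}_{22} does not. As \nu grows large the factor tends to 1 and the Gaussian process case is recovered.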
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially in recent decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a “parametrization selection”. We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows complex and
possibly high-dimensional data to be represented through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.
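As a small illustration of the simplest topological feature mentioned above, the number of connected components, the sketch below links points that lie within a distance `eps` of each other and counts the resulting groups with a union-find structure. The function name `count_components`, the threshold `eps` and the sample points are hypothetical choices made for this example and are not taken from the thesis.

```python
import math


def count_components(points, eps):
    """Count connected components of the graph linking points within eps."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                # Merge the two groups containing i and j.
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Two well-separated clusters in the plane (illustrative data).
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0), (5.5, 5.0)]
components_at_scale = {eps: count_components(points, eps) for eps in (0.1, 1.0, 10.0)}
```

Varying `eps` shows how the count changes with scale: every point is isolated at a small threshold, the two clusters appear at an intermediate one, and everything merges at a large one.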
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
On Sharp Identification Regions for Regression Under Interval Data
The reliable analysis of interval data (coarsened data) is one of the
most promising applications of imprecise probabilities in statistics. If one
refrains from making untestable, and often materially unjustified, strong
assumptions on the coarsening process, then the empirical distribution
of the data is imprecise, and statistical models are, in Manski's terms,
partially identified. We first elaborate some subtle differences between
two natural ways of handling interval data in the dependent variable of
regression models, distinguishing between two different types of identification
regions, called Sharp Marrow Region (SMR) and Sharp Collection
Region (SCR) here. Focusing on the case of linear regression analysis, we
then derive some fundamental geometrical properties of SMR and SCR,
allowing a comparison of the regions and providing some guidelines for
their canonical construction.
Relying on the algebraic framework of adjunctions of two mappings between
partially ordered sets, we characterize SMR as a right adjoint and
as the monotone kernel of a criterion function based mapping, while SCR
is indeed interpretable as the corresponding monotone hull. Finally we
sketch some ideas on a compromise between SMR and SCR based on a
set-domained loss function.
This paper is an extended version of a shorter paper with the same title,
which is conditionally accepted for publication in the Proceedings of
the Eighth International Symposium on Imprecise Probability: Theories
and Applications. In the present paper we added proofs and a seventh
chapter with a small Monte Carlo illustration, which would have made the
original paper too long.
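To give a feel for partial identification with an interval-valued dependent variable, the sketch below computes the range of ordinary least squares slopes over all point-valued responses compatible with the observed intervals. It relies only on the fact that the OLS slope is linear in the responses, so its extremes are attained at interval endpoints. The function name `slope_bounds` and the data are hypothetical, and the sketch does not claim to reproduce the SMR or SCR constructions defined in the paper.

```python
def slope_bounds(x, y_lower, y_upper):
    """Range of simple-regression OLS slopes when each y_i lies in [y_lower_i, y_upper_i]."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    lowest = 0.0
    highest = 0.0
    for xi, yl, yu in zip(x, y_lower, y_upper):
        # The slope is sum_i c_i * y_i with c_i = (x_i - x_bar) / Sxx.
        c = (xi - x_bar) / sxx
        if c >= 0:
            lowest += c * yl
            highest += c * yu
        else:
            lowest += c * yu
            highest += c * yl
    return lowest, highest


# Illustrative data: three observations with unit-width response intervals.
x = [0.0, 1.0, 2.0]
bounds = slope_bounds(x, [0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
# Degenerate intervals (precise data) collapse the range to a single slope.
precise = slope_bounds(x, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
```

With precise data the two bounds coincide, which is the usual point-identified case. Wider intervals widen the set of slopes that the data cannot rule out.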
On the shape of posterior densities and credible sets in instrumental variable regression models with reduced rank: an application of flexible sampling methods using neural networks
Likelihoods and posteriors of instrumental variable regression models with strong endogeneity and/or weak instruments may exhibit rather non-elliptical contours in the parameter space. This may seriously affect inference based on Bayesian credible sets. When approximating such contours using Monte Carlo integration methods like importance sampling or Markov chain Monte Carlo procedures the speed of the algorithm and the quality of the results greatly depend on the choice of the importance or candidate density. Such a density has to be 'close' to the target density in order to yield accurate results with numerically efficient sampling. For this purpose we introduce neural networks which seem to be natural importance or candidate densities, as they have a universal approximation property and are easy to sample from. A key step in the proposed class of methods is the construction of a neural network that approximates the target density accurately. The methods are tested on a set of illustrative models. The results indicate the feasibility of the neural network approach.
Keywords: Markov chain Monte Carlo; Bayesian inference; credible sets; importance sampling; instrumental variables; neural networks; reduced rank
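The role of the candidate density can be illustrated with a minimal self-normalized importance sampling sketch. It deliberately swaps in a plain Gaussian candidate rather than the neural network candidate the abstract proposes, and the target (an unnormalized N(1, 1) density), the candidate scale and all function names are assumptions made for this example.

```python
import math
import random


def log_target(x):
    # Unnormalized log density of the illustrative target N(1, 1).
    return -0.5 * (x - 1.0) ** 2


def log_candidate(x, scale):
    # Log density of the Gaussian candidate N(0, scale^2).
    return -0.5 * (x / scale) ** 2 - math.log(scale * math.sqrt(2.0 * math.pi))


def importance_estimate(n, seed=0, scale=2.0):
    """Estimate the target mean and the effective sample size from n candidate draws."""
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, scale) for _ in range(n)]
    log_w = [log_target(x) - log_candidate(x, scale) for x in draws]

    # Subtract the maximum log weight before exponentiating for numerical stability.
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]

    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, draws)) / total
    ess = total ** 2 / sum(w * w for w in weights)
    return mean, ess


posterior_mean, ess = importance_estimate(20000)
```

The effective sample size `ess` drops as the candidate moves away from the target, which is one way to see why a candidate that is close to the target matters for numerical efficiency.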
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
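A short example shows why directional data call for their own methods. The sketch below computes the mean direction and mean resultant length of a set of angles by averaging unit vectors, and contrasts the result with the arithmetic mean. The function name `mean_direction` and the two sample angles are chosen for illustration only.

```python
import math


def mean_direction(angles_deg):
    """Return the mean direction in degrees in [0, 360) and the mean resultant length."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    mean_deg = math.degrees(math.atan2(s, c)) % 360.0
    resultant_length = math.hypot(c, s) / len(angles_deg)
    return mean_deg, resultant_length


# Two directions on either side of north (0 degrees).
angles = [350.0, 20.0]
arithmetic_mean = sum(angles) / len(angles)      # points roughly south
circular_mean, r_bar = mean_direction(angles)    # points slightly east of north
```

The arithmetic mean of 350 and 20 degrees is 185 degrees, which points almost opposite to both observations, while the mean direction lies between them. The mean resultant length is close to 1 when the directions are tightly concentrated.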