Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
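To illustrate why Euclidean methodology does not transfer directly to the circle, the following minimal sketch compares the arithmetic mean of two angles with their circular mean (the direction of the summed unit vectors). The function names and the two example angles are chosen here for illustration only and do not come from the review.

```python
import math

def circular_mean(angles_rad):
    """Mean direction: angle of the summed unit vectors, in [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.atan2(s, c) % (2 * math.pi)

def mean_resultant_length(angles_rad):
    """Concentration measure in [0, 1]; 1 means all angles coincide."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.hypot(s, c) / len(angles_rad)

# Two directions just either side of north (illustrative values).
angles_deg = [350.0, 10.0]
angles = [math.radians(a) for a in angles_deg]

arithmetic = sum(angles_deg) / len(angles_deg)   # points due south
circ = math.degrees(circular_mean(angles))       # close to 0 (equivalently 360)
rbar = mean_resultant_length(angles)             # equals cos(10 degrees)

print(arithmetic, circ, rbar)
```

The arithmetic mean lands at 180 degrees, opposite to both observations, whereas the circular mean recovers a direction near north.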
Exploring wind direction and SO2 concentration by circular-linear density estimation
The study of environmental problems usually requires the description of
variables with different nature and the assessment of relations between them.
In this work, an algorithm for flexible estimation of the joint density for a
circular-linear variable is proposed. The method is applied for exploring the
relation between wind direction and SO2 concentration in a monitoring station
close to a power plant located in Galicia (NW-Spain), in order to compare the
effectiveness of precautionary measures for pollutants reduction in two
different years.
Comment: 17 pages, 7 figures, 2 tables
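The abstract does not describe the proposed algorithm, so the sketch below is not the authors' method. It shows one generic way to estimate a circular-linear joint density: a product-kernel estimator with a von Mises kernel for the angle and a Gaussian kernel for the linear variable. The smoothing parameters `kappa` and `h` and the toy sample are assumptions invented for illustration, not data from the study.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I0 via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_kernel(theta, centre, kappa):
    """Von Mises density on the circle, centred at `centre`."""
    return math.exp(kappa * math.cos(theta - centre)) / (2 * math.pi * bessel_i0(kappa))

def gaussian_kernel(x, centre, h):
    """Gaussian density on the line with bandwidth h."""
    z = (x - centre) / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2 * math.pi))

def circular_linear_kde(theta, x, sample, kappa=5.0, h=1.0):
    """Product-kernel density estimate at (theta, x)."""
    return sum(
        von_mises_kernel(theta, t, kappa) * gaussian_kernel(x, v, h)
        for t, v in sample
    ) / len(sample)

# Toy (direction in radians, concentration) pairs, invented for illustration.
sample = [(0.2, 3.1), (0.4, 2.7), (6.1, 3.5), (3.0, 9.8), (3.2, 10.4)]

density = circular_linear_kde(0.3, 3.0, sample)
wrapped = circular_linear_kde(0.3 + 2 * math.pi, 3.0, sample)
print(density, wrapped)
```

Because the angular kernel depends on the angle only through a cosine, the estimate is automatically periodic, so observations near 0 and near 2π are treated as neighbours.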
Nonparametric Dynamic State Space Modeling of Observed Circular Time Series with Circular Latent States: A Bayesian Perspective
Circular time series have received relatively little attention in statistics
and modeling complex circular time series using the state space approach is
non-existent in the literature. In this article we introduce a flexible
Bayesian nonparametric approach to state space modeling of observed circular
time series where even the latent states are circular random variables.
Crucially, we assume that the forms of both observational and evolutionary
functions, both of which are circular in nature, are unknown and time-varying.
We model these unknown circular functions by appropriate wrapped Gaussian
processes having desirable properties.
We develop an effective Markov chain Monte Carlo strategy for implementing
our Bayesian model, by judiciously combining Gibbs sampling and
Metropolis-Hastings methods. Validation of our ideas with a simulation study
and two real bivariate circular time series data sets, where we assume one of
the variables to be unobserved, revealed very encouraging performance of our
model and methods.
We finally analyse a data set consisting of directions of whale migration,
considering the unobserved ocean current direction as the latent circular
process of interest. The results that we obtain are encouraging, and the
posterior predictive distribution of the observed process correctly predicts
the observed whale movement.
Comment: This significantly updated version will appear in Journal of
Statistical Theory and Practice
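As background for the wrapping idea mentioned in the abstract, the sketch below evaluates the density of a scalar wrapped normal variable, obtained by taking a Gaussian on the line modulo 2π. This is only the simplest scalar analogue, not the wrapped Gaussian process model of the article; the truncation level `n_wraps` and the parameter values are assumptions made for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Gaussian density on the real line."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def wrapped_normal_pdf(theta, mean, sd, n_wraps=10):
    """Density of (X mod 2*pi) for X ~ N(mean, sd^2), truncating the
    infinite sum over windings to k = -n_wraps, ..., n_wraps."""
    return sum(
        normal_pdf(theta + 2 * math.pi * k, mean, sd)
        for k in range(-n_wraps, n_wraps + 1)
    )

# Numerical check that the wrapped density integrates to about 1 over one turn.
n = 2000
step = 2 * math.pi / n
total_mass = sum(wrapped_normal_pdf(i * step, 1.0, 1.0) for i in range(n)) * step
print(total_mass)
```

Summing the Gaussian over all windings is what makes the resulting density periodic, which is the property a circular observation or state requires.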
Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians
This paper presents a general and efficient framework for probabilistic
inference and learning from arbitrary uncertain information. It exploits the
calculation properties of finite mixture models, conjugate families and
factorization. Both the joint probability density of the variables and the
likelihood function of the (objective or subjective) observation are
approximated by a special mixture model, in such a way that any desired
conditional distribution can be directly obtained without numerical
integration. We have developed an extended version of the expectation
maximization (EM) algorithm to estimate the parameters of mixture models from
uncertain training examples (indirect observations). As a consequence, any
piece of exact or uncertain information about both input and output values is
consistently handled in the inference and learning stages. This ability,
extremely useful in certain situations, is not found in most alternative
methods. The proposed framework is formally justified from standard
probabilistic principles and illustrative examples are provided in the fields
of nonparametric pattern classification, nonlinear regression and pattern
completion. Finally, experiments on a real application and comparative results
over standard databases provide empirical evidence of the utility of the method
in a wide range of applications.
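The claim that conditional distributions follow without numerical integration can be illustrated with a simplified case. The sketch below uses a mixture of ordinary factorized (diagonal) Gaussians over two variables rather than the generalized Gaussians of the paper, and computes E[y | x] in closed form: each component's weight is rescaled by its marginal density at x, and the component means of y are averaged with those weights. The component values and the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Gaussian density on the real line."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def conditional_mean_y_given_x(x, components):
    """E[y | x] for a mixture whose components factorize as
    N(x; mean_x, sd_x) * N(y; mean_y, sd_y).

    components: list of (weight, mean_x, sd_x, mean_y, sd_y).
    """
    # Unnormalised responsibilities: prior weight times marginal density at x.
    resp = [w * normal_pdf(x, mx, sx) for w, mx, sx, _, _ in components]
    total = sum(resp)
    # Within a factorized component, y is independent of x, so its
    # conditional mean is simply mean_y.
    return sum(r * c[3] for r, c in zip(resp, components)) / total

# Two hypothetical components: low x goes with y near 0, high x with y near 10.
components = [
    (0.5, -1.0, 1.0, 0.0, 1.0),
    (0.5, 1.0, 1.0, 10.0, 1.0),
]

at_zero = conditional_mean_y_given_x(0.0, components)
at_high = conditional_mean_y_given_x(5.0, components)
print(at_zero, at_high)
```

Although every component treats x and y as independent, the mixture as a whole captures their dependence through the x-dependent weights, which is what yields a nonlinear regression curve.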
Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures
In this article we propose novel Bayesian nonparametric methods using
Dirichlet Process Mixture (DPM) models for detecting pairwise dependence
between random variables while accounting for uncertainty in the form of the
underlying distributions. A key criterion is that the procedures should scale to
large data sets. In this regard we find that the formal calculation of the
Bayes factor for a dependent-vs.-independent DPM joint probability measure is
not feasible computationally. To address this we present Bayesian diagnostic
measures for characterising evidence against a "null model" of pairwise
independence. In simulation studies, as well as for a real data analysis, we
show that our approach provides a useful tool for the exploratory nonparametric
Bayesian analysis of large multivariate data sets.