
    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed. Comment: 61 pages
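
    As a small illustration of why Euclidean summaries can mislead on the circle, the sketch below compares the arithmetic mean of two compass bearings with their circular mean direction. It is not taken from the reviewed paper; the function names `circular_mean` and `mean_resultant_length` and the sample bearings are assumptions chosen only for this example.

```python
import math


def circular_mean(angles_deg):
    """Mean direction of angles given in degrees, returned in [0, 360)."""
    # Sum the unit vectors (cos, sin) and take the direction of the resultant.
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def mean_resultant_length(angles_deg):
    """Length of the average unit vector: near 1 = concentrated, near 0 = dispersed."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    return math.hypot(s, c)


# Hypothetical sample: two bearings on either side of north.
bearings = [350.0, 20.0]

arithmetic = sum(bearings) / len(bearings)   # treats the angles as points on a line
circular = circular_mean(bearings)           # respects the wrap-around at 360 degrees
concentration = mean_resultant_length(bearings)

print(f"arithmetic mean: {arithmetic:.1f} degrees")
print(f"circular mean:   {circular:.1f} degrees")
print(f"mean resultant length: {concentration:.3f}")
```

    The arithmetic mean points almost due south even though both bearings lie close to north, whereas the circular mean stays between them.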

    Exploring wind direction and SO2 concentration by circular-linear density estimation

    The study of environmental problems usually requires the description of variables of different nature and the assessment of relations between them. In this work, an algorithm for flexible estimation of the joint density for a circular-linear variable is proposed. The method is applied to explore the relation between wind direction and SO2 concentration at a monitoring station close to a power plant located in Galicia (NW Spain), in order to compare the effectiveness of precautionary measures for pollutant reduction in two different years. Comment: 17 pages, 7 figures, 2 tables

    Nonparametric Dynamic State Space Modeling of Observed Circular Time Series with Circular Latent States: A Bayesian Perspective

    Circular time series have received relatively little attention in statistics, and modeling of complex circular time series using the state space approach is non-existent in the literature. In this article we introduce a flexible Bayesian nonparametric approach to state space modeling of observed circular time series where even the latent states are circular random variables. Crucially, we assume that the forms of both observational and evolutionary functions, both of which are circular in nature, are unknown and time-varying. We model these unknown circular functions by appropriate wrapped Gaussian processes having desirable properties. We develop an effective Markov chain Monte Carlo strategy for implementing our Bayesian model, by judiciously combining Gibbs sampling and Metropolis-Hastings methods. Validation of our ideas with a simulation study and two real bivariate circular time series data sets, where we assume one of the variables to be unobserved, revealed very encouraging performance of our model and methods. Finally, we analyse a data set consisting of directions of whale migration, considering the unobserved ocean current direction as the latent circular process of interest. The results that we obtain are encouraging, and the posterior predictive distribution of the observed process correctly predicts the observed whale movement. Comment: This significantly updated version will appear in Journal of Statistical Theory and Practice

    Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians

    This paper presents a general and efficient framework for probabilistic inference and learning from arbitrary uncertain information. It exploits the calculation properties of finite mixture models, conjugate families and factorization. Both the joint probability density of the variables and the likelihood function of the (objective or subjective) observation are approximated by a special mixture model, in such a way that any desired conditional distribution can be directly obtained without numerical integration. We have developed an extended version of the expectation maximization (EM) algorithm to estimate the parameters of mixture models from uncertain training examples (indirect observations). As a consequence, any piece of exact or uncertain information about both input and output values is consistently handled in the inference and learning stages. This ability, extremely useful in certain situations, is not found in most alternative methods. The proposed framework is formally justified from standard probabilistic principles, and illustrative examples are provided in the fields of nonparametric pattern classification, nonlinear regression and pattern completion. Finally, experiments on a real application and comparative results over standard databases provide empirical evidence of the utility of the method in a wide range of applications.

    Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures

    In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.