34 research outputs found

    Skew-rotationally-symmetric distributions and related efficient inferential procedures

    Peer reviewed. Most commonly used distributions on the unit hypersphere $S^{k-1}=\{v\in\mathbb{R}^k : v^\top v=1\}$, $k\geq 2$, assume that the data are rotationally symmetric about some direction $\theta\in S^{k-1}$. However, there is empirical evidence that this assumption often fails to describe reality. We study in this paper a new class of skew-rotationally-symmetric distributions on $S^{k-1}$ that enjoy numerous good properties. We discuss the Fisher information structure of the model and derive efficient inferential procedures. In particular, we obtain the first semi-parametric test for rotational symmetry about a known direction. We also propose a second test for rotational symmetry, obtained through the definition of a new measure of skewness on the hypersphere. We investigate the finite-sample behavior of the new tests through a Monte Carlo simulation study. We conclude the paper with a discussion about some intriguing open questions related to our new models.

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. Comment: 61 pages

    Coming together of Bayesian inference and skew spherical data

    This paper presents Bayesian directional data modeling via the skew-rotationally-symmetric Fisher-von Mises-Langevin (FvML) distribution. The prior distributions for the parameters are a pivotal building block in Bayesian analysis; therefore, the impact of the proposed priors is quantified using the Wasserstein Impact Measure (WIM) to guide the practitioner in the implementation process. For the computation of the posterior, modifications of Gibbs and slice sampling are applied to generate samples. We demonstrate the applicability of our contribution via synthetic and real data analyses. Our investigation paves the way for Bayesian analysis of skew circular and spherical data. The Visiting Professor programme, University of Pretoria and the National Research Foundation (NRF) of South Africa, SARChI Research Chair and DSI-NRF Centre of Excellence in Mathematical and Statistical Sciences (CoE-MaSS), South Africa. https://www.frontiersin.org/journals/big-data

    Curved factor analysis with the Ellipsoid-Gaussian distribution

    There is a need for new models for characterizing dependence in multivariate data. The multivariate Gaussian distribution is routinely used, but cannot characterize nonlinear relationships in the data. Most non-linear extensions tend to be highly complex; for example, involving estimation of a non-linear regression model in latent variables. In this article, we propose a relatively simple class of Ellipsoid-Gaussian multivariate distributions, which are derived by using a Gaussian linear factor model involving latent variables having a von Mises-Fisher distribution on a unit hyper-sphere. We show that the Ellipsoid-Gaussian distribution can flexibly model curved relationships among variables with lower-dimensional structures. Taking a Bayesian approach, we propose a hybrid of gradient-based geodesic Monte Carlo and adaptive Metropolis for posterior sampling. We derive basic properties and illustrate the utility of the Ellipsoid-Gaussian distribution on a variety of simulated and real data applications. An accompanying R package is also available.

    Sine-skewed toroidal distributions and their application in protein bioinformatics

    In the bioinformatics field, there has been a growing interest in modelling dihedral angles of amino acids by viewing them as data on the torus. This has motivated, over the past years, new proposals of distributions on the bivariate torus. The main drawback of most of these models is that the related densities are (pointwise) symmetric, despite the fact that the data usually present asymmetric patterns. This motivates the need to find a new way of constructing asymmetric toroidal distributions starting from a symmetric distribution. We tackle this problem in this paper by introducing the sine-skewed toroidal distributions. The general properties of the new models are derived. Based on the initial symmetric model, explicit expressions for the shape parameters are obtained, a simple algorithm for generating random numbers is provided, and asymptotic results for the maximum likelihood estimators are established. An important feature of our construction is that no normalizing constant needs to be calculated, leading to more flexible distributions without increasing the complexity of the models. The benefit of employing these new sine-skewed distributions is shown on the basis of protein data, where, in general, the new models outperform their symmetric antecedents.
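The random-number generation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual sine-skewing form, in which a pointwise-symmetric base density f centred at mu is multiplied by 1 + lam_1 sin(x_1 - mu_1) + lam_2 sin(x_2 - mu_2) with |lam_1| + |lam_2| <= 1, and it assumes independent von Mises angles as the base model. The function name `sample_sine_skewed_torus` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_sine_skewed_torus(n, mu, kappa, lam, rng=None):
    """Draw n points from a sine-skewed bivariate distribution on the torus.

    Assumed base model: independent von Mises(mu[i], kappa[i]) angles, which
    is pointwise symmetric about mu. Assumed skewing factor:
    1 + lam[0]*sin(x0 - mu[0]) + lam[1]*sin(x1 - mu[1]).
    """
    rng = np.random.default_rng() if rng is None else rng
    mu, kappa, lam = (np.asarray(a, dtype=float) for a in (mu, kappa, lam))
    if np.abs(lam).sum() > 1:
        raise ValueError("need |lam_1| + |lam_2| <= 1")
    # Step 1: sample from the symmetric base density.
    y = rng.vonmises(mu, kappa, size=(n, 2))
    # Step 2: keep each point with probability (1 + sum_i lam_i*sin(y_i - mu_i)) / 2,
    # otherwise reflect it through the centre mu.
    keep_prob = 0.5 * (1.0 + np.sin(y - mu) @ lam)
    keep = rng.uniform(size=n) < keep_prob
    x = np.where(keep[:, None], y, 2.0 * mu - y)
    # Wrap the angles back to [-pi, pi).
    return np.mod(x + np.pi, 2.0 * np.pi) - np.pi
```

Because the reflection step only reuses draws from the symmetric base model, the sketch never evaluates a normalizing constant, which is consistent with the feature highlighted in the abstract.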

    Enhancing wind direction prediction of South Africa wind energy hotspots with Bayesian mixture modeling

    Wind energy production depends not only on wind speed but also on wind direction. Thus, accurately predicting and estimating the wind direction at a site improves the measurement of its wind energy potential. The uncertain nature of wind direction can be represented through probability distributions, and Bayesian analysis can improve the modeling of wind direction by using prior knowledge to update the empirical evidence. The model must align with the nature of the empirical evidence, that is, whether the data are skewed, multimodal, or neither. So far, mixtures of von Mises distributions have been used within directional statistics to model wind direction and capture the multimodality present in the data. In this paper, because of the skewed and multimodal patterns of wind direction at the different sites under study, a mixture of multimodal skewed von Mises distributions is proposed for wind direction. Furthermore, a Bayesian analysis is presented to take into account the uncertainty inherent in the proposed wind direction model. A simulation study is conducted to evaluate the performance of the proposed Bayesian model. The proposed model is fitted to wind direction datasets from Marion Island and two wind farms in South Africa and shows the superiority of the approach. The posterior predictive distribution is applied to forecast the wind direction on a wind farm. It is concluded that the proposed model offers an accurate prediction by means of credible intervals. The mean wind direction of Marion Island in 2017, obtained from 1079 observations, was 5.0242 radians, while the predicted mean wind direction from our proposed method and its corresponding 95% credible interval, based on 100 samples generated from the posterior predictive distribution, are 5.0171 and (4.7442, 5.2900). Therefore, our results open a new approach for accurate prediction of wind direction, implementing a Bayesian approach via a mixture of skew circular distributions. https://www.nature.com/srep
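Summaries such as a mean wind direction in radians are usually computed with circular rather than ordinary arithmetic, since angles near 0 and 2*pi are neighbours. The sketch below shows one common way to do this. It assumes the mean direction is the angle of the average unit vector, and it builds an equal-tailed interval from deviations around that mean. The helper names `circular_mean` and `circular_interval` are hypothetical, and the abstract does not state that the authors computed their summaries this way.

```python
import numpy as np

def circular_mean(angles):
    """Mean direction in radians on [0, 2*pi): angle of the average unit vector."""
    angles = np.asarray(angles, dtype=float)
    return np.arctan2(np.sin(angles).mean(), np.cos(angles).mean()) % (2.0 * np.pi)

def circular_interval(samples, level=0.95):
    """Equal-tailed interval for angles, built on deviations from the circular mean."""
    samples = np.asarray(samples, dtype=float)
    centre = circular_mean(samples)
    # Signed deviations in [-pi, pi), so the interval does not break at 0 / 2*pi.
    dev = np.mod(samples - centre + np.pi, 2.0 * np.pi) - np.pi
    lo, hi = np.quantile(dev, [(1 - level) / 2, (1 + level) / 2])
    # Endpoints are left unwrapped, so they may fall outside [0, 2*pi).
    return centre + lo, centre + hi
```

Applied to draws from a posterior predictive distribution, `circular_mean` would give a predicted mean direction and `circular_interval` an interval around it, under the assumptions stated above.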

    Statistical methods for random rotations

    The analysis of orientation data is a growing field in statistics. Though the rotationally symmetric location model for orientation data is simple, statistical methods for estimation and inference for the location parameter S are limited. In this dissertation we develop point estimation and confidence region methods for the central orientation. Both extrinsic and intrinsic approaches to estimating the central orientation S have been proposed in the literature, but no rigorous comparison of the approaches is available. In Chapter 2 we consider both intrinsic and extrinsic estimators of the central orientation and compare their statistical properties in a simulation study. In particular we consider the projected mean, geometric mean and geometric median. In addition we introduce the projected median as a novel robust estimator of the location parameter. The results of a simulation study suggest the projected median is the preferred estimator because of its low bias and mean square error. Non-parametric confidence regions for the central orientation have been proposed in the literature, but they have undesirable coverage rates for small samples. In Chapter 3 we propose a nonparametric pivotal bootstrap to calibrate confidence regions for the central orientation. We demonstrate the benefits of using calibrated confidence regions in a simulation study and prove the proposed bootstrap method is consistent. Robust statistical methods for estimating the central orientation have received very little attention. In Chapter 4 we explore the finite sample and asymptotic properties of the projected median. In particular we derive the asymptotic distribution of the projected median and show it is SB-robust for the Cayley and matrix Fisher distributions. Confidence regions for the central orientation S are proposed, which can be shown to have preferable finite sample coverage rates compared to those based on the projected mean. Finally, the rotations package is developed in Chapter 5, which contains functions for the statistical analysis of rotation data in SO(3).
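To make the extrinsic estimator concrete, here is a minimal sketch of a projected mean in SO(3). It assumes the standard construction: average the 3x3 rotation matrices entrywise, then project the average back onto SO(3) through its singular value decomposition. The function names are hypothetical and this is not the API of the rotations package mentioned in the abstract.

```python
import numpy as np

def projected_mean(rotations):
    """Project the arithmetic mean of 3x3 rotation matrices back onto SO(3)."""
    r_bar = np.mean(np.asarray(rotations, dtype=float), axis=0)
    u, _, vt = np.linalg.svd(r_bar)
    # Flip the last singular direction if needed so that det(result) = +1.
    d = np.diag([1.0, 1.0, np.sign(np.linalg.det(u @ vt))])
    return u @ d @ vt

def rot_z(angle):
    """Rotation by `angle` radians about the z-axis (helper for the demo)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Rotations about a common axis: the projected mean rotates by the middle angle.
s_hat = projected_mean([rot_z(0.1), rot_z(0.2), rot_z(0.3)])
```

The entrywise average is generally not a rotation matrix, which is why the projection step is needed. The intrinsic (geometric) estimators named in the abstract instead minimize distances measured within SO(3) and are not covered by this sketch.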

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for improving efficiency are made for each hotel studied.

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references

    Nonparametric inference with directional and linear data

    The term directional data refers to data whose support is a circumference, a sphere or, generally, a hypersphere of arbitrary dimension. This kind of data appears naturally in several applied disciplines: proteomics, environmental sciences, biology, astronomy, image analysis or text mining. The aim of this thesis is to provide new methodological tools for nonparametric inference with directional and linear data (i.e., usual Euclidean data). Nonparametric methods are obtained for both estimation and testing, for the density and the regression curves, in situations where directional random variables are present, that is, directional, directional-linear and directional-directional random variables. The main contributions of the thesis are collected in six papers briefly described in what follows. In García-Portugués et al. (2013a), different ways of estimating circular-linear and circular-circular densities via copulas are explored for an environmental application. A new directional-linear kernel density estimator is introduced in García-Portugués et al. (2013b) together with its basic properties. Three new bandwidth selectors for the kernel density estimator with directional data are given in García-Portugués (2013) and compared with the available ones. The directional-linear estimator is used in García-Portugués et al. (2014a) for constructing an independence test for directional and linear variables that is applied to study the dependence between wildfire orientation and size. In García-Portugués et al. (2014b), a central limit theorem for the integrated squared error of the directional-linear estimator is presented. This result is used to derive the asymptotic distribution of the independence test and of a goodness-of-fit test for parametric directional-linear and directional-directional densities. Finally, a local linear estimator with directional predictor and linear response is given in García-Portugués et al. (2014) jointly with a goodness-of-fit test for parametric regression functions.