36 research outputs found

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed. Comment: 61 pages

    Nonparametric inference with directional and linear data

    The term directional data refers to data whose support is a circumference, a sphere or, generally, a hypersphere of arbitrary dimension. This kind of data appears naturally in several applied disciplines: proteomics, environmental sciences, biology, astronomy, image analysis or text mining. The aim of this thesis is to provide new methodological tools for nonparametric inference with directional and linear data (i.e., usual Euclidean data). Nonparametric methods are obtained for both estimation and testing, for the density and the regression curves, in situations where directional random variables are present, that is, directional, directional-linear and directional-directional random variables. The main contributions of the thesis are collected in six papers briefly described in what follows. In García-Portugués et al. (2013a) different ways of estimating circular-linear and circular-circular densities via copulas are explored for an environmental application. A new directional-linear kernel density estimator is introduced in García-Portugués et al. (2013b) together with its basic properties. Three new bandwidth selectors for the kernel density estimator with directional data are given in García-Portugués (2013) and compared with the available ones. The directional-linear estimator is used in García-Portugués et al. (2014a) for constructing an independence test for directional and linear variables that is applied to study the dependence between wildfire orientation and size. In García-Portugués et al. (2014b) a central limit theorem for the integrated squared error of the directional-linear estimator is presented. This result is used to derive the asymptotic distribution of the independence test and of a goodness-of-fit test for parametric directional-linear and directional-directional densities. Finally, a local linear estimator with directional predictor and linear response is given in García-Portugués et al. (2014) jointly with a goodness-of-fit test for parametric regression functions.

    A Framework for Statistical Modeling of Wind Speed and Wind Direction

    Atmospheric near-surface wind speed and wind direction play an important role in many applications, ranging from air quality modeling, building design, and wind turbine placement to climate change research. It is therefore crucial to accurately estimate the joint probability distribution of wind speed and direction. This dissertation aims to provide a modeling framework for studying the variation of wind speed and wind direction. To this end, three projects are conducted to address some of the key issues for modeling wind vectors. First, a conditional decomposition approach is developed to model the joint distribution of wind speed and direction. Specifically, the joint distribution is decomposed into the product of the marginal distribution of wind direction and the conditional distribution of wind speed given wind direction. A von Mises mixture model is used to accommodate the circular nature of wind direction. The conditional wind speed distribution is modeled as a direction-dependent Weibull distribution via a two-stage estimation procedure, consisting of a directionally binned Weibull parameter estimation, followed by a harmonic regression to estimate the functional dependence of the Weibull parameters on wind direction. The conditional decomposition approach allows the modeling of complex distributions with relatively simple and flexible univariate models. Moreover, by studying the variations of wind speed with respect to wind direction, we gain valuable insights that would be overlooked if we focused on wind speed alone. These insights have significant implications for a wide range of applications involving wind data. This conditional modeling framework is further extended to investigate the potential enhancement of estimating extreme wind speeds. Specifically, parametric extreme value modeling approaches, including block maxima, peaks-over-thresholds, and point process methods, are utilized to model the upper tail of the conditional wind speed distribution. The purpose of this extension is to avoid misspecification issues associated with the Weibull model and to improve estimation efficiency. Simulation studies, analysis of output from climate model simulation, and model comparisons are discussed. A key feature of the wind field data is its complicated temporal and spatial structure. Therefore, the final goal of this dissertation involves the spatio-temporal modeling of wind speed. The proposed model captures the seasonal variation and temporal and spatial variability by decomposing the wind speed process into the "global structure" of the spatio-temporal mean component, the "local structure" that consists of a combination of time-varying empirical orthogonal functions (EOFs), and a first-order dynamical spatial Gaussian process (GP). A crucial element of the proposed decomposition is leveraging the inherent circularity of the annual seasonal cycle to create effective replications in time. This enables us to employ more flexible nonstationary space-time modeling through EOF analysis and enhance computational efficiency using dynamical GPs.
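
The conditional decomposition described in this abstract can be written out as a short sketch. The symbols are assumptions introduced here for illustration and are not taken from the dissertation: V is wind speed, Θ is wind direction, K is the number of von Mises mixture components, β(θ) and λ(θ) are the Weibull shape and scale, and the log link with J harmonics is one plausible form of the harmonic regression, not necessarily the one the author uses.

```latex
\begin{align*}
f_{V,\Theta}(v,\theta) &= f_{\Theta}(\theta)\, f_{V\mid\Theta}(v \mid \theta), \\
f_{\Theta}(\theta) &= \sum_{k=1}^{K} w_k \,
  \frac{\exp\{\kappa_k \cos(\theta - \mu_k)\}}{2\pi I_0(\kappa_k)},
  \qquad \sum_{k=1}^{K} w_k = 1, \\
f_{V\mid\Theta}(v \mid \theta) &= \frac{\beta(\theta)}{\lambda(\theta)}
  \left(\frac{v}{\lambda(\theta)}\right)^{\beta(\theta)-1}
  \exp\left\{-\left(\frac{v}{\lambda(\theta)}\right)^{\beta(\theta)}\right\},
  \qquad v \ge 0, \\
\log \lambda(\theta) &= a_0 + \sum_{j=1}^{J}
  \bigl(a_j \cos j\theta + b_j \sin j\theta\bigr)
  \quad \text{(assumed form; similarly for } \beta(\theta)\text{)}.
\end{align*}
```

Here I_0 is the modified Bessel function of the first kind of order zero, which normalises each von Mises component.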

    Directional naive Bayes classifiers

    Directional data are ubiquitous in science. These data have some special properties that rule out the use of classical statistics. Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. We extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are then evaluated over eight datasets, showing competitive performance against other naive Bayes classifiers that use Gaussian distributions or discretization to manage directional data.
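
The idea in this abstract, a naive Bayes classifier whose class-conditional densities are univariate von Mises, might look roughly like the sketch below. It is an illustration under stated assumptions and not the authors' implementation: the class name `VonMisesNB`, the helper names, the closed-form approximation used for the concentration κ, and the cap on κ are all choices made here.

```python
import numpy as np


def fit_von_mises(theta):
    """Estimate (mu, kappa) of a von Mises distribution from angles in radians."""
    c, s = np.cos(theta).mean(), np.sin(theta).mean()
    mu = np.arctan2(s, c)                      # mean direction
    r = min(np.hypot(c, s), 1.0 - 1e-9)        # mean resultant length
    # Assumption: a common closed-form approximation to the ML estimate of kappa,
    # capped so that np.i0 does not overflow for very concentrated samples.
    kappa = min(r * (2.0 - r**2) / (1.0 - r**2), 500.0)
    return mu, kappa


def log_von_mises(theta, mu, kappa):
    """Log-density of the von Mises distribution."""
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))


class VonMisesNB:
    """Naive Bayes with an independent von Mises density per feature and class."""

    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.log_prior_ = np.log([np.mean(y == c) for c in self.classes_])
        self.params_ = [
            [fit_von_mises(X[y == c, j]) for j in range(X.shape[1])]
            for c in self.classes_
        ]
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        # Log-posterior up to a constant: log prior + sum of per-feature log-densities.
        scores = np.stack(
            [
                lp + sum(log_von_mises(X[:, j], mu, k) for j, (mu, k) in enumerate(ps))
                for lp, ps in zip(self.log_prior_, self.params_)
            ],
            axis=1,
        )
        return self.classes_[scores.argmax(axis=1)]
```

Because the density depends on the angle only through cos(θ − μ), angles just below 2π and just above 0 are treated as neighbours, which a Gaussian naive Bayes on raw angles would not do.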

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process and thus to investigate the main causes of inefficiency. Several suggestions concerning efficiency improvement are offered for each hotel studied.

    Vol. 4, No. 1 (Full Issue)


    Seventh International Workshop on Simulation, 21-25 May, 2013, Department of Statistical Sciences, Unit of Rimini, University of Bologna, Italy. Book of Abstracts


    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
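
To see why the interpretability of ρ matters, the sketch below builds a CAR precision matrix on a toy graph and computes the average correlation between neighbouring sites. It rests on assumptions not stated in the abstract: the "proper CAR" parameterisation Q = τ(D − ρW), the function names, and the four-site path graph are all illustrative, and the DAGAR construction is not reproduced here.

```python
import numpy as np


def car_precision(W, rho, tau=1.0):
    """Precision matrix of a proper CAR model, Q = tau * (D - rho * W).

    W is a symmetric 0/1 adjacency matrix and D holds the neighbour counts.
    This parameterisation is an assumption; the paper may use another one.
    """
    W = np.asarray(W, float)
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)


def average_neighbor_correlation(W, rho):
    """Mean marginal correlation over all pairs of adjacent sites."""
    W = np.asarray(W, float)
    cov = np.linalg.inv(car_precision(W, rho))
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)
    i, j = np.nonzero(np.triu(W, k=1))   # each neighbour pair counted once
    return corr[i, j].mean()


# Toy example: four sites on a line, each adjacent to the next.
W = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
rho = 0.9
avg = average_neighbor_correlation(W, rho)
```

On this toy graph the average neighbour correlation comes out noticeably below ρ = 0.9, which illustrates why ρ in a CAR model cannot be read directly as a correlation.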