10,002 research outputs found

    Bayesian threshold selection for extremal models using measures of surprise

    Statistical extreme value theory is concerned with the use of asymptotically motivated models to describe the extreme values of a process. A number of commonly used models are valid for observed data that exceed some high threshold. However, in practice a suitable threshold is unknown and must be determined for each analysis. While there are many threshold selection methods for univariate extremes, there are relatively few that can be applied in the multivariate setting. In addition, there are only a few Bayesian methods, which are naturally attractive in the modelling of extremes due to data scarcity. The use of Bayesian measures of surprise to determine suitable thresholds for extreme value models is proposed. Such measures quantify the level of support for the proposed extremal model and threshold, without the need to specify any model alternatives. This approach is easily implemented for both univariate and multivariate extremes. Comment: To appear in Computational Statistics and Data Analysis

    Forecasting Irish inflation using ARIMA models

    This paper outlines the practical steps that need to be undertaken to use autoregressive integrated moving average (ARIMA) time series models for forecasting Irish inflation. A framework for ARIMA forecasting is drawn up. It considers two alternative approaches to the issue of identifying ARIMA models: the Box-Jenkins approach and the objective penalty function methods. The emphasis is on forecast performance, which suggests more focus on minimising out-of-sample forecast errors than on maximising in-sample 'goodness of fit'. Thus, the approach followed is unashamedly one of 'model mining' with the aim of optimising forecast performance. Practical issues in ARIMA time series forecasting are illustrated with reference to the harmonised index of consumer prices (HICP) and some of its major sub-components.

    Power-law distributions in binned empirical data

    Many man-made and natural phenomena, including the intensity of earthquakes, population of cities and size of international wars, are believed to follow power-law distributions. The accurate identification of power-law patterns has significant consequences for correctly understanding and modeling complex systems. However, statistical evidence for or against the power-law hypothesis is complicated by large fluctuations in the empirical distribution's tail, and these are worsened when information is lost from binning the data. We adapt the statistically principled framework for testing the power-law hypothesis, developed by Clauset, Shalizi and Newman, to the case of binned data. This approach includes maximum-likelihood fitting, a hypothesis test based on the Kolmogorov-Smirnov goodness-of-fit statistic and likelihood ratio tests for comparing against alternative explanations. We evaluate the effectiveness of these methods on synthetic binned data with known structure, quantify the loss of statistical power due to binning, and apply the methods to twelve real-world binned data sets with heavy-tailed patterns. Comment: Published at http://dx.doi.org/10.1214/13-AOAS710 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A unified approach to structural change tests based on F statistics, OLS residuals, and ML scores

    Three classes of structural change tests (or tests for parameter instability) that have received much attention in both the statistics and econometrics communities, but have been developed in rather loosely connected lines of research, are unified by embedding them into the framework of generalized M-fluctuation tests (Zeileis and Hornik, 2003). These classes are tests based on F statistics (supF, aveF, expF tests), on OLS residuals (OLS-based CUSUM and MOSUM tests) and on maximum likelihood scores (including the Nyblom-Hansen test). We show that (representatives from) these classes are special cases of the generalized M-fluctuation tests, based on the same functional central limit theorem, but employing different functionals for capturing excessive fluctuations. After embedding these tests into the same framework and thus understanding the relationship between these procedures for testing in historical samples, it is shown how the tests can also be extended to a monitoring situation. This is achieved by establishing a general M-fluctuation monitoring procedure and then applying the different functionals corresponding to monitoring with F statistics, OLS residuals and ML scores. In particular, an extension of the supF test to a monitoring scenario is suggested and illustrated on a real-world data set. Series: Research Report Series / Department of Statistics and Mathematics

    Information Recovery In Behavioral Networks

    In the context of agent-based modeling and network theory, we focus on the problem of recovering behavior-related choice information from origin-destination type data, a topic also known as network tomography. As a basis for predicting agents' choices we emphasize the connection between adaptive intelligent behavior, causal entropy maximization and self-organized behavior in an open dynamic system. We cast this problem in the form of binary and weighted networks and suggest information theoretic entropy-driven methods to recover estimates of the unknown behavioral flow parameters. Our objective is to recover the unknown behavioral values across the ensemble analytically, without explicitly sampling the configuration space. In order to do so, we consider the Cressie-Read family of entropic functionals, enlarging the set of estimators commonly employed to make optimal use of the available information. More specifically, we explicitly work out two cases of particular interest: the Shannon functional and the likelihood functional. We then employ them for the analysis of both univariate and bivariate data sets, comparing their accuracy in reproducing the observed trends. Comment: 14 pages, 6 figures, 4 tables

    Spatial copula modeling of extreme crop insurance claims in Brazil

    We use robustly estimated spatial R-vine copula models to assess spatial dependencies among extreme crop insurance claims. A truthful predictive model for simultaneous extreme losses is derived based on the linear structure found between copula parameters and distances between groups. Findings are compared to those from classical estimation of pair-copulas. Univariate fits of the excess-losses are based on the Generalized Pareto distribution. The dependence implied by the spatial component is captured by the Gumbel copulas in Tree 1, whereas a few atypical points are handled by robust inference, which reveals that the influence of joint multivariate extreme outliers cannot be neglected. Our findings are useful for crop insurance firms as well as for local authorities trying to minimize the effects of natural disasters.