Bayesian threshold selection for extremal models using measures of surprise
Statistical extreme value theory is concerned with the use of asymptotically
motivated models to describe the extreme values of a process. A number of
commonly used models are valid for observed data that exceed some high
threshold. However, in practice a suitable threshold is unknown and must be
determined for each analysis. While there are many threshold selection methods
for univariate extremes, there are relatively few that can be applied in the
multivariate setting. In addition, there are only a few Bayesian-based methods,
which are naturally attractive in the modelling of extremes due to data
scarcity. The use of Bayesian measures of surprise to determine suitable
thresholds for extreme value models is proposed. Such measures quantify the
level of support for the proposed extremal model and threshold, without the
need to specify any model alternatives. This approach is easily implemented for
both univariate and multivariate extremes.
Comment: To appear in Computational Statistics and Data Analysis
Forecasting Irish inflation using ARIMA models
This paper outlines the practical steps needed to use autoregressive integrated moving average (ARIMA) time series models for forecasting Irish inflation. A framework for ARIMA forecasting is drawn up. It considers two alternative approaches to identifying ARIMA models: the Box-Jenkins approach and objective penalty function methods. The emphasis is on forecast performance, which suggests more focus on minimising out-of-sample forecast errors than on maximising in-sample 'goodness of fit'. Thus, the approach followed is unashamedly one of 'model mining' with the aim of optimising forecast performance. Practical issues in ARIMA time series forecasting are illustrated with reference to the harmonised index of consumer prices (HICP) and some of its major sub-components.
Power-law distributions in binned empirical data
Many man-made and natural phenomena, including the intensity of earthquakes,
population of cities and size of international wars, are believed to follow
power-law distributions. The accurate identification of power-law patterns has
significant consequences for correctly understanding and modeling complex
systems. However, statistical evidence for or against the power-law hypothesis
is complicated by large fluctuations in the empirical distribution's tail, and
these are worsened when information is lost from binning the data. We adapt the
statistically principled framework for testing the power-law hypothesis,
developed by Clauset, Shalizi and Newman, to the case of binned data. This
approach includes maximum-likelihood fitting, a hypothesis test based on the
Kolmogorov--Smirnov goodness-of-fit statistic and likelihood ratio tests for
comparing against alternative explanations. We evaluate the effectiveness of
these methods on synthetic binned data with known structure, quantify the loss
of statistical power due to binning, and apply the methods to twelve real-world
binned data sets with heavy-tailed patterns.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS710 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
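The maximum-likelihood step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a continuous power law p(x) proportional to x^(-alpha) above the lowest bin edge, so each bin's probability is the difference of the tail function at its two edges. The function names, the search interval for alpha, and the example counts are illustrative assumptions, not taken from the paper; the goodness-of-fit and likelihood ratio steps are not shown.

```python
import math


def binned_powerlaw_loglik(alpha, edges, counts):
    """Log-likelihood of bin counts under p(x) ~ x**(-alpha) for x >= edges[0].

    edges has len(counts) + 1 entries; the last edge may be math.inf.
    """
    n = sum(counts)
    loglik = n * (alpha - 1.0) * math.log(edges[0])
    for lower, upper, h in zip(edges[:-1], edges[1:], counts):
        if h == 0:
            continue
        tail_upper = 0.0 if math.isinf(upper) else upper ** (1.0 - alpha)
        loglik += h * math.log(lower ** (1.0 - alpha) - tail_upper)
    return loglik


def fit_alpha(edges, counts, lo=1.01, hi=6.0, tol=1e-8):
    """Maximize the binned log-likelihood over alpha by golden-section search.

    The search interval [lo, hi] is an assumption made for this sketch.
    """
    def f(a):
        return binned_powerlaw_loglik(a, edges, counts)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


# Made-up counts (not real data): expected counts for alpha = 2.5 and
# n = 10000 in logarithmic bins, rounded to integers.
edges = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, math.inf]
counts = [6464, 2286, 808, 286, 101, 55]
print(round(fit_alpha(edges, counts), 3))
```

Because the counts were generated from an exponent of 2.5, the estimate should land close to that value. Golden-section search is used here only because the binned log-likelihood is concave in alpha; any one-dimensional optimizer would serve.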
A unified approach to structural change tests based on F statistics, OLS residuals, and ML scores
Three classes of structural change tests (or tests for parameter instability), which have received much attention in both the statistics and econometrics communities but have been developed in rather loosely connected lines of research, are unified by embedding them into the framework of generalized M-fluctuation tests (Zeileis and Hornik, 2003). These classes are tests based on F statistics (supF, aveF, expF tests), on OLS residuals (OLS-based CUSUM and MOSUM tests) and on maximum likelihood scores (including the Nyblom-Hansen test). We show that (representatives from) these classes are special cases of the generalized M-fluctuation tests, based on the same functional central limit theorem, but employing different functionals for capturing excessive fluctuations. After embedding these tests into the same framework and thus understanding the relationship between these procedures for testing in historical samples, it is shown how the tests can also be extended to a monitoring situation. This is achieved by establishing a general M-fluctuation monitoring procedure and then applying the different functionals corresponding to monitoring with F statistics, OLS residuals and ML scores. In particular, an extension of the supF test to a monitoring scenario is suggested and illustrated on a real-world data set.
Series: Research Report Series / Department of Statistics and Mathematics
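As a rough illustration of one of the test classes named above, the sketch below computes an OLS-based CUSUM statistic for the simplest possible case, a constant-mean model. It is not the generalized M-fluctuation framework of the paper: the function name, the mean-only model, and the example series are assumptions made for illustration. The statistic is the maximum absolute cumulative sum of OLS residuals scaled by the residual standard deviation times the square root of n, compared against 1.358, the approximate 5% critical value for the supremum of the absolute value of a Brownian bridge.

```python
import math

# Approximate 5% critical value for sup |Brownian bridge|.
CRIT_5PCT = 1.358


def ols_cusum_statistic(y):
    """Sup-type OLS-based CUSUM statistic for a constant-mean model.

    The OLS fit of a mean-only model is the sample mean, so the residuals
    are deviations from it. Their scaled cumulative sums form the
    empirical fluctuation process; its maximum absolute value is returned.
    """
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    # Residual standard deviation with n - 1 degrees of freedom
    # (one estimated parameter).
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 1))
    scale = sigma * math.sqrt(n)
    running = 0.0
    sup = 0.0
    for r in resid:
        running += r
        sup = max(sup, abs(running) / scale)
    return sup


# Made-up series (not real data): a stable level, and a level shift
# halfway through the sample.
stable = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
shifted = [v + (1.0 if i >= 50 else 0.0) for i, v in enumerate(stable)]

for name, series in (("stable", stable), ("shifted", shifted)):
    stat = ols_cusum_statistic(series)
    print(name, round(stat, 3), "reject" if stat > CRIT_5PCT else "do not reject")
```

A different functional applied to the same cumulative-sum process (for example a moving-window sum instead of the running sum) would give a MOSUM-type statistic, which is the sense in which the abstract describes these tests as sharing one framework.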
Information Recovery In Behavioral Networks
In the context of agent based modeling and network theory, we focus on the
problem of recovering behavior-related choice information from
origin-destination type data, a topic also known under the name of network
tomography. As a basis for predicting agents' choices we emphasize the
connection between adaptive intelligent behavior, causal entropy maximization
and self-organized behavior in an open dynamic system. We cast this problem in
the form of binary and weighted networks and suggest information theoretic
entropy-driven methods to recover estimates of the unknown behavioral flow
parameters. Our objective is to recover the unknown behavioral values across
the ensemble analytically, without explicitly sampling the configuration space.
In order to do so, we consider the Cressie-Read family of entropic functionals,
enlarging the set of estimators commonly employed to make optimal use of the
available information. More specifically, we explicitly work out two cases of
particular interest: the Shannon functional and the likelihood functional. We then
employ them for the analysis of both univariate and bivariate data sets,
comparing their accuracy in reproducing the observed trends.
Comment: 14 pages, 6 figures, 4 tables
Spatial copula modeling of extreme crop insurance claims in Brazil
We use robustly estimated spatial R-vine copula models to assess spatial dependencies among extreme crop insurance claims. A truthful predictive model for simultaneous extreme losses is derived based on the linear structure found between copula parameters and distances between groups. Findings are compared to those from classical estimation of pair-copulas. Univariate fits of the excess losses are based on the Generalized Pareto distribution. The dependence implied by the spatial component is captured by the Gumbel copulas in Tree 1, whereas a few atypical points are handled by robust inference, which reveals that the influence of joint multivariate extreme outliers cannot be neglected. Our findings are useful for crop insurance firms as well as for local authorities trying to minimize the effects of natural disasters.