
    Characterization of the frequency of extreme events by the Generalized Pareto Distribution

    Based on recent results in extreme value theory, we use a new technique for the statistical estimation of distribution tails. Specifically, we use the Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for peak-over-threshold values in the form of the Generalized Pareto Distribution (GPD). While this approach has proved useful in finance, insurance and hydrology, we investigate here the earthquake energy distribution described by the Gutenberg-Richter seismic moment-frequency law, analyzing shallow earthquakes (depth h < 70 km) in the Harvard catalog over the period 1977-2000 in 18 seismic zones. The GPD is found to approximate the tails of the seismic moment distributions quite well for moment-magnitudes larger than mW = 5.3, and no statistically significant regional difference is found between subduction and transform seismic zones. We confirm with very high statistical confidence that the b-value is very different in mid-ocean ridges compared to other zones (b = 1.50 ± 0.09 versus b = 1.00 ± 0.05, corresponding to a power-law exponent close to 1 versus 2/3). We propose a physical mechanism for this, contrasting slow healing ruptures in mid-ocean ridges with fast healing ruptures in other zones. Deviations from the GPD at the very end of the tail are detected in the sample containing earthquakes from all major subduction zones (sample size of 4985 events). We propose a new statistical test of the significance of such deviations based on the bootstrap method. The number of events deviating from the tails of the GPD in the studied data sets (15-20 at most) is not sufficient for determining the functional form of those deviations. Thus, it is practically impossible to give preference to one of the previously suggested parametric families describing the ends of the tails of seismic moment distributions. Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one file giving the regionalization
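    As a rough illustration of the peaks-over-threshold step described in this abstract, the sketch below fits a GPD to exceedances of a synthetic heavy-tailed sample; the data, threshold choice and tail index are placeholders, not the Harvard-catalog analysis of the paper.

# Minimal peaks-over-threshold sketch: fit a GPD to exceedances over a threshold.
# Synthetic Pareto data with exponent ~2/3 stands in for seismic moments.
import numpy as np
from scipy.stats import genpareto, pareto

sample = pareto.rvs(b=0.67, size=5000, random_state=0)   # synthetic heavy-tailed sample
threshold = np.quantile(sample, 0.95)                     # illustrative threshold choice
exceedances = sample[sample > threshold] - threshold

# Fit the GPD to the exceedances with the location fixed at zero.
shape, loc, scale = genpareto.fit(exceedances, floc=0)
print(f"GPD shape (xi) = {shape:.3f}, scale = {scale:.3f}")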

    Applications of threshold models and the weighted bootstrap for Hungarian precipitation data

    This paper presents applications of the peaks-over-threshold methodology, both in the univariate and in the recently introduced bivariate case, combined with a novel bootstrap approach. We compare the proposed bootstrap methods to the more traditional profile likelihood. We have investigated 63 years of European Climate Assessment daily precipitation data for five Hungarian grid points, first separately for the summer and winter months, then aiming at the detection of possible changes by investigating 20-year moving windows. We show that significant changes can be observed in both the univariate and the bivariate cases, the most recent period being the most dangerous, as its return levels are the highest. We illustrate these effects by bivariate coverage regions. Comment: 10 pages, 7 figures, 5 tables
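    The return levels mentioned here come from the standard univariate POT formula; a minimal sketch follows, with the GPD parameters, threshold and exceedance rate being hypothetical values rather than the paper's Hungarian estimates.

# Sketch of a POT return-level calculation from a fitted GPD.
import numpy as np

xi, sigma, u = 0.1, 8.0, 30.0   # assumed GPD shape, scale, threshold (mm)
lam = 5.0                       # assumed mean number of exceedances per year

def return_level(T_years):
    """Level exceeded on average once every T_years under the POT model (xi != 0)."""
    m = lam * T_years           # expected number of exceedances in T years
    return u + (sigma / xi) * (m**xi - 1.0)

for T in (10, 50, 100):
    print(f"{T:4d}-year return level: {return_level(T):.1f} mm")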

    Extreme value analysis of actuarial risks: estimation and model validation

    We give an overview of several aspects arising in the statistical analysis of extreme risks with actuarial applications in view. In particular, it is demonstrated that empirical process theory is a very powerful tool, both for the asymptotic analysis of extreme value estimators and for devising tools for the validation of the underlying model assumptions. While the focus of the paper is on univariate tail risk analysis, the basic ideas of the analysis of the extremal dependence between different risks are also outlined. Here we emphasize some of the limitations of classical multivariate extreme value theory and sketch how a different model proposed by Ledford and Tawn can help to avoid pitfalls. Finally, these theoretical results are used to analyze a data set of large claim sizes from health insurance. Comment: to appear in Advances in Statistical Analysis
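    For readers unfamiliar with the kind of extreme value estimator analyzed in such work, the Hill estimator is a standard example of a univariate tail-index estimator; the sketch below is generic and not taken from the paper.

# Hill estimator of the tail index from the k largest order statistics.
import numpy as np

def hill_estimator(data, k):
    """Return the Hill estimate of the tail index using the k largest observations."""
    x = np.sort(np.asarray(data))[::-1]      # sort in descending order
    logs = np.log(x[:k]) - np.log(x[k])      # log-spacings above the (k+1)-th largest value
    return 1.0 / logs.mean()                 # estimated tail index alpha

rng = np.random.default_rng(1)
claims = rng.pareto(a=2.0, size=10_000) + 1.0   # synthetic "claim sizes", true index 2
print(hill_estimator(claims, k=200))            # should be close to 2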

    Closed-form mathematical expressions for the exponentiated Cauchy-Rayleigh distribution

    The Cauchy-Rayleigh (CR) distribution has been successfully used to describe asymmetric and heavy-tail events from radar imagery. Employing such a model to describe lifetime data may then seem attractive, but some drawbacks arise: its probability density function does not cover non-modal behavior, and the CR hazard rate function (hrf) assumes only one form. To overcome these difficulties, we introduce an extended CR model, called the exponentiated Cauchy-Rayleigh (ECR) distribution. This model has two parameters and an hrf with decreasing, decreasing-increasing-decreasing and upside-down bathtub forms. In this paper, several closed-form mathematical expressions for the ECR model are derived: median, mode, probability weighted, log-, incomplete and order statistic moments, and the Fisher information matrix. We propose three estimation procedures for the ECR parameters: maximum likelihood (ML), bias-corrected ML and percentile-based methods. A simulation study is done to assess the performance of the estimators. An application to the survival times of heart-problem patients illustrates the usefulness of the ECR model. Results point out that the ECR distribution may outperform classical lifetime models, such as the gamma, Birnbaum-Saunders, Weibull and log-normal laws, for heavy-tail data. Comment: 30 pages
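    The "exponentiated" construction behind the ECR law raises a baseline CDF F to a power alpha, giving G = F^alpha with density g = alpha f F^(alpha-1). The sketch below demonstrates this mechanism with a Rayleigh baseline purely as a placeholder; the paper's actual baseline is the Cauchy-Rayleigh law.

# Generic exponentiated-CDF construction, G(x) = F(x)**alpha, with a placeholder baseline.
import numpy as np
from scipy.stats import rayleigh

def exp_cdf(x, alpha, sigma):
    return rayleigh.cdf(x, scale=sigma) ** alpha

def exp_pdf(x, alpha, sigma):
    F = rayleigh.cdf(x, scale=sigma)
    f = rayleigh.pdf(x, scale=sigma)
    return alpha * f * F ** (alpha - 1.0)   # density of the exponentiated model

x = np.linspace(0.01, 5.0, 5)
print(exp_pdf(x, alpha=0.5, sigma=1.0))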

    Modeling catastrophic deaths using EVT with a microsimulation approach to reinsurance pricing

    Recently, a marked Poisson process (MPP) model for life catastrophe risk was proposed in [6]. We provide a justification and further support for the model by considering more general Poisson point processes in the context of extreme value theory (EVT), and by basing the choice of model on statistical tests and model comparisons. A case study examining accidental deaths in the Finnish population is provided. We further extend the applicability of the catastrophe risk model by considering small and big accidents separately; the resulting combined MPP model can flexibly capture the whole range of accidental death counts. Using the proposed model, we present a simulation framework for pricing (life) catastrophe reinsurance, based on modeling the underlying policies at the individual contract level. The accidents are first simulated at the population level, and their effect on a specific insurance company is then determined by explicitly simulating the resulting insured deaths. The proposed microsimulation approach can potentially lead to more accurate results than traditional methods, and to a better view of risk, as it can make use of all the information available to the re/insurer and can explicitly accommodate even complex re/insurance terms and product features. As an example we price several excess-of-loss reinsurance contracts. The proposed simulation model is also suitable for solvency assessment. Comment: 32 pages, 9 figures
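    A toy simulation in the spirit of this setup is sketched below: events arrive as a Poisson count, each carries a loss mark, and an excess-of-loss layer pays the part of each loss above the retention, capped at the layer size. All rates, mark distributions and contract terms are invented for illustration and are not the paper's calibrated model.

# Toy marked-Poisson catastrophe simulation with an excess-of-loss layer.
import numpy as np

rng = np.random.default_rng(7)
n_years, lam = 10_000, 1.2                 # simulated years, assumed events per year
retention, layer = 50.0, 200.0             # hypothetical XL contract terms

annual_payout = np.zeros(n_years)
for y in range(n_years):
    n_events = rng.poisson(lam)                            # number of catastrophes this year
    marks = rng.pareto(a=1.8, size=n_events) * 10.0        # placeholder loss marks
    annual_payout[y] = np.clip(marks - retention, 0.0, layer).sum()

print("expected annual layer loss:", annual_payout.mean())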

    Inference on the Parameters of the Weibull Distribution Using Records

    The Weibull distribution is a widely applicable model for lifetime data. In this paper, we investigate inference on the parameters of the Weibull distribution based on record values. We first propose a simple and exact test and a confidence interval for the shape parameter. Then, in addition to a generalized confidence interval, a generalized test variable is derived for the scale parameter when the shape parameter is unknown. The paper also presents a simple and exact joint confidence region for the scale and shape parameters. In all cases, simulation studies show that the proposed approaches are more satisfactory and reliable than previous methods. All proposed approaches are illustrated using a real example. Comment: Accepted for publication in SOR
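    The data underlying this kind of inference are upper record values, i.e. observations that exceed every previous observation in the sequence. A minimal extraction sketch follows; the paper's exact test and generalized pivots are not reproduced here.

# Extract upper record values from a lifetime sequence.
import numpy as np

def upper_records(seq):
    """Return the upper record values of a sequence (the first observation is a record)."""
    records, current_max = [], -np.inf
    for x in seq:
        if x > current_max:
            records.append(x)
            current_max = x
    return np.array(records)

rng = np.random.default_rng(3)
lifetimes = rng.weibull(a=1.5, size=200) * 100.0   # synthetic Weibull lifetimes
print(upper_records(lifetimes))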

    LiSBOA: LiDAR Statistical Barnes Objective Analysis for optimal design of LiDAR scans and retrieval of wind statistics. Part II: Applications to synthetic and real LiDAR data of wind turbine wakes

    The LiDAR Statistical Barnes Objective Analysis (LiSBOA), presented in Letizia et al., is a procedure for the optimal design of LiDAR scans and the calculation over a Cartesian grid of the statistical moments of the velocity field. The LiSBOA is applied to LiDAR data collected in the wake of wind turbines to reconstruct the mean and turbulence intensity of the wind velocity field. The proposed procedure is first tested on a numerical dataset obtained by means of the virtual LiDAR technique applied to data from a large eddy simulation (LES). The optimal sampling parameters for a scanning Doppler pulsed wind LiDAR are retrieved from the LiSBOA, and the estimated statistics are then calculated, showing a maximum error of about 4% for both the normalized mean velocity and the turbulence intensity. Subsequently, LiDAR data collected during a field campaign conducted at a wind farm in complex terrain are analyzed through the LiSBOA for two different configurations. In the first case, the wake velocity fields of four utility-scale turbines are reconstructed on a 3D grid, showing the capability of the LiSBOA to capture complex flow features, such as the high-speed jet around the nacelle and the wake turbulent shear layers. For the second case, the statistics of the wakes generated by four interacting turbines are calculated over a 2D Cartesian grid and compared to the measurements provided by the nacelle-mounted anemometers. Maximum discrepancies as low as 3% for the normalized mean velocity and turbulence intensity endorse the application of the LiSBOA for LiDAR-based wind resource assessment and diagnostic surveys for wind farms.
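    The core ingredient of a Barnes objective analysis is Gaussian-weighted averaging of scattered samples onto grid nodes; LiSBOA adds iterative residual corrections, optimal-scan design and statistical-moment retrieval on top. The sketch below shows only that core weighting step, with an assumed smoothing length and synthetic sample locations.

# One-pass Barnes-style objective analysis: Gaussian-weighted mean on a Cartesian grid.
import numpy as np

def barnes_pass(xy_samples, values, xy_grid, sigma=10.0):
    """Gaussian-weighted mean of scattered values at each grid node."""
    d2 = ((xy_grid[:, None, :] - xy_samples[None, :, :]) ** 2).sum(-1)   # squared distances
    w = np.exp(-d2 / (2.0 * sigma**2))                                   # Gaussian weights
    return (w * values).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(500, 2))                  # scan sample locations (m)
u = 8.0 + 0.02 * pts[:, 0] + rng.normal(0, 0.5, 500)      # synthetic velocities (m/s)
grid = np.stack(np.meshgrid(np.arange(0, 100, 10),
                            np.arange(0, 100, 10)), -1).reshape(-1, 2)
print(barnes_pass(pts, u, grid, sigma=15.0)[:5].round(2))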

    Optimal Trade-offs in Multi-Processor Approximate Message Passing

    We consider large-scale linear inverse problems in Bayesian settings. We follow a recent line of work that applies the approximate message passing (AMP) framework to multi-processor (MP) computational systems, where each processor node stores and processes a subset of rows of the measurement matrix along with the corresponding measurements. In each MP-AMP iteration, the nodes of the MP system and its fusion center exchange lossily compressed messages pertaining to their estimates of the input. In this setup, we derive the optimal per-iteration coding rates using dynamic programming. We analyze the excess mean squared error (EMSE) beyond the minimum mean squared error (MMSE), and prove that, in the limit of low EMSE, the optimal coding rates increase approximately linearly per iteration. Additionally, we show that the combined cost of computation and communication scales with the desired estimation quality according to O(log^2(1/EMSE)). Finally, we study trade-offs between the physical costs of the estimation process, including computation time, communication loads, and the estimation quality, as a multi-objective optimization problem, and characterize the properties of the Pareto optimal surfaces. Comment: 14 pages, 8 figures
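    To make the "optimal per-iteration coding rates via dynamic programming" idea concrete, the toy program below picks a rate for each iteration so that a distortion recursion reaches a target EMSE at minimum total rate. The update rule `next_mse`, the candidate rates and the target are invented placeholders, not the paper's state evolution or cost model.

# Toy dynamic program over per-iteration coding rates (placeholder distortion model).
from functools import lru_cache

RATES = (0.5, 1.0, 2.0, 4.0)     # candidate bits per iteration (assumed)
T, TARGET = 6, 1e-3              # maximum iterations and target EMSE (assumed)

def next_mse(mse, rate):
    # Placeholder update: the error is multiplied by 0.1 plus a quantization
    # penalty 2**(-2*rate) that shrinks as more bits are spent.
    return mse * (0.1 + 2.0 ** (-2.0 * rate))

@lru_cache(maxsize=None)
def best(t, mse_key):
    """Minimum remaining total rate (and schedule) from iteration t with the given MSE."""
    mse = mse_key / 1e6
    if mse <= TARGET:
        return 0.0, ()
    if t == T:
        return float("inf"), ()
    options = []
    for r in RATES:
        cost, plan = best(t + 1, round(next_mse(mse, r) * 1e6))
        options.append((r + cost, (r,) + plan))
    return min(options)

total_rate, schedule = best(0, round(1.0 * 1e6))
print("total rate:", total_rate, "schedule:", schedule)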

    Multivariate Generalized Linear-statistics of short range dependent data

    Generalized linear (GL-) statistics are defined as functionals of a U-quantile process and unify different classes of statistics, such as U-statistics and L-statistics. We derive a central limit theorem for GL-statistics of strongly mixing sequences and arbitrary dimension of the underlying kernel. For this purpose we establish a limit theorem for U-statistics and an invariance principle for U-processes, together with a convergence rate for the remainder term of the Bahadur representation. An application is given by the generalized median estimator for the tail parameter of the Pareto distribution, which is commonly used to model exceedances over high thresholds. We use subsampling to calculate confidence intervals and investigate its behaviour under independence and strong mixing in simulations.
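    A simple median-based tail estimator, in the spirit of (but not identical to) the generalized median estimator, follows from the fact that the median of a Pareto exceedance X/u is 2^(1/alpha). The sketch below combines it with a basic block-subsampling confidence interval; the block length and the percentile-of-subsample construction are simplifying assumptions.

# Median-based Pareto tail estimate with a simple subsampling confidence interval.
import numpy as np

def median_tail_estimate(exceedances, threshold):
    """alpha_hat = log 2 / median(log(X/u)) for Pareto-type exceedances X over threshold u."""
    return np.log(2.0) / np.median(np.log(exceedances / threshold))

def subsampling_ci(exceedances, threshold, block=200, n_sub=500, level=0.95, seed=None):
    rng = np.random.default_rng(seed)
    ests = []
    for _ in range(n_sub):
        start = rng.integers(0, len(exceedances) - block)   # contiguous block, respects dependence
        ests.append(median_tail_estimate(exceedances[start:start + block], threshold))
    return tuple(np.quantile(ests, [(1 - level) / 2, (1 + level) / 2]))

rng = np.random.default_rng(5)
u = 1.0
x = (rng.pareto(a=2.0, size=5000) + 1.0) * u        # synthetic exceedances, true alpha = 2
print(median_tail_estimate(x, u), subsampling_ci(x, u))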

    Confidence regions for high quantiles of a heavy tailed distribution

    Estimating high quantiles plays an important role in the context of risk management. This involves extrapolating an unknown distribution function beyond the observed data. In this paper we propose three methods, namely the normal approximation method, the likelihood ratio method and the data tilting method, to construct confidence regions for high quantiles of a heavy-tailed distribution. A simulation study favours the data tilting method. Comment: Published at http://dx.doi.org/10.1214/009053606000000416 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
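    The point estimate around which such confidence regions are built is typically a Weissman-type extrapolated quantile (Hill estimator plus power-law extrapolation); the sketch below shows only that point estimate on synthetic data, not the paper's three confidence-region constructions.

# Weissman-type high-quantile estimate for a heavy-tailed sample.
import numpy as np

def weissman_quantile(data, p, k):
    """Estimate the quantile exceeded with probability p using the k largest order statistics."""
    x = np.sort(np.asarray(data))
    tail = x[-k:]                                           # k largest observations
    gamma = np.mean(np.log(tail)) - np.log(x[-k - 1])       # Hill estimate of 1/alpha
    return x[-k - 1] * (k / (len(x) * p)) ** gamma          # power-law extrapolation

rng = np.random.default_rng(2)
sample = rng.pareto(a=2.0, size=10_000) + 1.0               # Pareto with index 2
print(weissman_quantile(sample, p=1e-4, k=300))             # true value is 100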