23 research outputs found

    Modelling Censored Losses Using Splicing: a Global Fit Strategy With Mixed Erlang and Extreme Value Distributions

    In risk analysis, a global fit that appropriately captures the body and the tail of the distribution of losses is essential. Modelling the whole range of the losses using a standard distribution is usually very hard and often impossible due to the specific characteristics of the body and the tail of the loss distribution. A possible solution is to combine two distributions in a splicing model: a light-tailed distribution for the body, which covers light and moderate losses, and a heavy-tailed distribution for the tail to capture large losses. We propose a splicing model with a mixed Erlang (ME) distribution for the body and a Pareto distribution for the tail. This combines the flexibility of the ME distribution with the ability of the Pareto distribution to model extreme values. We extend our splicing approach to censored and/or truncated data; relevant examples of such data can be found in financial risk analysis. We illustrate the flexibility of this splicing model using practical examples from risk measurement.
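    A minimal numerical sketch of the splicing idea described in this abstract: a mixed Erlang body, renormalised to the interval below the splicing point, is combined with a Pareto density above it. All parameter values (splicing point, body weight, mixture weights, Erlang shapes, scale, tail index) are illustrative choices, not fitted values from the paper, and the censoring/truncation extensions are not reproduced.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

t = 10.0        # splicing point (illustrative)
p_body = 0.95   # probability mass assigned to the body (illustrative)
alpha = 1.8     # Pareto tail index (illustrative)

# Mixed Erlang body: Erlang components (gamma with integer shape) sharing one scale.
weights = np.array([0.6, 0.4])
shapes = np.array([2, 5])
scale = 1.5
components = [stats.gamma(a=k, scale=scale) for k in shapes]

def spliced_pdf(x):
    """Density of the spliced model: rescaled ME body on (0, t], Pareto tail beyond t."""
    x = np.asarray(x, dtype=float)
    body_pdf = sum(w * d.pdf(x) for w, d in zip(weights, components))
    body_cdf_t = sum(w * d.cdf(t) for w, d in zip(weights, components))
    below = p_body * body_pdf / body_cdf_t                     # integrates to p_body on (0, t]
    above = (1 - p_body) * alpha * t**alpha / x**(alpha + 1)   # integrates to 1 - p_body on (t, inf)
    return np.where(x <= t, below, above)

# Sanity check: the spliced density integrates to (approximately) 1.
total = quad(spliced_pdf, 0, t)[0] + quad(spliced_pdf, t, np.inf)[0]
print(round(total, 4))
```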

    Sparse Regression with Multi-type Regularized Feature Modeling

    Within the statistical and machine learning literature, regularization techniques are often used to construct sparse (predictive) models. Most regularization strategies only work for data where all predictors are treated identically, such as Lasso regression for (continuous) predictors treated as linear effects. However, many predictive problems involve different types of predictors and require a tailored regularization term. We propose a multi-type Lasso penalty that acts on the objective function as a sum of subpenalties, one for each type of predictor. As such, we allow for predictor selection and level fusion within a predictor in a data-driven way, simultaneously with the parameter estimation process. We develop a new estimation strategy for convex predictive models with this multi-type penalty. Using the theory of proximal operators, our estimation procedure is computationally efficient, partitioning the overall optimization problem into easier-to-solve subproblems, specific to each predictor type and its associated penalty. Earlier research applies approximations to non-differentiable penalties to solve the optimization problem. The proposed SMuRF algorithm removes the need for approximations and achieves higher accuracy and computational efficiency. This is demonstrated with an extensive simulation study and the analysis of a case study on insurance pricing analytics.
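    A rough illustration of the multi-type idea (one subpenalty, and hence one proximal operator, per type of predictor): the sketch below runs plain proximal gradient descent on a least-squares problem with a Lasso penalty on one block of coefficients and a group Lasso penalty on another. The simulated data, penalty weights and block structure are invented for illustration; this is not the SMuRF algorithm itself, which adds further penalty types and algorithmic refinements.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 200, 5, 4                 # block 1: linear effects, block 2: one grouped factor
X = rng.normal(size=(n, p1 + p2))
beta_true = np.r_[1.0, -2.0, 0, 0, 0, 0.5, 0.5, 0, 0]
y = X @ beta_true + rng.normal(scale=0.5, size=n)

lam1, lam2 = 0.1, 0.1                 # one penalty weight per predictor type (illustrative)
L = np.linalg.norm(X, 2) ** 2 / n     # Lipschitz constant of the smooth loss gradient
step = 1.0 / L

def prox_lasso(v, t):
    """Soft-thresholding: proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_group(v, t):
    """Block soft-thresholding: proximal operator of t * ||.||_2."""
    nrm = np.linalg.norm(v)
    return np.zeros_like(v) if nrm <= t else (1.0 - t / nrm) * v

beta = np.zeros(p1 + p2)
for _ in range(2000):
    grad = X.T @ (X @ beta - y) / n   # gradient of the least-squares loss
    z = beta - step * grad
    # Apply each type-specific proximal operator to its own block of coefficients.
    beta = np.r_[prox_lasso(z[:p1], step * lam1),
                 prox_group(z[p1:], step * lam2)]

print(np.round(beta, 2))
```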

    Estimating the maximum possible earthquake magnitude using extreme value methodology: the Groningen case

    The area-characteristic, maximum possible earthquake magnitude T_M is required by the earthquake engineering community, disaster management agencies and the insurance industry. The Gutenberg-Richter law predicts that earthquake magnitudes M follow a truncated exponential distribution. In the geophysical literature several estimation procedures have been proposed; see for instance Kijko and Singh (Acta Geophys., 2011) and the references therein. Estimation of T_M is of course an extreme value problem to which the classical methods for endpoint estimation could be applied. We argue that recent methods on truncated tails at high levels (Beirlant et al., Extremes, 2016; Electron. J. Stat., 2017) constitute a more appropriate setting for this estimation problem. We present upper confidence bounds to quantify uncertainty of the point estimates. We also compare methods from the extreme value and geophysical literature through simulations. Finally, the different methods are applied to the magnitude data for the earthquakes induced by gas extraction in the Groningen province of the Netherlands.
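    A minimal simulation sketch of the Gutenberg-Richter setting described above: magnitudes above a lower threshold m0 follow an exponential distribution truncated at the unknown maximum magnitude T_M. It only illustrates why endpoint estimation is delicate (the sample maximum is a biased-low estimate of T_M); the parameter values are illustrative, and the refined estimators and upper confidence bounds of the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)
m0, beta, T_M = 1.5, 2.0, 4.5        # illustrative Gutenberg-Richter parameters

def sample_truncated_exponential(n):
    """Inverse-transform sampling from Exp(beta) truncated to [m0, T_M]."""
    u = rng.uniform(size=n)
    F_T = 1.0 - np.exp(-beta * (T_M - m0))   # total mass of Exp(beta) below T_M
    return m0 - np.log(1.0 - u * F_T) / beta

n_rep, n = 1000, 500
maxima = np.array([sample_truncated_exponential(n).max() for _ in range(n_rep)])
print(f"true T_M = {T_M}, mean sample maximum = {maxima.mean():.3f}")
```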

    Extreme Value Theory in Finance and Insurance

    When modelling high-dimensional data, dimension reduction techniques such as principal component analysis (PCA) are often used. In the first part of this thesis we focus on two drawbacks of classical PCA. First, interpretation of classical PCA is often challenging because most of the loadings are neither very small nor very large in absolute value. Second, classical PCA can be heavily distorted by outliers since it is based on the classical covariance matrix. In order to resolve both problems, we present a new PCA algorithm that is robust against outliers and yields sparse PCs, i.e. PCs with many zero loadings. The approach is based on the ROBPCA algorithm that generates robust but non-sparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present.

    Stock market crashes such as Black Monday in 1987 and catastrophes such as earthquakes are examples of extreme events in finance and insurance, respectively. They are large events with a considerable impact that occur seldom. Extreme value theory (EVT) provides a theoretical framework to model extreme values such that e.g. risk measures can be estimated based on available data. In the second part of this PhD thesis we focus on applications of EVT that are of interest to finance and insurance. A Black Swan is an improbable event with massive consequences. We propose a way to investigate whether the 2007-2008 financial crisis was a Black Swan event for a given bank based on weekly log-returns. This is done by comparing the tail behaviour of the negative log-returns before and after the crisis using techniques from extreme value methodology. We illustrate this approach with Barclays and Credit Suisse data, and then link the differences in tail risk behaviour between these banks with economic indicators.

    The earthquake engineering community, disaster management agencies and the insurance industry need models for earthquake magnitudes to predict possible damage by earthquakes. A crucial element in these models is the area-characteristic, maximum possible earthquake magnitude. The Gutenberg-Richter distribution, which is a (doubly) truncated exponential distribution, is widely used to model earthquake magnitudes. Recently, Aban et al. (2006) and Beirlant et al. (2016) discussed tail fitting for truncated Pareto-type distributions. However, as is the case for the Gutenberg-Richter distribution, in some applications the underlying distribution appears to have a lighter tail than the Pareto distribution. We generalise the classical peaks over threshold (POT) approach to allow for truncation effects. This enables a unified treatment of extreme value analysis for truncated heavy and light tails. We use a pseudo maximum likelihood approach to estimate the model parameters and consider extreme quantile estimation. The new approach is illustrated on examples from hydrology and geophysics. Moreover, we perform simulations to illustrate the potential of the method on truncated heavy and light tails. The new approach can then be used to estimate the maximum possible earthquake magnitude. We also look at two other EVT-based endpoint estimators and at endpoint estimators used in the geophysical literature. To quantify uncertainty of the point estimates for the endpoint, upper confidence bounds are also considered. We apply the techniques to provide estimates, and upper confidence bounds, for the maximum possible earthquake magnitude in Groningen, where earthquakes are induced by gas extraction. Furthermore, we compare the methods from extreme value theory and the geophysical literature through simulations.

    In risk analysis, a global fit that appropriately captures the body and the tail of the distribution of losses is essential. Modelling the whole range of the losses using a standard distribution is usually very hard and often impossible due to the specific characteristics of the body and the tail of the loss distribution. A possible solution is to combine two distributions in a splicing model: a light-tailed distribution for the body, which covers light and moderate losses, and a heavy-tailed distribution for the tail to capture large losses. We propose a splicing model with the flexible mixed Erlang distribution for the body and a Pareto distribution for the tail. Motivated by examples in financial risk analysis, we extend our splicing approach to censored and/or truncated data. We illustrate the flexibility of this splicing model using practical examples from reinsurance.
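    For the tail comparison described in the second part (comparing the tail heaviness of negative log-returns before and after the crisis), a small sketch using the Hill estimator of the extreme value index. The two samples below are simulated Pareto-type data standing in for "before" and "after" return data; the thesis itself works with real bank returns and more refined extreme value techniques.

```python
import numpy as np

def hill(x, k):
    """Hill estimator of the extreme value index based on the k largest observations."""
    x = np.sort(np.asarray(x, dtype=float))
    logs = np.log(x[-k:])                 # log of the k largest order statistics
    return logs.mean() - np.log(x[-k - 1])

rng = np.random.default_rng(1)
before = rng.pareto(a=3.0, size=1000) + 1.0   # lighter Pareto-type tail (index 1/3)
after = rng.pareto(a=2.0, size=1000) + 1.0    # heavier tail (index 1/2)
k = 100
print(f"Hill before: {hill(before, k):.2f}, Hill after: {hill(after, k):.2f}")
```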

    Actuarial and statistical aspects of reinsurance in R

    Fitting tails affected by truncation

    In several applications, ultimately at the largest data, truncation effects can be observed when analysing tail characteristics of statistical distributions. In some cases truncation effects are forecasted through physical models such as the Gutenberg-Richter relation in geophysics, while at other instances the nature of the measurement process itself may cause under-recovery of large values, for instance due to flooding in river discharge readings. Recently, Beirlant, Fraga Alves and Gomes (2016) discussed tail fitting for truncated Pareto-type distributions. Using examples from earthquake analysis, hydrology and diamond valuation, we demonstrate the need for a unified treatment of extreme value analysis for truncated heavy and light tails. We generalise the classical peaks over threshold approach for the different max-domains of attraction with shape parameter ξ > −1/2 to allow for truncation effects. We use a pseudo maximum likelihood approach to estimate the model parameters and consider extreme quantile estimation and reconstruction of quantile levels before truncation whenever appropriate. We report on some simulation experiments and provide some basic asymptotic results.
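    For reference, a minimal sketch of the classical (untruncated) peaks-over-threshold step that this paper generalises: select exceedances over a high threshold and fit a generalised Pareto distribution by maximum likelihood. The truncated likelihood and the quantile reconstruction of the paper are not reproduced; the simulated data and the threshold choice are purely illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Heavy-tailed toy data (Burr XII), standing in for e.g. river discharges or losses.
data = stats.burr12(c=2.0, d=1.5).rvs(size=5000, random_state=rng)

u = np.quantile(data, 0.95)          # threshold: empirical 95% quantile (illustrative)
excesses = data[data > u] - u        # peaks over threshold

# Fit a generalised Pareto distribution to the excesses, location fixed at 0.
xi, loc, sigma = stats.genpareto.fit(excesses, floc=0)
print(f"threshold u = {u:.2f}, shape xi = {xi:.2f}, scale sigma = {sigma:.2f}")
```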

    Sparse PCA for high-dimensional data with outliers

    A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.
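    For context only, a sketch of ordinary (non-robust) sparse PCA using scikit-learn, which produces sparse loadings but, unlike ROSPCA, offers no protection against outliers. The simulated data, the number of components and the sparsity level are illustrative; ROSPCA itself is not part of scikit-learn.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(3)
n = 200
# Two latent factors, each driving a disjoint block of 4 variables, plus 4 pure-noise variables.
scores = rng.normal(size=(n, 2))
X = np.hstack([np.outer(scores[:, 0], np.ones(4)),
               np.outer(scores[:, 1], np.ones(4)),
               rng.normal(size=(n, 4))]) + 0.1 * rng.normal(size=(n, 12))

spca = SparsePCA(n_components=2, alpha=1.0, random_state=0)
spca.fit(X)
# Each row of components_ is typically sparse, concentrating on one block of variables.
print(np.round(spca.components_, 2))
```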