
    Modelling dependence between frequency and severity of insurance claims

    MSc in Actuarial Science. The estimation of individual losses is an important task in pricing insurance policies. The standard approach assumes independence between claim frequency and severity, which may not be a realistic assumption. In this text, the dependence between claim counts and claim sizes is explored in a Generalized Linear Model framework. A conditional severity model and a copula model are presented as alternatives for modelling this dependence and are then applied to a data set provided by a Portuguese insurance company. Finally, a comparison with the independence scenario is carried out.
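The conditional severity idea can be sketched on simulated data: let the claim count enter the severity model as a covariate and check whether its coefficient is recovered. This is a minimal numpy illustration, not the thesis's model; the coefficient 0.15, the Gamma shapes, and the log-linear fit (a crude stand-in for a Gamma GLM with log link) are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Frequency: heterogeneous Poisson claim counts per policy
lam = rng.gamma(2.0, 1.0, n)
counts = rng.poisson(lam)
has_claims = counts > 0

# Severity depends on the claim count -- the dependence we want to detect.
# The coefficient 0.15 is an illustrative assumption.
mean_sev = np.exp(7.0 + 0.15 * counts[has_claims])
severity = rng.gamma(2.0, mean_sev / 2.0)  # Gamma with that mean, shape 2

# Log-linear stand-in for a Gamma GLM with log link:
# regress log severity on the claim count and read off the coefficient
slope, intercept = np.polyfit(counts[has_claims], np.log(severity), 1)
```

A positive estimated slope signals frequency-severity dependence; under independence it would be near zero.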

    An overview of approaches to insurance data analysis and suggestions for warranty data analysis

    Warranty shares similarities with insurance in many aspects. Research on insurance data analysis has attracted much more attention than warranty data analysis. This paper provides a general comparison between warranty and insurance in terms of their coverages, policies and data collection. It then reviews existing approaches to insurance data analysis with regard to modelling of claim frequency, modelling of claim size and policy pricing. Some recent patents relating to statistical models are also discussed. The paper concludes with suggestions for improving warranty data analysis.

    Boosting insights in insurance tariff plans with tree-based machine learning methods

    Pricing actuaries typically operate within the framework of generalized linear models (GLMs). With the upswing of data analytics, our study puts the focus on machine learning methods to develop full tariff plans built from both the frequency and the severity of claims. We adapt the loss functions used in the algorithms so that the specific characteristics of insurance data are carefully incorporated: highly unbalanced count data with excess zeros and varying exposure on the frequency side, combined with scarce but potentially long-tailed data on the severity side. A key requirement is the need for transparent and interpretable pricing models which are easily explainable to all stakeholders. We therefore focus on machine learning with decision trees: starting from simple regression trees, we work towards more advanced ensembles such as random forests and boosted trees. We show how to choose the optimal tuning parameters for these models in an elaborate cross-validation scheme, present visualization tools to obtain insights from the resulting models, and evaluate the economic value of these new modelling approaches. Boosted trees outperform the classical GLMs, allowing the insurer to form profitable portfolios and to guard against potential adverse risk selection.
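The loss adaptation described above can be illustrated with a hand-rolled boosting loop: each round fits a depth-one split and applies the exact Poisson leaf update log(observed / expected), with exposure entering the expected counts. This is a toy sketch under assumed data (one rating factor, a two-level true claim rate), not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
x = rng.uniform(0.0, 1.0, n)             # one rating factor
exposure = rng.uniform(0.5, 1.0, n)      # policy exposure in years
true_rate = np.where(x < 0.5, 0.1, 0.4)  # assumed two-level claim frequency
y = rng.poisson(true_rate * exposure)    # observed claim counts

def poisson_deviance(y, mu):
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(y > 0, y * np.log(y / mu), 0.0)
    return 2.0 * np.mean(term - (y - mu))

# Score F is the log rate per unit exposure; start at the portfolio rate
F = np.full(n, np.log(y.sum() / exposure.sum()))

for _ in range(20):
    best = None
    for t in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left = x <= t
        update = np.zeros(n)
        for side in (left, ~left):
            observed = y[side].sum()
            expected = (exposure[side] * np.exp(F[side])).sum()
            update[side] = np.log(max(observed, 1e-9) / expected)
        candidate = F + 0.3 * update   # 0.3 = shrinkage / learning rate
        dev = poisson_deviance(y, exposure * np.exp(candidate))
        if best is None or dev < best[0]:
            best = (dev, candidate)
    F = best[1]

rate_low = np.exp(F[x < 0.5]).mean()
rate_high = np.exp(F[x >= 0.5]).mean()
```

Using Poisson deviance with an exposure offset, rather than squared error, is exactly the kind of loss adaptation the abstract refers to; the boosted scores recover the two underlying rate levels.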

    Pricing financial and insurance products in the multivariate setting

    In finance and insurance there is often the need to construct multivariate distributions to take into account more than one source of risk, where such risks cannot be assumed to be independent. In the course of this thesis we explore three candidate models for capturing the dependence between multiple sources of risk: copula models, the trivariate reduction scheme, and mixtures. This thesis contains the results of three projects. The first is in financial mathematics, more precisely on the pricing of financial derivatives (multi-asset options) which depend on multiple underlying assets, where we construct the dependence between such assets using copula models and the trivariate reduction scheme. The second and third projects are in actuarial mathematics, more specifically on the pricing of the premia that policyholders need to pay in automobile insurance when more than one type of claim is considered. We carry out the pricing using all the information available about the characteristics of the policyholders and their cars (a priori ratemaking) and about the number of claims of each type in which the policyholders have been involved (a posteriori ratemaking). In both projects we model the dependence between the multiple types of claims using mixture distributions/regression models: each type of claim is modelled with its own distribution/regression model, but with a common heterogeneity factor following a mixing distribution/regression model that is responsible for the dependence between the claim types. In the second project we present a new model (the bivariate Negative Binomial-Inverse Gaussian regression model) and in the third a new family of models (the bivariate mixed Poisson regression models with varying dispersion), both as suitable alternatives to the classically used bivariate mixed Poisson regression models.
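The common-heterogeneity mechanism can be sketched in a few lines: a shared Gamma frailty multiplies the Poisson means of both claim types, which induces positive correlation between the counts even though each is conditionally Poisson. The rates and Gamma parameters below are illustrative assumptions, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# Common heterogeneity factor: Gamma frailty with mean 1, shared by both claim types
theta = rng.gamma(2.0, 0.5, n)

# Each claim type is Poisson given theta; the shared frailty induces dependence
n1 = rng.poisson(0.3 * theta)
n2 = rng.poisson(0.5 * theta)
corr_shared = np.corrcoef(n1, n2)[0, 1]

# Counterpart with separate frailties: the correlation vanishes
n2_indep = rng.poisson(0.5 * rng.gamma(2.0, 0.5, n))
corr_indep = np.corrcoef(n1, n2_indep)[0, 1]
```

The same construction underlies mixed Poisson regression models: replacing the fixed rates 0.3 and 0.5 with covariate-driven means gives the a priori ratemaking component.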

    An investigation of estimation performance for a multivariate Poisson-gamma model with parameter dependency

    Statistical analysis can be overly reliant on naive assumptions of independence between different data generating processes. This results in greater uncertainty when estimating the underlying characteristics of processes, as dependency creates an opportunity to boost the effective sample size by incorporating more data into the analysis. However, this assumes that the dependency has been appropriately specified, as mis-specified dependency can provide misleading information from the data. The main aim of this research is to investigate the impact of incorporating dependency into the data analysis. Our motivation for this work concerns estimating the reliability of items, and as such we have restricted our investigation to homogeneous Poisson processes (HPP), which can be used to model the rate of occurrence of events such as failures. In an HPP, dependency between rates can occur for numerous reasons: similarity in mechanical designs, failure occurrence due to a common management culture, or comparable failure counts across machines for the same failure modes. Multiple types of dependencies are considered. Dependencies can take different forms, such as simple linear dependency measured through the Pearson correlation, rank dependencies which capture non-linear dependencies, and tail dependencies where the strength of the dependency may be stronger in extreme events than in more moderate ones. The estimation of the measure of dependency between correlated processes can be challenging. We ground the research in a Bayes or empirical Bayes inferential framework, where uncertainty in the actual rate of occurrence of a process is modelled with a prior probability distribution. We take the prior to be a Gamma distribution, given its flexibility and its conjugacy with the Poisson process.
    For dependency modelling between processes we consider copulas, which are a convenient and flexible way of capturing a variety of different dependency characteristics between distributions. We use a multivariate Poisson-Gamma probability model: the Poisson process captures aleatory uncertainty (the inherent variability in the data), whereas the Gamma prior describes the epistemic uncertainty. By pooling processes with correlated underlying mean rates we are able to incorporate data from these processes into the inference and reduce the estimation error.
    Three key research themes are investigated in this thesis. First, to investigate the value of reducing estimation error by incorporating dependency within the analysis, via theoretical analysis and simulation experiments. We show that correctly accounting for dependency can significantly reduce the estimation error. The findings should inform analysts a priori as to whether it is worth pursuing a more complex analysis for which the dependency parameter needs to be elicited. Second, to examine the consequences of mis-specifying the degree and form of dependency, through controlled simulation experiments. We show the relative robustness of different ways of modelling the dependency using copula and Bayesian methods. The findings should inform analysts about the sensitivity of their modelling choices. Third, to show how different methods for representing dependency can be operationalised, through an industry case study. We show the consequences for a simple decision problem associated with the provision of spare parts to maintain operation of the industrial process when dependency between the event rates of the machines is appropriately modelled rather than the machines being treated as independent processes.
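One way to picture the Poisson-Gamma setup is to simulate two correlated failure rates via a Gaussian copula over Gamma marginals, then compare the conjugate posterior mean against the raw per-process MLE. The prior parameters, copula correlation, and observation window below are illustrative assumptions; the sketch also sidesteps the harder problem the thesis addresses (pooling data across correlated processes) and only shows the conjugate shrinkage step.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
m = 500                      # machines, each with two correlated failure rates
alpha, beta = 2.0, 4.0       # assumed Gamma prior: shape 2, rate 4 (mean 0.5)
rho = 0.7                    # assumed Gaussian-copula correlation

# Gaussian copula over Gamma marginals -> correlated true rates
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=m)
rates = stats.gamma.ppf(stats.norm.cdf(z), a=alpha, scale=1.0 / beta)
rate_corr = np.corrcoef(rates[:, 0], rates[:, 1])[0, 1]

# Observe Poisson counts over a window of t time units
t = 10.0
y = rng.poisson(rates * t)

mle = y / t                        # per-process maximum likelihood estimate
post = (alpha + y) / (beta + t)    # Poisson-Gamma conjugate posterior mean

mse_mle = np.mean((mle - rates) ** 2)
mse_post = np.mean((post - rates) ** 2)
```

When the prior is correctly specified, the shrunken posterior means beat the MLEs in mean squared error, which is the estimation-error reduction the first research theme quantifies.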

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, located in Portugal, is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions concerning efficiency improvement are made for each hotel studied.
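The frontier idea can be sketched with corrected OLS (COLS), a deterministic cousin of SFA: fit OLS to log output, shift the intercept so the line envelops the data, and read efficiencies off the gap to the frontier. A full SFA maximum-likelihood fit would additionally separate the symmetric noise v from the one-sided inefficiency u, which this sketch does not; all numbers are simulated assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000

x = rng.uniform(1.0, 5.0, n)            # log input (e.g. rooms, labour)
v = rng.normal(0.0, 0.1, n)             # symmetric measurement error
u = np.abs(rng.normal(0.0, 0.3, n))     # one-sided inefficiency (half-normal)
y = 1.0 + 0.8 * x + v - u               # log output sits below the frontier

# COLS: ordinary least squares, then shift the intercept to envelop the data
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
b0_frontier = b0 + resid.max()

# Technical efficiency relative to the estimated frontier, in (0, 1]
eff = np.exp(y - (b0_frontier + b1 * x))
```

Ranking units by `eff` mimics the paper's efficiency ranking; SFA refines this by attributing part of each residual to noise rather than inefficiency.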