5 research outputs found

    Lapse risk modelling in insurance: a Bayesian mixture approach

    This paper focuses on modelling surrender time for policyholders in the context of life insurance. In this setting, a high lapse rate is often observed in the first months of a contract, followed by a decrease after some months. The modelling of the time to cancellation must account for this specific behaviour. Another stylised fact is that policies which are not cancelled in the study period are considered censored. To account for both censoring and heterogeneous lapse rates, this work assumes a Bayesian survival model with a mixture of regressions. The inference is based on data augmentation, allowing for fast computations even for data sets of over a million clients. Moreover, scalable point estimation based on the EM algorithm is also presented. An illustrative example emulates typical behaviour for life insurance contracts, and a simulation study investigates the properties of the proposed model. In particular, the observed censoring in the insurance context might be up to 50% of the data, which is very unusual for survival models in other fields such as epidemiology. This aspect is exploited in our simulation study.

    BayesMortalityPlus: A package in R for Bayesian graduation of mortality modelling

    The BayesMortalityPlus package provides a framework for modelling and predicting mortality data. The package includes tools for the construction of life tables based on Heligman-Pollard laws, and also on dynamic linear smoothers. The modelling is flexible: the response variable may be modelled as Poisson, Binomial or Gaussian. If temporal data are available, the package provides a Bayesian implementation of the well-known Lee-Carter model that allows for estimation, projection of mortality over time, and assessment of the uncertainty of any linear or nonlinear function of the parameters, such as life expectancy. Illustrations show the capability of the proposed package to model mortality data.

    Bayesian cross-validation of geostatistical models

    The problem of validating or criticizing models for georeferenced data is challenging inasmuch as conclusions may be sensitive to the partition of data into training and validation cases. This is an obvious issue with the basic validation scheme, which selects a subset of the data to leave out of estimation and then predicts it with an assumed model. In this setup, only a few out-of-sample locations are usually selected to validate the model. On the other hand, the cross-validation approach, which considers several possible configurations of data divided into training and validation observations, is an appealing alternative, but it can be computationally demanding, as the estimation of parameters usually requires computationally intensive methods. The purpose of this work is to use cross-validation techniques to choose between competing models and to assess the goodness of fit of spatial models in different regions of the spatial domain. We consider the sampling design for selecting the training and validation sets by assigning a probability distribution to the possible data partitions. To deal with the computational burden of cross-validation, we estimate discrepancy functions in a computationally efficient manner based on the importance weighting of posterior samples. Furthermore, we propose a stratified cross-validation scheme to take into account spatial heterogeneity, reducing the total variance of estimated predictive discrepancy measures. We also illustrate the advantages of our proposal with simulated examples of homogeneous and inhomogeneous spatial processes and with an application to a rainfall dataset in Rio de Janeiro.

    NEOTROPICAL XENARTHRANS: a data set of occurrence of xenarthran species in the Neotropics

    Xenarthrans—anteaters, sloths, and armadillos—have essential functions for ecosystem maintenance, such as insect control and nutrient cycling, playing key roles as ecosystem engineers. Because of habitat loss and fragmentation, hunting pressure, and conflicts with domestic dogs, these species have been threatened locally, regionally, or even across their full distribution ranges. The Neotropics harbor 21 species of armadillos, 10 anteaters, and 6 sloths. Our data set includes the families Chlamyphoridae (13), Dasypodidae (7), Myrmecophagidae (3), Bradypodidae (4), and Megalonychidae (2). We have no occurrence data on Dasypus pilosus (Dasypodidae). Regarding Cyclopedidae, until recently, only one species was recognized, but new genetic studies have revealed that the group is represented by seven species. In this data paper, we compiled a total of 42,528 records of 31 species, represented by occurrence and quantitative data, totaling 24,847 unique georeferenced records. The geographic range is from the southern United States, Mexico, and Caribbean countries at the northern portion of the Neotropics, to the austral distribution in Argentina, Paraguay, Chile, and Uruguay. Regarding anteaters, Myrmecophaga tridactyla has the most records (n = 5,941), and Cyclopes sp. have the fewest (n = 240). The armadillo species with the most data is Dasypus novemcinctus (n = 11,588), and the fewest data are recorded for Calyptophractus retusus (n = 33). With regard to sloth species, Bradypus variegatus has the most records (n = 962), and Bradypus pygmaeus has the fewest (n = 12). Our main objective with Neotropical Xenarthrans is to make occurrence and quantitative data available to facilitate more ecological research, particularly if we integrate the xenarthran data with other data sets of Neotropical Series that will become available very soon (i.e., Neotropical Carnivores, Neotropical Invasive Mammals, and Neotropical Hunters and Dogs). 
    Therefore, studies on trophic cascades, hunting pressure, habitat loss, fragmentation effects, species invasion, and climate change effects will be possible with the Neotropical Xenarthrans data set. Please cite this data paper when using its data in publications. We also request that researchers and teachers inform us of how they are using these data.