    Dependence modeling and machine learning in the context of individual reserving and solvency in property and casualty (P&C) actuarial science

    Insurance companies play an important role in national economies through their substantial participation in the stock, bond, and real estate markets, which makes preserving their solvency essential. The particular production cycle of insurance poses specific challenges for actuaries and risk managers. This thesis aims to develop approaches and algorithms that help address some of the problems arising in the reserving and solvency operations of an insurance company. The preliminary concepts for these contributions are presented in the introduction. Current reserving practice relies on deterministic and stochastic aggregate methods. These traditional models, based on aggregate information, have been very successful, as evidenced by the large number of related actuarial articles and documents. However, because individual claims information is lost, they have limitations in providing robust and realistic estimates, especially in changing environments. In this context, individual reserving models are a promising alternative. Drawing on recent research, in Chapter 1 we propose an individual reserving model based on a recurrent neural network. The network is flexible enough to accommodate several structures of detailed claims datasets and can incorporate both static and dynamic information. In several case studies with simulated and real datasets, the proposed network outperforms the aggregate chain-ladder model.
Determining the capital requirements for a portfolio relies on a good knowledge of the marginal distributions and of the dependence structures linking the individual risks. In Chapters 2 and 3, we focus on dependence modeling and on the estimation of risk measures. Chapter 2 presents an analysis that accounts for extreme dependence structures. For a two-risk portfolio, we are particularly interested in extreme negative dependence (antimonotonicity), which has been studied less in the literature than extreme positive dependence (comonotonicity). We derive explicit expressions for risk measures of the sum of a pair of antimonotonic random variables for three families of distributions. These explicit expressions are useful, in particular, for quantifying the diversification benefit for antimonotonic risks.
For problems involving several lines of business, many researchers and practitioners over the last decade have turned to copula theory, which provides a flexible tool for modeling the dependence structure between random variables that may represent, for example, claim costs for insurance contracts. Inspired by recent research, in Chapter 3 we define a new family of hierarchical copulas. The proposed construction is based on a multivariate exponential mixture distribution whose common vector is obtained by a top-down convolution of independent random variables. We propose a structure determination algorithm based on rank correlation measures, while parameter estimation is based on a composite likelihood. The flexibility and usefulness of this family of copulas are demonstrated through two real case studies.
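
As an illustration of the kind of explicit expression mentioned for Chapter 2, the sketch below works through one simple case that is an assumption of this example and is not taken from the thesis: two antimonotonic risks with unit exponential margins, X1 = -ln(1 - U) and X2 = -ln(U) for a uniform U. Their sum is S = -ln(U(1 - U)), so P(S <= s) = sqrt(1 - 4 exp(-s)) for s >= ln 4, and hence VaR_kappa(S) = ln(4 / (1 - kappa^2)). The function names are hypothetical; the code checks the closed form against simulation and compares it with the comonotonic case, where VaR is additive.

```python
import numpy as np


def var_antimonotonic_exp(kappa: float) -> float:
    """Closed-form VaR of X1 + X2 for antimonotonic Exp(1) risks (illustrative case)."""
    return float(np.log(4.0 / (1.0 - kappa**2)))


def var_comonotonic_exp(kappa: float) -> float:
    """VaR of X1 + X2 for comonotonic Exp(1) risks: the sum of the marginal VaRs."""
    return float(-2.0 * np.log(1.0 - kappa))


def simulate_antimonotonic_sum(n: int, seed: int = 0) -> np.ndarray:
    """Simulate S = F^{-1}(U) + F^{-1}(1 - U) with Exp(1) quantile function F^{-1}."""
    rng = np.random.default_rng(seed)
    # Draw U on (0, 1) so that both logarithms below are finite.
    u = rng.uniform(low=np.finfo(float).tiny, high=1.0, size=n)
    x1 = -np.log(1.0 - u)  # increasing in U
    x2 = -np.log(u)        # decreasing in U, hence antimonotonic with x1
    return x1 + x2


if __name__ == "__main__":
    kappa = 0.99
    s = simulate_antimonotonic_sum(200_000, seed=1)
    exact = var_antimonotonic_exp(kappa)
    upper = var_comonotonic_exp(kappa)
    print("empirical VaR :", np.quantile(s, kappa))
    print("closed form   :", exact)
    print("comonotonic   :", upper)
    # Diversification benefit relative to the sum of stand-alone VaRs.
    print("benefit       :", 1.0 - exact / upper)
```

Other marginal families would require their own derivation; this example only shows how a closed form for an antimonotonic pair can be validated numerically and turned into a diversification benefit.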

    d-Dimensional dependence functions and Archimax copulas

    Dependence: From classical copula modeling to neural networks

    The development of tools to measure and to model dependence in high-dimensional data is of great interest in a wide range of applications including finance, risk management, bioinformatics and environmental sciences. The copula framework, which allows us to extricate the underlying dependence structure of any multivariate distribution from its univariate marginals, has garnered growing popularity over the past few decades. Within the broader context of this framework, we develop several novel statistical methods and tools for analyzing, interpreting and modeling dependence. In the first half of this thesis, we advance classical copula modeling by introducing new dependence measures and parametric dependence models. To that end, we propose a framework for quantifying dependence between random vectors. Using the notion of a collapsing function, we summarize random vectors by single random variables, referred to as collapsed random variables. In the context of this collapsing function framework, we develop various tools to characterize the dependence between random vectors including new measures of association computed from the collapsed random variables, asymptotic results required to construct confidence intervals for these measures, collapsed copulas to analytically summarize the dependence for certain collapsing functions and a graphical assessment of independence between groups of random variables. We explore several suitable collapsing functions in theoretical and empirical settings. To showcase tools derived from our framework, we present data applications in bioinformatics and finance. Furthermore, we contribute to the growing literature on parametric copula modeling by generalizing the class of Archimax copulas (AXCs) to hierarchical Archimax copulas (HAXCs). AXCs are typically used to model the dependence at non-extreme levels while accounting for any asymptotic dependence between extremes. 
HAXCs then enhance the flexibility of AXCs by their ability to model partial asymmetries. We explore two ways of inducing hierarchies. Furthermore, we present various examples of HAXCs along with their stochastic representations, which are used to establish corresponding sampling algorithms. While the burgeoning research on the construction of parametric copulas has yielded some powerful tools for modeling dependence, the flexibility of these models is already limited in moderately high dimensions and they can often fail to adequately characterize complex dependence structures that arise in real datasets. In the second half of this thesis, we explore utilizing generative neural networks instead of parametric dependence models. In particular, we investigate the use of a type of generative neural network known as the generative moment matching network (GMMN) for two critical dependence modeling tasks. First, we demonstrate how GMMNs can be utilized to generate quasi-random samples from a large variety of multivariate distributions. These GMMN quasi-random samples can then be used to obtain low-variance estimates of quantities of interest. Compared to classical parametric copula methods for multivariate quasi-random sampling, GMMNs provide a more flexible and universal approach. Moreover, we theoretically and numerically corroborate the variance reduction capabilities of GMMN randomized quasi-Monte Carlo estimators. Second, we propose a GMMN–GARCH approach for modeling dependent multivariate time series, where ARMA–GARCH models are utilized to capture the temporal dependence within each univariate marginal time series and GMMNs are used to model the underlying cross-sectional dependence. If the number of marginal time series is large, we embed an intermediate dimension reduction step within our framework. The primary objective of our proposed approach is to produce empirical predictive distributions (EPDs), also known as probabilistic forecasts. 
In turn, these EPDs are also used to forecast certain risk measures, such as value-at-risk. Furthermore, in the context of modeling yield curves and foreign exchange rate returns, we show that the flexibility of our GMMN–GARCH models leads to better EPDs and risk-measure forecasts, compared to classical copula–GARCH models.
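
For readers unfamiliar with GMMNs, the sketch below shows the quantity such networks are typically trained to minimize: the maximum mean discrepancy (MMD) between generated and observed samples. It is a minimal illustration only. The single Gaussian kernel, the fixed bandwidth, and the function names are assumptions of this example, and the thesis may use a different kernel choice or estimator.

```python
import numpy as np


def gaussian_kernel(a: np.ndarray, b: np.ndarray, bandwidth: float = 1.0) -> np.ndarray:
    """Gaussian kernel matrix between the rows of a (n x d) and the rows of b (m x d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def mmd2_biased(x: np.ndarray, y: np.ndarray, bandwidth: float = 1.0) -> float:
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    return float(kxx.mean() + kyy.mean() - 2.0 * kxy.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = rng.normal(size=(300, 2))
    good_generator = rng.normal(size=(300, 2))          # same distribution
    bad_generator = rng.normal(loc=2.0, size=(300, 2))  # shifted distribution
    print("MMD^2, matching samples  :", mmd2_biased(observed, good_generator))
    print("MMD^2, mismatched samples:", mmd2_biased(observed, bad_generator))
```

In a GMMN, a neural network maps a simple input distribution to generated samples and its weights are adjusted to reduce this discrepancy; in a dependence modeling setting, the observed samples would usually be pseudo-observations of the underlying copula.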