123 research outputs found

Pair-copula constructions of multiple dependence

Authors: Aas Kjersti; Bakken Henrik; Czado Claudia; Frigessi Arnoldo
Publication date: 01/01/2006

Building on the work of Bedford, Cooke and Joe, we show how multivariate data, which exhibit complex patterns of dependence in the tails, can be modelled using a cascade of pair-copulae, acting on two variables at a time. We use the pair-copula decomposition of a general multivariate distribution and propose a method to perform inference. The model construction is hierarchical in nature, the various levels corresponding to the incorporation of more variables in the conditioning sets, using pair-copulae as simple building blocks. Pair-copula decomposed models also represent a very flexible way to construct higher-dimensional copulae. We apply the methodology to a financial data set. Our approach represents the first step towards the development of an unsupervised algorithm that explores the space of possible pair-copula models and that can also be applied to huge data sets automatically.
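
For readers unfamiliar with the construction, the three-variable case illustrates the idea. The expression below is the standard textbook form of one pair-copula decomposition, not an excerpt from the listed paper. The notation is assumed here for illustration: $f_i$ and $F_i$ are marginal densities and distribution functions, $c_{12}$ and $c_{23}$ are bivariate copula densities, and $c_{13|2}$ is the copula density of the pair $(x_1, x_3)$ conditional on $x_2$.

```latex
f(x_1, x_2, x_3) = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
  \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
  \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
  \cdot c_{13|2}\bigl(F(x_1 \mid x_2), F(x_3 \mid x_2)\bigr)
```

The first two copula terms form the first level of the hierarchy; the conditional term forms the second level, where $x_2$ enters the conditioning set.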

Open Access LMU

Models for construction of multivariate dependence

Authors: Aas Kjersti; Berg Daniel
Publication venue: Matematisk Institutt, Universitetet i Oslo
Publication date: 01/01/2007

In this article we review models for construction of higher-dimensional dependence that have arisen in recent years. A multivariate data set, which exhibits complex patterns of dependence, particularly in the tails, can be modelled using a cascade of lower-dimensional copulae. We examine two such models that differ in their construction of the dependency structure, namely the nested Archimedean constructions and the pair-copula constructions (also referred to as vines). The constructions are compared, and estimation and simulation techniques are examined. The fit of the two constructions is tested on two different four-dimensional data sets, precipitation values and equity returns, using a state-of-the-art copula goodness-of-fit procedure. The nested Archimedean construction is strongly rejected for both our data sets, while the pair-copula construction provides an appropriate fit. Through VaR calculations, we show that the latter does not overfit data, but works very well even out-of-sample.
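
The abstract does not say which simulation technique is examined. As a rough illustration of one common building block, the sketch below draws a pair from a bivariate Clayton copula by conditional inversion. The choice of the Clayton family and the function names `clayton_h`, `clayton_h_inv`, and `sample_clayton_pair` are assumptions made for this example, not taken from the article.

```python
import random


def clayton_h(u2, u1, theta):
    """Conditional CDF F(u2 | u1) of a Clayton copula with theta > 0."""
    a = u1 ** (-theta) + u2 ** (-theta) - 1.0
    return u1 ** (-theta - 1.0) * a ** (-1.0 / theta - 1.0)


def clayton_h_inv(w, u1, theta):
    """Solve F(u2 | u1) = w for u2 (closed form for the Clayton family)."""
    inner = u1 ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0
    return inner ** (-1.0 / theta)


def sample_clayton_pair(theta, rng):
    """Draw (u1, u2) with uniform margins and Clayton dependence."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    u1 = 1.0 - rng.random()
    w = 1.0 - rng.random()
    return u1, clayton_h_inv(w, u1, theta)


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        print(sample_clayton_pair(2.0, rng))
```

In a vine, conditional distribution functions of this kind are applied recursively, level by level, to generate the remaining variables.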

NORA - Norwegian Open Research Archives

Learning Latent Representations of Bank Customers With The Variational Autoencoder

Authors: Aas Kjersti; Jenssen Robert; Kampffmeyer Michael; Mancisidor Rogelio A.
Publication date: 14/03/2019

Learning data representations that reflect the customers' creditworthiness can improve marketing campaigns, customer relationship management, data and process management or the credit risk assessment in retail banks. In this research, we adopt the Variational Autoencoder (VAE), which has the ability to learn latent representations that contain useful information. We show that it is possible to steer the latent representations in the latent space of the VAE using the Weight of Evidence and forming a specific grouping of the data that reflects the customers' creditworthiness. Our proposed method learns a latent representation of the data, which shows a well-defined clustering structure capturing the customers' creditworthiness. These clusters are well suited for the aforementioned banking activities. Further, our methodology generalizes to new customers, captures high-dimensional and complex financial data, and scales to large data sets. Comment: arXiv admin note: substantial text overlap with arXiv:1806.0253
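
The abstract does not describe how the Weight of Evidence (WoE) enters the model, so the sketch below shows only the quantity itself, under assumptions. It uses the common convention WoE = ln(share of non-defaulters in a bin / share of defaulters in a bin); some texts use the opposite sign. The function name `weight_of_evidence` and the `smoothing` parameter, which guards against empty cells, are hypothetical choices for this example.

```python
import math
from collections import Counter


def weight_of_evidence(bins, defaults, smoothing=0.5):
    """Return {bin: WoE}, with WoE = ln(good share / bad share) per bin."""
    good, bad = Counter(), Counter()
    for b, d in zip(bins, defaults):
        if d:
            bad[b] += 1   # customer defaulted
        else:
            good[b] += 1  # customer did not default
    labels = sorted(set(good) | set(bad))
    # Additive smoothing keeps every share strictly positive.
    total_good = sum(good.values()) + smoothing * len(labels)
    total_bad = sum(bad.values()) + smoothing * len(labels)
    woe = {}
    for b in labels:
        good_share = (good[b] + smoothing) / total_good
        bad_share = (bad[b] + smoothing) / total_bad
        woe[b] = math.log(good_share / bad_share)
    return woe


if __name__ == "__main__":
    bins = ["low", "low", "low", "high", "high", "high"]
    defaults = [0, 0, 1, 1, 1, 0]
    print(weight_of_evidence(bins, defaults))
```

Under this convention, bins with a positive WoE contain relatively more non-defaulting customers, which gives a natural ordering for grouping customers by creditworthiness.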

arXiv.org e-Print Archive

Munin - Open Research Archive

Deep Generative Models for Reject Inference in Credit Scoring

Authors: Aas Kjersti; Jenssen Robert; Kampffmeyer Michael; Mancisidor Rogelio A.
Publication date: 12/04/2019

Credit scoring models based on accepted applications may be biased and their consequences can have a statistical and economic impact. Reject inference is the process of attempting to infer the creditworthiness status of the rejected applications. In this research, we use deep generative models to develop two new semi-supervised Bayesian models for reject inference in credit scoring, in which we model the data generating process to be dependent on a Gaussian mixture. The goal is to improve the classification accuracy in credit scoring models by adding rejected applications. Our proposed models infer the unknown creditworthiness of the rejected applications by exact enumeration of the two possible outcomes of the loan (default or non-default). The efficient stochastic gradient optimization technique used in deep generative models makes our models suitable for large data sets. Finally, the experiments in this research show that our proposed models perform better than classical and alternative machine learning models for reject inference in credit scoring.
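
The abstract names exact enumeration over the two loan outcomes but gives no formula. The sketch below follows the general pattern used in semi-supervised deep generative models, where the objective for an unlabeled example is the expectation of a per-outcome lower bound under the classifier's probabilities, plus the entropy of those probabilities. Whether the listed paper uses exactly this form is an assumption; the function name `enumerate_binary_outcome` and its arguments are hypothetical.

```python
import math


def enumerate_binary_outcome(p_default, bound_if_default, bound_if_repaid):
    """Marginalize a per-outcome bound over y in {default, non-default}.

    p_default        -- classifier probability q(y = default | x)
    bound_if_default -- lower bound evaluated as if the loan had defaulted
    bound_if_repaid  -- lower bound evaluated as if the loan had been repaid
    """
    probs = (p_default, 1.0 - p_default)
    bounds = (bound_if_default, bound_if_repaid)
    # Exact enumeration: only two outcomes, so the expectation is a two-term sum.
    expected = sum(p * b for p, b in zip(probs, bounds))
    # Entropy of q(y | x); zero-probability terms contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return expected + entropy


if __name__ == "__main__":
    print(enumerate_binary_outcome(0.5, -2.0, -4.0))
```

Because the outcome is binary, the sum is exact and needs no sampling over the label.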

arXiv.org e-Print Archive

Munin - Open Research Archive

Multimodal Generative Models for Bankruptcy Prediction Using Textual Data

Authors: Aas Kjersti; Mancisidor Rogelio A.
Publication date: 24/02/2024

Textual data from financial filings, e.g., the Management's Discussion & Analysis (MDA) section in Form 10-K, has been used to improve the prediction accuracy of bankruptcy models. In practice, however, we cannot obtain the MDA section for all public companies, which limits the use of MDA data in traditional bankruptcy models, as they need complete data to make predictions. The two main reasons for the lack of MDA are: (i) not all companies are obliged to submit the MDA and (ii) technical problems arise when crawling and scraping the MDA section. To overcome this limitation, this research introduces the Conditional Multimodal Discriminative (CMMD) model that learns multimodal representations that embed information from accounting, market, and textual data modalities. The CMMD model needs a sample with all data modalities for model training. At test time, the CMMD model only needs access to accounting and market modalities to generate multimodal representations, which are further used to make bankruptcy predictions and to generate words from the missing MDA modality. With this novel methodology, it is realistic to use textual data in bankruptcy prediction models, since accounting and market data are available for all companies, unlike textual data. The empirical results of this research show that if financial regulators, or investors, were to use traditional models using MDA data, they would only be able to make predictions for 60% of the companies. Furthermore, the classification performance of our proposed methodology is superior to that of a large number of traditional classifier models, taking into account all the companies in our sample.

arXiv.org e-Print Archive

Discriminative Multimodal Learning via Conditional Priors in Generative Models

Authors: Aas Kjersti; Jenssen Robert; Kampffmeyer Michael; Mancisidor Rogelio A.
Publication date: 21/01/2023

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, but where some modalities and labels required for downstream tasks are missing. We show, in this scenario, that the variational lower bound limits mutual information between joint representations and missing modalities. To counteract these problems, we introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities. Extensive experimentation demonstrates the benefits of our proposed model: empirical results show that it achieves state-of-the-art results in representative problems such as downstream classification, acoustic inversion, and image and annotation generation.

arXiv.org e-Print Archive