
    Multi-Period Credit Default Prediction - A Survival Analysis Approach

    The book deals with the problem of estimating credit default probabilities under a flexible multi-period prediction horizon. Multi-period predictions are naturally desirable because the maturity of loans usually spans several periods. However, single-period models have largely prevailed in the literature so far due to their simplicity. Predicting over multiple periods indeed entails certain challenges that do not arise within a single-period view. Among the main contributions of this work is to show that relatively simple solutions to these challenges are available. From a methodological point of view, a survival analysis approach is used. In a survival analysis context, the time until default (or lifetime) is the central variable under investigation, as opposed to the traditional approach of reducing the information to a binary variable representing the default event. Modeling the time until default has the advantage that both the timing of default events and censored data are utilized. Since both issues gain importance as the prediction horizon grows, it is no coincidence that a survival analysis approach is selected for the multi-period prediction problem. The main results of the work are the following. First, a new index for measuring the predictive accuracy of default predictions is proposed, and its advantages over commonly used indices are shown both theoretically and by an empirical analysis. This is part of the second chapter, which further includes new methods of statistical inference for the new index. The third chapter deals with default prediction models for panel datasets with time-varying covariates. A new approach is developed that is simpler than the models available in the literature so far. In an empirical study of North American public firms, we provide evidence that the proposed approach also delivers more accurate predictions than its competitors. The final chapter examines the problem of assigning default probability estimates to given rating grades. If default events are rare, standard approaches have certain drawbacks. As an alternative, an empirical Bayes approach is presented that mitigates the effects of data sparseness. The new estimator is applied to a comprehensive sample of sovereign bonds. Among the main findings of the empirical part is that capital requirements for sovereign bonds are likely to be underestimated when using standard approaches, but not when using the empirical Bayes estimator.
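As a rough illustration of the survival-analysis idea described above, the sketch below fits a Cox proportional hazards model to simulated firm data and reads multi-period default probabilities off the estimated survival curve. The lifelines library, the covariates (leverage, profitability) and the simulated lifetimes are illustrative assumptions, not the book's actual model or data.

```python
# Minimal sketch: multi-period default probabilities from a survival model.
# The data frame layout and covariates are illustrative assumptions.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "leverage": rng.normal(0.5, 0.2, n),
    "profitability": rng.normal(0.05, 0.03, n),
})
# Synthetic lifetimes: higher leverage -> earlier default; censor at 8 periods.
latent = rng.exponential(scale=10 * np.exp(-2 * (df["leverage"] - 0.5)), size=n)
df["time_to_default"] = np.minimum(latent, 8)
df["default_observed"] = (latent <= 8).astype(int)

cph = CoxPHFitter()
cph.fit(df, duration_col="time_to_default", event_col="default_observed")

# Multi-period default probability for a new obligor: the PD over horizon h
# is 1 - S(h), read off the estimated survival curve.
new_firm = pd.DataFrame({"leverage": [0.7], "profitability": [0.02]})
surv = cph.predict_survival_function(new_firm, times=[1, 2, 3, 5])
pd_by_horizon = 1.0 - surv.iloc[:, 0]
print(pd_by_horizon)  # cumulative default probability at each horizon
```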

    ISBIS 2016: Meeting on Statistics in Business and Industry

    This book includes the abstracts of the talks presented at the 2016 International Symposium on Business and Industrial Statistics, held in Barcelona, June 8-10, 2016, and hosted by the Department of Statistics and Operations Research at the Universitat Politècnica de Catalunya - Barcelona TECH. The meeting took place in the ETSEIB building (Escola Tècnica Superior d'Enginyeria Industrial) at Avda Diagonal 647. The meeting organizers celebrated the continued success of the ISBIS and ENBIS societies, and the meeting drew together the international community of statisticians, both academics and industry professionals, who share the goal of making statistics the foundation for decision making in business and related applications. The Scientific Program Committee consisted of: David Banks, Duke University; Amílcar Oliveira, DCeT - Universidade Aberta and CEAUL; Teresa A. Oliveira, DCeT - Universidade Aberta and CEAUL; Nalini Ravishankar, University of Connecticut; Xavier Tort Martorell, Universitat Politècnica de Catalunya, Barcelona TECH; Martina Vandebroek, KU Leuven; Vincenzo Esposito Vinzi, ESSEC Business School.

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    This paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, all located in Portugal, is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.
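For readers unfamiliar with Stochastic Frontier Analysis, the sketch below estimates a basic production frontier with half-normal inefficiency by maximum likelihood. It uses simulated data and scipy; it is a generic SFA illustration under those assumptions, not the specification or data used for the Teixeira Duarte hotels.

```python
# Minimal sketch: stochastic frontier analysis (production frontier with
# half-normal inefficiency), fitted by maximum likelihood via scipy.
# The simulated data and variable names are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + log(input)
beta_true = np.array([1.0, 0.6])
v = rng.normal(0, 0.2, n)                               # symmetric noise
u = np.abs(rng.normal(0, 0.3, n))                       # inefficiency >= 0
y = X @ beta_true + v - u                               # log(output)

def neg_loglik(params):
    # Aigner-Lovell-Schmidt half-normal log-likelihood.
    beta, log_sv, log_su = params[:2], params[2], params[3]
    sv, su = np.exp(log_sv), np.exp(log_su)
    sigma = np.sqrt(sv**2 + su**2)
    lam = su / sv
    eps = y - X @ beta
    ll = (np.log(2) - np.log(sigma)
          + norm.logpdf(eps / sigma)
          + norm.logcdf(-eps * lam / sigma))
    return -ll.sum()

res = minimize(neg_loglik, x0=np.array([0.0, 0.0, np.log(0.2), np.log(0.2)]),
               method="BFGS")
print("beta:", res.x[:2],
      "sigma_v:", np.exp(res.x[2]), "sigma_u:", np.exp(res.x[3]))
```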

    Interpretable statistics for complex modelling: quantile and topological learning

    As the complexity of our data has increased exponentially over the last decades, so has our need for interpretable features. This thesis revolves around two paradigms for approaching this quest for insight. In the first part we focus on parametric models, where the problem of interpretability can be seen as one of “parametrization selection”. We introduce a quantile-centric parametrization and show the advantages of our proposal in the context of regression, where it bridges the gap between classical generalized linear (mixed) models and increasingly popular quantile methods. The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. As topology can be thought of as a way of characterizing data in terms of their connectivity structure, it allows complex and possibly high-dimensional data to be represented through a few features, such as the number of connected components, loops and voids. We illustrate how the emerging branch of statistics devoted to recovering topological structures in data, Topological Data Analysis, can be exploited both for exploratory and inferential purposes, with a special emphasis on kernels that preserve the topological information in the data. Finally, we show with an application how these two approaches can borrow strength from one another in the identification and description of brain activity from fMRI data from the ABIDE project.
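To make the quantile side of the discussion concrete, the sketch below fits classical quantile regressions with statsmodels on simulated heteroscedastic data. It illustrates the kind of quantile method the thesis builds on; it is not the thesis's own quantile-centric parametrization, and the data and formula are assumptions.

```python
# Minimal sketch: classical quantile regression with statsmodels on simulated
# heteroscedastic data, shown only as background for quantile-based modelling.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 + 0.1 * x, n)   # noise grows with x
df = pd.DataFrame({"x": x, "y": y})

# Fit the conditional median and the 10th / 90th percentiles.
for q in (0.1, 0.5, 0.9):
    fit = smf.quantreg("y ~ x", df).fit(q=q)
    print(f"quantile {q}: intercept={fit.params['Intercept']:.2f}, "
          f"slope={fit.params['x']:.2f}")
```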

    Statistical tools for assessment of spatial properties of mutations observed under the microarray platform

    Mutations are alterations of the DNA nucleotide sequence of the genome. Analyses of the spatial properties of mutations are critical for understanding certain mutational mechanisms relevant to genetic disease, diversity, and evolution. The studies in this thesis focus on two types of mutations: point mutations, i.e., single nucleotide polymorphism (SNP) genotype differences, and mutations in segments, i.e., copy number variations (CNVs). Microarray platforms, such as the Mouse Diversity Genotyping Array (MDGA), detect these mutations genome-wide at lower cost than whole genome sequencing, and are therefore considered as potential screening tools for large populations. Yet, by design, they observe mutations with a high degree of missingness across the genome, which poses challenges for statistical analysis. Three topics are studied in this thesis: the development of formal statistical tools for detecting the existence of point mutation clusters under the microarray platform; the evaluation of the performance of the developed test statistics under various probe designs, in terms of their ability to detect mutation clusters; and the development of formal statistical tools for testing the existence of spatial association between point mutations and mutations in segments. Statistical models such as Poisson point processes and Neyman-Scott processes are used for the distributions of the locations of point mutations under the null and alternative hypotheses. Monte Carlo frameworks are established for statistical inference and for evaluating the power of the proposed test statistics. Tests with desirable performance are identified and recommended as screening tools. These statistical tools can be used to study other genomic events that take the form of point events and events in segments, as well as with microarray platforms other than the MDGA utilized here. Simulated probe sets based on a window-based probe design mimicking that of the MDGA are used to study the effect of various factors in probe design on the performance of the test statistics. Insights are offered for determining key features of such designs, such as probe intensity, when designing a new microarray platform, in order to achieve the desired power for mutation cluster detection.
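The sketch below illustrates the general Monte Carlo testing logic described above: a clustering statistic is computed on observed point events and compared against simulations from a homogeneous Poisson null. The mean nearest-neighbour statistic, the simulated genomic region and the Neyman-Scott-style clustered data are simplifying assumptions, not the specific tests or probe designs developed in the thesis.

```python
# Minimal sketch: Monte Carlo test for clustering of point events (e.g. SNP
# positions) against a homogeneous Poisson null on a 1-D genomic region.
import numpy as np

rng = np.random.default_rng(3)
L = 1_000_000          # length of the genomic region (bp)

def mean_nn_distance(positions):
    """Mean distance from each event to its nearest neighbour."""
    s = np.sort(positions)
    gaps = np.diff(s)
    nn = np.minimum(np.r_[gaps, np.inf], np.r_[np.inf, gaps])
    return nn.mean()

# "Observed" data: clustered events from a Neyman-Scott-style process.
centers = rng.uniform(0, L, 8)
observed = np.concatenate([c + rng.normal(0, 2000, 10) for c in centers])
observed = observed[(observed >= 0) & (observed <= L)]
t_obs = mean_nn_distance(observed)

# Null distribution: uniform (homogeneous Poisson) positions, same count.
n_sim = 999
t_null = np.array([mean_nn_distance(rng.uniform(0, L, observed.size))
                   for _ in range(n_sim)])

# Clustering shortens nearest-neighbour distances, so small t_obs is evidence
# of clustering; one-sided Monte Carlo p-value.
p_value = (1 + np.sum(t_null <= t_obs)) / (n_sim + 1)
print(f"observed statistic = {t_obs:.1f}, Monte Carlo p-value = {p_value:.3f}")
```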

    Bayesian Nonparametric Cross-Study Validation of Prediction Methods

    We consider comparisons of statistical learning algorithms using multiple data sets, via leave-one-in cross-study validation: each of the algorithms is trained on one data set, and the resulting model is then validated on each remaining data set. This poses two statistical challenges that need to be addressed simultaneously. The first is the assessment of study heterogeneity, with the aim of identifying a subset of studies within which algorithm comparisons can be reliably carried out. The second is the comparison of algorithms using the ensemble of data sets. We address both problems by integrating clustering and model comparison. We formulate a Bayesian model for the array of cross-study validation statistics, which defines clusters of studies with similar properties and provides the basis for meaningful algorithm comparison in the presence of study heterogeneity. We illustrate our approach through simulations involving studies with varying severity of systematic errors, and in the context of medical prognosis for patients diagnosed with cancer, using high-throughput measurements of the transcriptional activity of the tumor’s genes.
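The sketch below shows how the array of leave-one-in cross-study validation statistics can be assembled in principle: a learner (here a single logistic regression, for brevity) is trained on one synthetic study and validated on every other, yielding a matrix of AUC values. The studies, model and metric are illustrative assumptions; the Bayesian nonparametric model that clusters studies and compares algorithms on top of this array is not reproduced.

```python
# Minimal sketch: the leave-one-in cross-study validation array.
# Row i, column j holds the AUC of a model trained on study i, tested on study j.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)

def make_study(n, shift):
    """Synthetic 'study': 5 features, outcome driven by the first two."""
    X = rng.normal(shift, 1.0, size=(n, 5))
    logits = 1.2 * X[:, 0] - 0.8 * X[:, 1]
    y = rng.binomial(1, 1 / (1 + np.exp(-logits)))
    return X, y

# Four studies with increasing systematic shift (study heterogeneity).
studies = [make_study(300, s) for s in (0.0, 0.1, 0.5, 1.5)]
k = len(studies)

auc = np.full((k, k), np.nan)
for i, (X_tr, y_tr) in enumerate(studies):
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    for j, (X_te, y_te) in enumerate(studies):
        if i != j:
            auc[i, j] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

print(np.round(auc, 3))   # cross-study validation statistics
```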

    Experimenting with sequential allocation procedures

    In experiments involving subjects, a crucial part is deciding which treatment to allocate to which subject – in other words, constructing the treatment allocation procedure. In a classical experiment, this procedure often simply amounts to randomly assigning subjects to a number of different treatments. Subsequently, when all outcomes have been observed, the resulting data are used to conduct an analysis that was specified a priori. In practice, however, subjects often arrive at an experiment one by one. This allows the data generating process to be viewed differently: instead of considering the subjects as a batch, intermediate data from previous interactions with other subjects can be used to inform the treatment allocation decisions in future interactions. A heavily researched formalization that helps in developing strategies for sequentially allocating subjects is the multi-armed bandit problem. In this thesis, several methods are developed to expedite the use of sequential allocation procedures by (social) scientists in field experiments, building upon the extensive literature on the multi-armed bandit problem. The thesis also introduces and illustrates many (empirical) examples of the usefulness and applicability of sequential allocation procedures in practice.
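As a concrete example of a sequential allocation procedure from the multi-armed bandit literature, the sketch below runs Thompson sampling for a Bernoulli bandit, allocating simulated subjects one by one. Thompson sampling and the success probabilities used here are illustrative choices, not the specific procedures developed in the thesis.

```python
# Minimal sketch: Thompson sampling for a Bernoulli multi-armed bandit.
# Subjects arrive one by one; each is assigned to the treatment whose
# posterior-sampled success rate is highest.
import numpy as np

rng = np.random.default_rng(5)
true_p = np.array([0.15, 0.25, 0.30])   # unknown success rate per treatment
n_arms = len(true_p)
alpha = np.ones(n_arms)                  # Beta(1, 1) prior: successes + 1
beta = np.ones(n_arms)                   #                   failures + 1
allocations = np.zeros(n_arms, dtype=int)

for subject in range(1000):
    # Draw one sample from each arm's posterior and allocate to the best draw.
    samples = rng.beta(alpha, beta)
    arm = int(np.argmax(samples))
    outcome = rng.binomial(1, true_p[arm])   # observe this subject's outcome
    alpha[arm] += outcome
    beta[arm] += 1 - outcome
    allocations[arm] += 1

print("allocations per treatment:", allocations)
print("posterior mean success rates:", np.round(alpha / (alpha + beta), 3))
```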