
    Multivariate phase type distributions - Applications and parameter estimation

    The best-known univariate probability distribution is the normal distribution. It is used throughout the literature in a broad range of applications. Where the normal distribution is not a sensible choice, well-understood alternatives are available, many of which belong to the class of phase type distributions. Phase type distributions have several advantages. They are versatile in the sense that they can approximate any given probability distribution on the positive reals. General probabilistic results exist for the entire class, allowing for different estimation methods for the whole class or for its subclasses. These attributes make this class of distributions an interesting alternative to the normal distribution.
    When facing multivariate problems, the only general distribution that allows for estimation and statistical inference is the multivariate normal distribution. Unfortunately, little is known about the general class of multivariate phase type distributions. Given the results on parameter estimation and inference theory for univariate phase type distributions, the class of multivariate phase type distributions shows potential for similarly good results.
    My PhD studies were part of Work Package 3 of the UNITE project. The overall goal of the UNITE project is to improve decision support prior to deciding on a project by reducing systematic model bias and by quantifying and reducing model uncertainties. Research has shown that the errors in cost estimates for infrastructure projects clearly do not follow a normal distribution but are skewed towards cost overruns. This skewness can be described using phase type distributions. Cost-benefit analysis assesses potential future projects and depends on reliable cost estimates. The Successive Principle is a group analysis method primarily used for analyzing the cost or duration of medium to large projects. We believe that the mathematical modeling used in the Successive Principle can be improved. We suggest a novel approach for modeling the total duration of a project using a univariate phase type distribution. The model is then extended to capture the correlation between duration and cost estimates using a bivariate phase type distribution. Our model can improve estimates of duration and cost and thereby help project management make optimal decisions.
    The work conducted during my PhD studies aimed at shedding light on the class of multivariate phase type distributions. This thesis contains analytical and numerical results on parameter estimation and inference theory for a family of multivariate phase type distributions. The results can serve as a stepping stone towards a better understanding of multivariate phase type distributions. However, we are far from uncovering the full potential of general multivariate phase type distributions. A deeper understanding of multivariate phase type distributions will open up a broad field of applications. This thesis consists of a summary report and two research papers. The work was carried out in the period 2010 to 2014.
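
As a concrete illustration of what a univariate phase type distribution is, the sketch below draws samples as the time until absorption of a finite-state Markov jump process. This is the standard textbook construction, not code from the thesis; the function name `sample_phase_type` and the Erlang(2) example parameters are assumptions chosen for illustration.

```python
import random


def sample_phase_type(alpha, T, rng):
    """Draw one sample: the absorption time of a Markov jump process.

    alpha: initial probabilities over the transient phases (assumed to sum to 1).
    T: sub-intensity matrix as a list of lists; off-diagonal entries are
       transition rates between phases, and each row sums to <= 0.
    rng: a random.Random instance.
    """
    n = len(alpha)
    # Pick the starting phase according to alpha.
    state = rng.choices(range(n), weights=alpha)[0]
    total_time = 0.0
    while True:
        # Holding time in the current phase is exponential with rate -T[state][state].
        total_time += rng.expovariate(-T[state][state])
        # Whatever rate the row of T does not account for leads to absorption.
        exit_rate = max(0.0, -sum(T[state]))
        weights = [T[state][j] if j != state else 0.0 for j in range(n)]
        weights.append(exit_rate)  # index n stands for the absorbing state
        nxt = rng.choices(range(n + 1), weights=weights)[0]
        if nxt == n:
            return total_time
        state = nxt


# Hypothetical example: Erlang(2) with rate 1, i.e. two exponential phases in series.
rng = random.Random(0)
alpha = [1.0, 0.0]
T = [[-1.0, 1.0], [0.0, -1.0]]
samples = [sample_phase_type(alpha, T, rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # should be close to the exact mean of 2
```

Richer choices of `alpha` and `T` (more phases, branching, feedback) give the flexibility the abstract refers to when it says the class can approximate any distribution on the positive reals.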

    Ruin problems for risk processes with dependent phase-type claims

    We consider continuous-time risk processes in which the claim sizes follow dependent, non-identically distributed phase-type distributions. The class of distributions we propose is easy to characterize and allows the dependence between claims to be incorporated in a simple and intuitive way. It is also designed to facilitate the study of the risk processes by using a Markov-modulated fluid embedding technique. Using this technique, we obtain simple recursive procedures to determine the joint distribution of the time of ruin, the deficit at ruin and the number of claims before ruin. We also obtain some bounds for the ultimate ruin probability. Finally, we provide a few examples of multivariate phase-type distributions and use them for numerical illustration.

    Multivariate phase-type theory for the site frequency spectrum

    Linear functions of the site frequency spectrum (SFS) play a major role in understanding and investigating genetic diversity. Estimators of the mutation rate (e.g. based on the total number of segregating sites or the average number of pairwise differences) and tests for neutrality (e.g. Tajima's D) are perhaps the most well-known examples. The distribution of linear functions of the SFS is important for constructing confidence intervals for the estimators and for determining significance thresholds for neutrality tests. These distributions are often approximated using simulation procedures. In this paper we use multivariate phase-type theory to specify, characterize and calculate the distribution of linear functions of the site frequency spectrum. In particular, we show that many of the classical estimators of the mutation rate are distributed according to a discrete phase-type distribution. Neutrality tests, however, are generally not discrete phase-type distributed. For neutrality tests we derive the probability generating function using continuous multivariate phase-type theory, and numerically invert the function to obtain the distribution. A main result is an analytically tractable formula for the probability generating function of the SFS. A software implementation of the phase-type methodology is available in the R package phasty, and R code for reproducing our results is available as an accompanying vignette.
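
To make "linear functions of the SFS" concrete, the sketch below computes the two classical mutation-rate estimators mentioned in the abstract, Watterson's estimator (based on the number of segregating sites) and the pairwise-difference estimator, as weighted sums of the spectrum. It uses the standard textbook formulas rather than code from the paper or from the phasty package; the function names and the example spectrum are assumptions, and the SFS is taken to be unfolded, with `sfs[i - 1]` counting sites where the derived allele appears in `i` of `n` sequences.

```python
from math import comb


def watterson_theta(sfs):
    """Watterson's estimator: segregating sites divided by the harmonic number a_n."""
    n = len(sfs) + 1  # sample size implied by an unfolded SFS of length n - 1
    a_n = sum(1.0 / i for i in range(1, n))
    return sum(sfs) / a_n


def pairwise_theta(sfs):
    """Average number of pairwise differences, written as a linear function of the SFS."""
    n = len(sfs) + 1
    weighted = sum(i * (n - i) * xi for i, xi in enumerate(sfs, start=1))
    return weighted / comb(n, 2)


# Hypothetical spectrum for n = 4 sequences: 4 singletons, 2 doubletons, 1 tripleton.
sfs = [4, 2, 1]
print(watterson_theta(sfs), pairwise_theta(sfs))
```

Both estimators have the form of a fixed weight vector applied to the SFS, which is the structure the abstract's distributional results build on. Tajima's D is based on the difference between the two estimators, scaled by an estimate of its standard deviation.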

    Asymptotic Distributions of Largest Pearson Correlation Coefficients under Dependent Structures

    Given a random sample from a multivariate normal distribution whose covariance matrix is a Toeplitz matrix, we study the largest off-diagonal entry of the sample correlation matrix. Assuming the multivariate normal distribution has the covariance structure of an auto-regressive sequence, we establish a phase transition in the limiting distribution of the largest off-diagonal entry. We show that the limiting distributions are of Gumbel type (with different parameters) depending on how large or small the parameter of the auto-regressive sequence is. In the critical case, we show that the limiting distribution is that of the maximum of two independent Gumbel-distributed random variables. This phase transition establishes the exact threshold at which the auto-regressive covariance structure behaves differently from its counterpart with the covariance matrix equal to the identity. Assuming the covariance matrix is a general Toeplitz matrix, we obtain the limiting distribution of the largest entry in the ultra-high-dimensional setting: it is a weighted sum of two independent random variables, one normal and the other following a Gumbel-type law. The counterpart of the non-Gaussian case is also discussed. As an application, we study a high-dimensional covariance testing problem.

    Phase I and Phase II control charts for the variance and generalized variance

    By extending the results of Human, Chakraborti and Smit (2010), Phase I control charts are derived for the generalized variance when the mean vector and covariance matrix of multivariate normally distributed data are unknown and estimated from m independent samples, each of size n. In Phase II, predictive distributions based on a Bayesian approach are used to construct Shewhart-type control limits for the variance and generalized variance. The posterior distribution is obtained by combining the likelihood (the observed data in Phase I) and the uncertainty of the unknown parameters via the prior distribution. Using the posterior distribution, the unconditional predictive density functions are derived.

    Actuarial applications of multivariate phase-type distributions : model calibration and credibility

    Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal.