14 research outputs found

    Bayesian Nonparametric Modelling of the Return Distribution with Stochastic Volatility

    This paper presents a method for Bayesian nonparametric analysis of the return distribution in a stochastic volatility model. The distribution of the logarithm of the squared return is flexibly modelled using an infinite mixture of Normal distributions. This allows efficient Markov chain Monte Carlo methods to be developed. Links between the return distribution and the distribution of the logarithm of the squared returns are discussed. The method is applied to simulated data, one asset return series and one stock index return series. We find that estimates of volatility using the model can differ dramatically from those using a Normal return distribution if there is evidence of a heavy-tailed return distribution.

    Bayesian nonparametric modelling of financial data

    This thesis presents a class of discrete time univariate stochastic volatility models using Bayesian nonparametric techniques. The models introduced include not only the basic stochastic volatility model, but also the heavy-tailed model using a scale mixture of Normals and the leverage model. The aim is to flexibly capture the distribution of the logarithm of the squared return under these models using an infinite mixture of Normals. Parameter estimates for these models will be obtained using Markov chain Monte Carlo methods and the Kalman filter. Links between the return distribution and the distribution of the logarithm of the squared returns will be established. The one-step ahead predictive ability of the model will be measured using log-predictive scores. Asset returns, stock indices and exchange rates will be fitted using the developed methods.
    EThOS - Electronic Theses Online Service, GB, United Kingdom

    A Bayesian algorithm for detecting differentially expressed proteins and its application in breast cancer research

    The presence of considerable noise and missing data points makes the analysis of mass-spectrometry (MS) based proteomic data a challenging task. The missing values in MS data are caused by the inability of MS machines to reliably detect proteins whose abundances fall below the detection limit. We developed a Bayesian algorithm that exploits this knowledge and uses missing data points as a complementary source of information to the observed protein intensities in order to find differentially expressed proteins by analysing MS based proteomic data. We compared its accuracy with many other methods using several simulated datasets. It consistently outperformed other methods. We then used it to analyse proteomic screens of a breast cancer (BC) patient cohort. It revealed large differences between the proteomic landscapes of triple negative and Luminal A, which are the most and least aggressive types of BC. Unexpectedly, the majority of these differences could be attributed to the direct transcriptional activity of only seven transcription factors, some of which are known to be inactive in triple negative BC. We also identified two new proteins which significantly correlated with the survival of BC patients, and therefore may have potential diagnostic/prognostic value.
    European Commission - Seventh Framework Programme (FP7); Science Foundation Ireland; The Irish Cancer Society

    A Bayesian semiparametric model for volatility with a leverage effect

    A Bayesian semiparametric stochastic volatility model for financial data is developed. The model nonparametrically estimates the return distribution from the data, allowing for stylized facts such as heavy tails of the distribution of returns whilst also allowing for correlation between the returns and changes in volatility, which is usually termed the leverage effect. An efficient MCMC algorithm is described for inference. The model is applied to simulated data and two real data sets. The results of fitting the model to these data show that choosing a parametric return distribution can have a substantial effect on inference about the leverage effect.

    Comparison of different statistical approaches for urinary peptide biomarker detection in the context of coronary artery disease

    Background: When combined with a clinical outcome variable, the size, complexity and nature of mass-spectrometry proteomics data impose great statistical challenges in the discovery of potential disease-associated biomarkers. The purpose of this study was thus to evaluate the effectiveness of different statistical methods applied for urinary proteomic biomarker discovery and different methods of classifier modelling in respect of the diagnosis of coronary artery disease in 197 study subjects and the prognostication of acute coronary syndromes in 368 study subjects. Results: Applying the Wilcoxon rank sum test, t-score, cat-score, binary discriminant analysis and random forests to the discovery sub-cohorts, which comprised 2/3 of the study subjects, yielded widely differing numbers (ranging from 2 to 398) of potential peptide biomarkers. Moreover, these biomarker patterns showed very little overlap, limited to fragments of type I and III collagens as the common denominator. However, these differences in biomarker patterns mostly did not translate into significantly different performance of the diagnostic or prognostic classifiers modelled by support vector machine, diagonal discriminant analysis, linear discriminant analysis, binary discriminant analysis and random forest. This was true even when different biomarker patterns were combined into master-patterns. Conclusion: Our study revealed a very considerable dependence of peptide biomarker discovery on the statistical computing of urinary peptide profiles, while the observed diagnostic and/or prognostic reliability of classifiers was largely independent of the modelling approach. This may, however, be due to the limited statistical power in classifier testing. Nonetheless, our study showed that urinary proteome analysis has the potential to provide valuable biomarkers for coronary artery disease, mirroring especially alterations in the extracellular matrix. It further showed that for a comprehensive discovery of biomarkers, and thus of pathological information, the results of different statistical methods may best be combined into a master pattern that can then be used for classifier modelling.
    European Commission - Seventh Framework Programme (FP7)
