27 research outputs found

    Variational Bayes with Intractable Likelihood

    Variational Bayes (VB) is rapidly becoming a popular tool for Bayesian inference in statistical modeling. However, existing VB algorithms are restricted to cases where the likelihood is tractable, which precludes their use in many interesting situations, such as state space models and approximate Bayesian computation (ABC), where application of VB methods was previously impossible. This paper extends the scope of VB to cases where the likelihood is intractable but can be estimated unbiasedly. The proposed VB method therefore makes it possible to carry out Bayesian inference in many statistical applications, including state space models and ABC. The method is generic in the sense that it can be applied to almost all statistical models without requiring much model-based derivation, which is a drawback of many existing VB algorithms. We also show how the proposed method can be used to obtain highly accurate VB approximations of marginal posterior distributions. Comment: 40 pages, 6 figures

    Hierarchical hidden Markov structure for dynamic correlations: the hierarchical RSDC model.

    This paper presents a new multivariate GARCH model with a time-varying conditional correlation structure, which generalizes the Regime Switching Dynamic Correlation (RSDC) model of Pelletier (2006). This model, which we name Hierarchical RSDC, is built on the hierarchical generalization of the hidden Markov model introduced by Fine et al. (1998). It can be viewed graphically as a tree structure with two types of states. The first are called production states, and they can emit observations, as in the classical Markov-switching approach. The second are called abstract states. They cannot emit observations but establish vertical and horizontal probabilities that define the dynamics of the hidden hierarchical structure. The main gain of this approach compared to the classical Markov-switching model is increased granularity of the regimes. Our model is also compared to the new Double Smooth Transition Conditional Correlation GARCH model (DSTCC), a STAR approach for dynamic correlations proposed by Silvennoinen and Teräsvirta (2007). The reason is that, under certain assumptions, the DSTCC and our model represent two classical competing approaches to modeling regime switching. We also perform Monte Carlo simulations and apply the model in two empirical applications studying the conditional correlations of selected stock returns. Results show that the Hierarchical RSDC provides a good measure of the correlations and also has interesting explanatory power. Keywords: Multivariate GARCH; Dynamic correlations; Regime switching; Markov chain; Hidden Markov models; Hierarchical Hidden Markov models

    On two mixture-based clustering approaches used in modeling an insurance portfolio

    We review two complementary mixture-based clustering approaches for modeling unobserved heterogeneity in an insurance portfolio: the generalized linear mixed cluster-weighted model (CWM) and mixture-based clustering for an ordered stereotype model (OSM). The latter is for modeling ordinal variables, and the former is for modeling losses as a function of mixed-type covariates. The article extends the idea of mixture modeling to a multivariate classification for the purpose of testing unobserved heterogeneity in an insurance portfolio. The application of both methods is illustrated on a well-known French automobile portfolio, in which the model fitting is performed using the expectation-maximization (EM) algorithm. Our findings show that these mixture-based clustering methods can be used to further test unobserved heterogeneity in an insurance portfolio and as such may be considered in insurance pricing, underwriting, and risk management. Peer reviewed. Postprint (published version)

    Codominant scoring of AFLP in association panels

    A study on the codominant scoring of AFLP markers in association panels without prior knowledge of genotype probabilities is described. Bands are scored codominantly by fitting normal mixture models to band intensities, illustrating and optimizing existing methodology, which employs the EM algorithm. We study features that improve the performance of the algorithm, and the unmixing in general, such as parameter initialization, restrictions on parameters, data transformation, and outlier removal. Parameter restrictions include equal component variances, equal or nearly equal distances between component means, and mixing probabilities according to Hardy–Weinberg equilibrium. Histogram visualization of band intensities with superimposed normal densities, and optional classification scores and other grouping information, assists further in the codominant scoring. We find empirical evidence favoring the square root transformation of the band intensity, as was found in segregating populations. Our approach provides posterior genotype probabilities for marker loci. These probabilities can form the basis for association mapping and are more useful than the standard scoring categories A, H, B, C, D. They can also be used to calculate predictors for additive and dominance effects. Diagnostics for data quality of AFLP markers are described: preference for a three-component mixture model, good separation between component means, and lack of singletons for the component with the highest mean. Software has been developed in R, containing the models for normal mixtures with facilitating features, and visualizations. The methods are applied to an association panel in tomato, comprising 1,175 polymorphic markers on 94 tomato hybrids, as part of a larger study within the Dutch Centre for BioSystems Genomics.

    Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types

    There is an emerging need in clinical research to accurately predict patients' disease status and disease progression by optimally integrating multivariate clinical information. Clinical data are often collected over time for multiple biomarkers of different types (e.g. continuous, binary, counts). In this paper, we present a flexible and dynamic (time-dependent) discriminant analysis approach in which multiple biomarkers of various types are jointly modelled for classification purposes by the multivariate generalized linear mixed model. We propose a mixture of normal distributions for the random effects to allow additional flexibility when modelling the complex correlation between longitudinal biomarkers and to make the model and the classification procedure robust against misspecification of the random effects distribution. These longitudinal models are subsequently used in a multivariate time-dependent discriminant scheme to predict, at any time point, the probability of belonging to a particular risk group. The methodology is illustrated using clinical data from patients with epilepsy, where the aim is to identify patients who will not achieve remission of seizures within a 5-year follow-up period.

    Row mixture-based clustering with covariates for ordinal responses

    Existing methods can perform likelihood-based clustering on a multivariate data matrix of ordinal data, using finite mixtures to cluster the rows (observations) of the matrix. These models can incorporate the main effects of individual rows and columns, as well as cluster effects, to model the matrix of responses. However, many real-world applications also include available covariates, which provide insights into the main characteristics of the clusters and determine clustering structures based on both the individuals' similar patterns of responses and the effects of the covariates on the individuals' responses. In our research we have extended the mixture-based models to include covariates and test what effect this has on the resulting clustering structures. We focus on clustering the rows of the data matrix, using the proportional odds cumulative logit model for ordinal data. We fit the models using the Expectation-Maximization algorithm and assess performance using a simulation study. We also illustrate an application of the models to the well-known arthritis clinical trial data set.
    "This work has been supported by the Ministerio de Ciencia e Innovación (Spain) [PID2019-104830RB-I00/ DOI (AEI): 10.13039/501100011033], and by Grant 2021 SGR 01421 (GRBIO) administered by the Departament de Recerca i Universitats de la Generalitat de Catalunya (Spain). Daniel Fernández is a member of the Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III (CIBERSAM). Daniel Fernández is a Serra Húnter Fellow."
    Peer reviewed. Postprint (published version)

    Extending Mixture of Experts Model to Investigate Heterogeneity of Trajectories: When, Where and How to Add Which Covariates

    Researchers are usually interested in examining the impact of covariates when separating heterogeneous samples into latent classes that are more homogeneous. The majority of theoretical and empirical studies with such aims have focused on identifying covariates as predictors of class membership in the structural equation modeling framework. In other words, the covariates only indirectly affect the sample heterogeneity. However, the covariates' influence on between-individual differences can also be direct. This article presents a mixture model that investigates covariates to explain within-cluster and between-cluster heterogeneity simultaneously, known as a mixture-of-experts (MoE) model. This study aims to extend the MoE framework to investigate heterogeneity in nonlinear trajectories: to identify latent classes, covariates as predictors of clusters, and covariates that explain within-cluster differences in change patterns over time. Our simulation studies demonstrate that the proposed model generally estimates the parameters unbiasedly and precisely, and exhibits appropriate empirical coverage for a nominal 95% confidence interval. This study also proposes implementing structural equation model forests to shrink the covariate space of the proposed mixture model. We illustrate how to select covariates and construct the proposed model with longitudinal mathematics achievement data. Additionally, we demonstrate that the proposed mixture model can be further extended in the structural equation modeling framework by allowing the covariates that have direct effects to be time-varying. Comment: Draft version 1.7, 06/01/2021. This paper has not been peer reviewed. Please do not copy or cite without author's permission.