Variational Bayes with Intractable Likelihood
Variational Bayes (VB) is rapidly becoming a popular tool for Bayesian
inference in statistical modeling. However, the existing VB algorithms are
restricted to cases where the likelihood is tractable, which precludes the use
of VB in many interesting situations such as in state space models and in
approximate Bayesian computation (ABC), where application of VB methods was
previously impossible. This paper extends the scope of application of VB to
cases where the likelihood is intractable, but can be estimated unbiasedly. The
proposed VB method therefore makes it possible to carry out Bayesian inference
in many statistical applications, including state space models and ABC. The
method is generic in the sense that it can be applied to almost all statistical
models without requiring too much model-based derivation, which is a drawback
of many existing VB algorithms. We also show how the proposed method can be
used to obtain highly accurate VB approximations of marginal posterior
distributions.
Comment: 40 pages, 6 figures
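The phrase "can be estimated unbiasedly" can be made concrete with a toy example. The sketch below is not the paper's algorithm; it assumes a hypothetical latent-variable model, z ~ N(theta, 1) and y | z ~ N(z, 1), chosen only because its exact likelihood N(y; theta, 2) is available for comparison. The function names are illustrative.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def likelihood_estimate(y, theta, n_samples, rng):
    """Unbiased Monte Carlo estimate of p(y | theta) for the assumed toy model.

    Draws z_i ~ N(theta, 1) and averages p(y | z_i); the expectation of this
    average over the draws equals the exact likelihood.
    """
    z = rng.normal(loc=theta, scale=1.0, size=n_samples)
    return normal_pdf(y, z, 1.0).mean()

def exact_likelihood(y, theta):
    """Closed-form likelihood of the toy model, used only as a check."""
    return normal_pdf(y, theta, 2.0)

rng = np.random.default_rng(0)
estimate = likelihood_estimate(0.5, 0.0, 200_000, rng)
exact = exact_likelihood(0.5, 0.0)
```

In a state space model or ABC setting the estimator would typically come from a particle filter or a simulation-based kernel rather than from plain Monte Carlo; that choice is outside this sketch.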
Hierarchical hidden Markov structure for dynamic correlations: the hierarchical RSDC model.
This paper presents a new multivariate GARCH model with a time-varying conditional correlation structure that generalizes the Regime Switching Dynamic Correlation (RSDC) model of Pelletier (2006). The model, which we name Hierarchical RSDC, is built on the hierarchical generalization of the hidden Markov model introduced by Fine et al. (1998). It can be viewed graphically as a tree structure with different types of states. The first are called production states, and they can emit observations, as in the classical Markov-switching approach. The second are called abstract states. They cannot emit observations but establish the vertical and horizontal probabilities that define the dynamics of the hidden hierarchical structure. The main gain of this approach over the classical Markov-switching model is increased granularity of the regimes. Our model is also compared with the new Double Smooth Transition Conditional Correlation GARCH model (DSTCC), a STAR approach for dynamic correlations proposed by Silvennoinen and Teräsvirta (2007). The reason is that, under certain assumptions, the DSTCC and our model represent two classical competing approaches to modeling regime switching. We also perform Monte Carlo simulations and apply the model in two empirical applications studying the conditional correlations of selected stock returns. Results show that the Hierarchical RSDC provides a good measure of the correlations and also has interesting explanatory power.
Keywords: Multivariate GARCH; Dynamic correlations; Regime switching; Markov chain; Hidden Markov models; Hierarchical Hidden Markov models
On two mixture-based clustering approaches used in modeling an insurance portfolio
We review two complementary mixture-based clustering approaches for modeling unobserved heterogeneity in an insurance portfolio: the generalized linear mixed cluster-weighted model (CWM) and mixture-based clustering for an ordered stereotype model (OSM). The latter is for modeling ordinal variables, and the former is for modeling losses as a function of mixed-type covariates. The article extends the idea of mixture modeling to a multivariate classification for the purpose of testing unobserved heterogeneity in an insurance portfolio. The application of both methods is illustrated on a well-known French automobile portfolio, in which the model fitting is performed using the expectation-maximization (EM) algorithm. Our findings show that these mixture-based clustering methods can be used to further test unobserved heterogeneity in an insurance portfolio and as such may be considered in insurance pricing, underwriting, and risk management.
Peer Reviewed. Postprint (published version)
Codominant scoring of AFLP in association panels
A study on the codominant scoring of AFLP markers in association panels without prior knowledge of genotype probabilities is described. Bands are scored codominantly by fitting normal mixture models to band intensities, illustrating and optimizing existing methodology, which employs the EM algorithm. We study features that improve the performance of the algorithm, and the unmixing in general, like parameter initialization, restrictions on parameters, data transformation, and outlier removal. Parameter restrictions include equal component variances, equal or nearly equal distances between component means, and mixing probabilities according to Hardy–Weinberg Equilibrium. Histogram visualization of band intensities with superimposed normal densities, and optional classification scores and other grouping information, assists further in the codominant scoring. We find empirical evidence favoring the square root transformation of the band intensity, as was found in segregating populations. Our approach provides posterior genotype probabilities for marker loci. These probabilities can form the basis for association mapping and are more useful than the standard scoring categories A, H, B, C, D. They can also be used to calculate predictors for additive and dominance effects. Diagnostics for data quality of AFLP markers are described: preference for three-component mixture model, good separation between component means, and lack of singletons for the component with highest mean. Software has been developed in R, containing the models for normal mixtures with facilitating features, and visualizations. The methods are applied to an association panel in tomato, comprising 1,175 polymorphic markers on 94 tomato hybrids, as part of a larger study within the Dutch Centre for BioSystems Genomics.
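The fitting step described above, a normal mixture estimated by EM under the equal-component-variance restriction, might look roughly like the following. This is a minimal Python sketch rather than the authors' R software: the function name and the simulated intensities are assumptions, and the other restrictions mentioned in the abstract (equally spaced means, Hardy–Weinberg mixing probabilities), the square root transformation, and outlier removal are not implemented.

```python
import numpy as np

def em_equal_variance(x, initial_means, n_iter=200):
    """EM for a univariate normal mixture whose components share one variance.

    Returns mixing weights, component means, the common variance, and the
    posterior component probabilities from the final E-step.
    """
    x = np.asarray(x, dtype=float)
    means = np.asarray(initial_means, dtype=float).copy()
    k = means.size
    weights = np.full(k, 1.0 / k)
    var = x.var()
    for _ in range(n_iter):
        # E-step: with a shared variance the normalizing constant cancels,
        # so only the squared distances and the log weights matter.
        log_post = -0.5 * (x[:, None] - means[None, :]) ** 2 / var
        log_post += np.log(weights)[None, :]
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted updates; the variance is pooled over components.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means[None, :]) ** 2).sum() / x.size
    return weights, means, var, resp

# Simulated band intensities for three well-separated groups (illustrative only).
rng = np.random.default_rng(1)
intensities = np.concatenate([
    rng.normal(1.0, 0.2, 100),
    rng.normal(2.0, 0.2, 200),
    rng.normal(3.0, 0.2, 100),
])
weights, means, var, posterior = em_equal_variance(intensities, [0.5, 2.0, 3.5])
```

Each row of `posterior` plays the role of the posterior genotype probabilities mentioned in the abstract, one column per mixture component.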
Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types
There is an emerging need in clinical research to accurately predict patients' disease status and disease progression by optimally integrating multivariate clinical information. Clinical data is often collected over time for multiple biomarkers of different types (e.g. continuous, binary, counts). In this paper, we present a flexible and dynamic (time-dependent) discriminant analysis approach in which multiple biomarkers of various types are jointly modelled for classification purposes by the multivariate generalized linear mixed model. We propose a mixture of normal distributions for the random effects to allow additional flexibility when modelling the complex correlation between longitudinal biomarkers and to robustify the model and the classification procedure against misspecification of the random effects distribution. These longitudinal models are subsequently used in a multivariate time-dependent discriminant scheme to predict, at any time point, the probability of belonging to a particular risk group. The methodology is illustrated using clinical data from patients with epilepsy, where the aim is to identify patients who will not achieve remission of seizures within a 5-year follow-up period.
Row mixture-based clustering with covariates for ordinal responses
Existing methods can perform likelihood-based clustering on a multivariate data matrix of ordinal data, using finite mixtures to cluster the rows (observations) of the matrix. These models can incorporate the main effects of individual rows and columns, as well as cluster effects, to model the matrix of responses. However, many real-world applications also include available covariates, which provide insights into the main characteristics of the clusters and determine clustering structures based on both the individuals' similar patterns of responses and the effects of the covariates on the individuals' responses. In our research we have extended the mixture-based models to include covariates and tested what effect this has on the resulting clustering structures. We focus on clustering the rows of the data matrix, using the proportional odds cumulative logit model for ordinal data. We fit the models using the Expectation-Maximization algorithm and assess performance using a simulation study. We also illustrate an application of the models to the well-known arthritis clinical trial data set.
"This work has been supported by the Ministerio de Ciencia e Innovación (Spain) [PID2019-104830RB-I00/ DOI (AEI): 10.13039/501100011033], and by Grant 2021 SGR 01421 (GRBIO) administered by the Departament de Recerca i Universitats de la Generalitat de Catalunya (Spain). Daniel Fernández is a member of the Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III (CIBERSAM). Daniel Fernández is a Serra Húnter Fellow."
Peer Reviewed. Postprint (published version)
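For readers unfamiliar with the proportional odds cumulative logit model, the sketch below shows how category probabilities follow from cumulative logits. It assumes the parameterization logit P(Y <= k) = mu_k - eta, where the mu_k are increasing cutpoints and eta is a linear predictor (for example, a cluster effect plus covariate effects); sign conventions vary between authors, and the function name is illustrative rather than taken from the paper.

```python
import numpy as np

def proportional_odds_probs(cutpoints, eta):
    """Category probabilities for an ordinal response with len(cutpoints) + 1 levels.

    Assumes logit P(Y <= k) = cutpoints[k] - eta with increasing cutpoints.
    """
    cutpoints = np.asarray(cutpoints, dtype=float)
    # Cumulative probabilities P(Y <= k) for all but the last category.
    cumulative = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))
    # Pad with 0 and 1, then difference to get P(Y = k).
    cumulative = np.concatenate(([0.0], cumulative, [1.0]))
    return np.diff(cumulative)

baseline = proportional_odds_probs([-1.0, 0.0, 1.0], eta=0.0)
shifted = proportional_odds_probs([-1.0, 0.0, 1.0], eta=2.0)
```

Under this sign convention a larger `eta` moves probability mass toward the higher categories, which is why `shifted` puts more weight on the last level than `baseline` does.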
Extending Mixture of Experts Model to Investigate Heterogeneity of Trajectories: When, Where and How to Add Which Covariates
Researchers are usually interested in examining the impact of covariates when
separating heterogeneous samples into latent classes that are more homogeneous.
The majority of theoretical and empirical studies with such aims have focused
on identifying covariates as predictors of class membership in the structural
equation modeling framework. In other words, the covariates only indirectly
affect the sample heterogeneity. However, the covariates' influence on
between-individual differences can also be direct. This article presents a
mixture model that investigates covariates to explain within-cluster and
between-cluster heterogeneity simultaneously, known as a mixture-of-experts
(MoE) model. This study aims to extend the MoE framework to investigate
heterogeneity in nonlinear trajectories: to identify latent classes, covariates
as predictors of clusters, and covariates that explain within-cluster
differences in change patterns over time. Our simulation studies demonstrate
that the proposed model generally estimates the parameters unbiasedly and
precisely, and exhibits appropriate empirical coverage for a nominal 95%
confidence interval. This study also proposes implementing structural equation
model forests to shrink the covariate space of the proposed mixture model. We
illustrate how to select covariates and construct the proposed model with
longitudinal mathematics achievement data. Additionally, we demonstrate that
the proposed mixture model can be further extended in the structural equation
modeling framework by allowing the covariates that have direct effects to be
time-varying.
Comment: Draft version 1.7, 06/01/2021. This paper has not been peer reviewed.
Please do not copy or cite without author's permission.
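The "covariates as predictors of class membership" part of a mixture-of-experts model is commonly expressed as a softmax (multinomial logistic) gating function. The sketch below illustrates only that gating step under this common formulation; it is not the authors' structural equation implementation, and the function name, covariate matrix, and coefficient values are hypothetical.

```python
import numpy as np

def gating_probabilities(covariates, coefficients):
    """Softmax gating: class-membership probabilities given covariates.

    covariates:   array of shape (n_individuals, n_covariates)
    coefficients: array of shape (n_covariates, n_classes)
    """
    covariates = np.asarray(covariates, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    scores = covariates @ coefficients
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

# Two individuals, an intercept column plus one covariate, two latent classes.
x = np.array([[1.0, 0.0],
              [1.0, 2.0]])
beta = np.array([[0.0, 0.0],
                 [0.0, 1.0]])  # first class is the reference (all zeros)
membership = gating_probabilities(x, beta)
```

Covariates with direct effects would enter the within-class trajectory model instead of this gating function; that part is not sketched here.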