22 research outputs found

Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data

Author: McNicholas Paul D.
Rothstein Steven J.
Silva Anjali
Subedi Sanjeena
Publication date: 22/07/2018

Three-way data structures, characterized by three entities (the units, the variables and the occasions), are common in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for n genes across p conditions at r occasions. Matrix-variate distributions offer a natural way to model three-way data, and mixtures of matrix-variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks. In this work, a mixture of matrix-variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By considering the matrix-variate structure, full information on the conditions and occasions of the RNA sequencing dataset is used simultaneously, and the number of covariance parameters to be estimated is reduced. A Markov chain Monte Carlo expectation-maximization algorithm is used for parameter estimation, and information criteria are used for model selection. The models are applied to both real and simulated data, giving favourable clustering results.

arXiv.org e-Print Archive

Penalized model-based clustering for three-way data structures

Author: Alessandro Casa
Andrea Cappozzo
Michael Fop
Publication venue: Pearson
Publication date: 01/01/2021

Recently, there has been increasing interest in developing statistical methods able to find groups in matrix-valued data. To this end, matrix Gaussian mixture models (MGMM) provide a natural extension to the popular model-based clustering based on Normal mixtures. Unfortunately, the overparametrization issue, already affecting the vector-variate framework, is further exacerbated when it comes to MGMM, since the number of parameters scales quadratically with both row and column dimensions. To overcome this limitation, the present paper introduces a sparse model-based clustering approach for three-way data structures. By means of penalized estimation, the methodology shrinks the estimates towards zero, achieving more stable and parsimonious clustering in high-dimensional scenarios. An application to satellite images underlines the benefits of the proposed method.
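
The quadratic scaling mentioned in this abstract can be illustrated with a small parameter count. The sketch below is not taken from the paper: it assumes the standard matrix-normal parameterization with one symmetric p x p row covariance and one symmetric r x r column covariance per mixture component, and compares it with an unconstrained covariance on the vectorized p*r-dimensional data. The function names are hypothetical.

```python
def matrix_normal_cov_params(p: int, r: int) -> int:
    """Free covariance parameters in one matrix-normal component (assumed setup)."""
    # A symmetric p x p row covariance plus a symmetric r x r column covariance.
    # One parameter is subtracted because the Kronecker product of the two
    # matrices identifies them only up to a scalar factor.
    return p * (p + 1) // 2 + r * (r + 1) // 2 - 1


def vectorized_cov_params(p: int, r: int) -> int:
    """Free parameters of an unconstrained covariance on the vectorized matrix."""
    d = p * r
    return d * (d + 1) // 2


if __name__ == "__main__":
    # Illustrative dimensions only, not taken from any of the listed papers.
    for p, r in [(5, 4), (20, 10)]:
        print(p, r, matrix_normal_cov_params(p, r), vectorized_cov_params(p, r))
```

Under these assumptions the matrix-variate count still grows with p squared and r squared, which is the overparametrization the abstract refers to, but it is far smaller than the count for the vectorized model.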

Archivio istituzionale della ricerca - Politecnico di Milano

Covariance pattern mixture models for the analysis of multivariate heterogeneous longitudinal data

Author: Anderlucci Laura
Viroli Cinzia
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2015

We propose a novel approach for modeling multivariate longitudinal data in the presence of unobserved heterogeneity for the analysis of the Health and Retirement Study (HRS) data. Our proposal can be cast within the framework of linear mixed models with discrete individual random intercepts; however, unlike the standard formulation, the proposed Covariance Pattern Mixture Model (CPMM) does not require the usual local independence assumption. The model is thus able to simultaneously model the heterogeneity, the association among the responses and the temporal dependence structure. We focus on the investigation of temporal patterns related to cognitive functioning in retired American respondents. In particular, we aim to understand whether it can be affected by some individual socioeconomic characteristics and whether it is possible to identify homogeneous groups of respondents that share a similar cognitive profile. An accurate description of the detected groups allows government policy interventions to be appropriately targeted. Results identify three homogeneous clusters of individuals with specific cognitive functioning, consistent with the class conditional distribution of the covariates. The flexibility of CPMM allows for a different contribution of each regressor on the responses according to group membership. In so doing, the identified groups receive a global and accurate phenomenological characterization. Comment: Published at http://dx.doi.org/10.1214/15-AOAS816 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Copula-based fuzzy clustering of spatial time series

Author: Alonso
Athanasopoulos
Basford
Birant
Bárdossy
Caiado
Campello
Coppi
De Luca
Di Lascio
Durante
D’Urso
Ester
Everitt
Fouedjio
Garcia-Escudero
Genest
Grabisch
Guthke
Handl
Hu
Hubert
Hwang
Hyndman
Hüllermeier
Ienco
Izakian
James
Joe
Kamdar
Kaufman
Kazianka
Klement
Krishnapuram
Lafuente-Rego
Maharaj
Montes
Nelsen
Otranto
Patton
Piccolo
Rand
Shekhar
Torabi
Vilar
Viroli
Wang
Warren Liao
Wedel
Xie
Yager
Yang
Publication venue: Elsevier BV
Publication date: 01/01/2017

This paper contributes to the existing literature on the analysis of spatial time series by presenting a new clustering algorithm called COFUST, i.e. COpula-based FUzzy clustering algorithm for Spatial Time series. The underlying idea of this algorithm is to perform a fuzzy Partitioning Around Medoids (PAM) clustering using a copula-based approach to interpret comovements of time series. This generalisation makes it possible both to extend the usual clustering methods for time series based on Pearson’s correlation and to capture the uncertainty that arises when assigning units to clusters. Furthermore, its flexibility allows spatial information to be included directly in the algorithm. Our approach is presented and discussed using both simulated and real data, highlighting its main advantages.

Crossref

Bournemouth University Research Online

Archivio della ricerca- Università di Roma La Sapienza

Archivio Istituzionale della Ricerca- Università del Salento

Archivio istituzionale della ricerca - Università di Padova