Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data

McNicholas, Paul D.; Rothstein, Steven J.; Silva, Anjali; Subedi, Sanjeena

Authors: Paul D. McNicholas, Steven J. Rothstein, Anjali Silva, Sanjeena Subedi
Publication date: 22 July 2018

Abstract

Three-way data structures, characterized by three entities (the units, the variables, and the occasions), are frequent in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for n genes across p conditions at r occasions. Matrix-variate distributions offer a natural way to model three-way data, and mixtures of matrix-variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks. In this work, a mixture of matrix-variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By accounting for the matrix-variate structure, full information on the conditions and occasions of the RNA sequencing dataset is used simultaneously, and the number of covariance parameters to be estimated is reduced. A Markov chain Monte Carlo expectation-maximization algorithm is used for parameter estimation, and information criteria are used for model selection. The models are applied to both real and simulated data, giving favourable clustering results.
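The reduction in covariance parameters mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual matrix-variate normal construction: each unit is an r x p matrix whose latent log-means have an r x r covariance across occasions and a p x p covariance across conditions, in place of one unstructured rp x rp covariance. The function names, the argument names (M, Phi, Omega), and the omission of any library-size normalization are assumptions made for illustration only; this is not the authors' implementation.

```python
import numpy as np

def covariance_param_counts(r, p):
    """Count free covariance parameters for an r x p random matrix.

    Returns (unstructured, separable): one full rp x rp covariance versus
    an r x r plus a p x p covariance. A scale constraint for identifiability
    would remove one further parameter from the separable count.
    """
    d = r * p
    unstructured = d * (d + 1) // 2
    separable = r * (r + 1) // 2 + p * (p + 1) // 2
    return unstructured, separable

def sample_mvpln(n, M, Phi, Omega, rng):
    """Draw n count matrices from a single hypothetical component.

    Theta = M + A Z B^T with Phi = A A^T (occasions) and Omega = B B^T
    (conditions), then Y | Theta ~ Poisson(exp(Theta)) elementwise.
    """
    r, p = M.shape
    A = np.linalg.cholesky(Phi)    # r x r lower-triangular factor
    B = np.linalg.cholesky(Omega)  # p x p lower-triangular factor
    Z = rng.standard_normal((n, r, p))
    Theta = M + A @ Z @ B.T        # broadcasts over the n draws
    return rng.poisson(np.exp(Theta))

# Example: r = 3 occasions, p = 2 conditions (illustrative sizes).
rng = np.random.default_rng(0)
counts = sample_mvpln(5, np.ones((3, 2)), np.eye(3), np.eye(2), rng)
print(covariance_param_counts(3, 2), counts.shape)
```

For r = 3 and p = 2, the unstructured covariance has 21 free parameters while the separable form has 9, and the gap widens quickly as r and p grow.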

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:1807.08380

Last time updated on 08/09/2018