Three-way data structures, characterized by three entities (the units, the
variables and the occasions), are frequent in biological studies. In RNA
sequencing, three-way data structures are obtained when high-throughput
transcriptome sequencing data are collected for n genes across p conditions at
r occasions. Matrix-variate distributions offer a natural way to model
three-way data and mixtures of matrix-variate distributions can be used to
cluster three-way data. Clustering of gene expression data is carried out as a
means of discovering gene co-expression networks. In this work, a mixture of
matrix-variate Poisson-log normal distributions is proposed for clustering read
counts from RNA sequencing. By accounting for the matrix-variate structure, the
full information on the conditions and occasions of the RNA sequencing dataset
is used simultaneously, and the number of covariance parameters to be
estimated is reduced. A Markov chain Monte Carlo expectation-maximization
algorithm is used for parameter estimation and information criteria are used
for model selection. The models are applied to both real and simulated data,
giving favourable clustering results.
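
To make the claim about covariance parameters concrete, the following is a minimal sketch, not the authors' implementation. It assumes each gene's latent log-mean is an r x p matrix drawn from a matrix-variate normal with an r x r occasion covariance `Phi` and a p x p condition covariance `Omega`, and that counts are Poisson given that latent matrix. The function names, the argument names and the omission of library-size normalization and identifiability constraints are all assumptions made for illustration.

```python
import numpy as np


def covariance_param_counts(r, p):
    """Count free covariance parameters for an r x p observation matrix.

    structured:   separate r x r and p x p symmetric matrices
                  (ignores the scale identifiability constraint; assumption)
    unstructured: one symmetric (r*p) x (r*p) matrix on the vectorized data
    """
    structured = r * (r + 1) // 2 + p * (p + 1) // 2
    unstructured = (r * p) * (r * p + 1) // 2
    return structured, unstructured


def sample_mvpln(n, M, Phi, Omega, rng):
    """Draw n count matrices of shape (r, p) from one hypothetical component.

    M     : (r, p) mean of the latent log-scale matrix
    Phi   : (r, r) covariance across occasions
    Omega : (p, p) covariance across conditions
    """
    r, p = M.shape
    A = np.linalg.cholesky(Phi)      # Phi = A A^T
    B = np.linalg.cholesky(Omega)    # Omega = B B^T
    Z = rng.standard_normal((n, r, p))
    # Matrix-variate normal draw: theta = M + A Z B^T, batched over n
    theta = M + A @ Z @ B.T
    # Counts are conditionally Poisson with log-mean theta
    return rng.poisson(np.exp(theta))


# Example: r = 3 occasions, p = 2 conditions
print(covariance_param_counts(3, 2))  # (9, 21)

rng = np.random.default_rng(0)
Y = sample_mvpln(5, np.full((3, 2), 2.0), np.eye(3), np.eye(2), rng)
print(Y.shape)  # (5, 3, 2)
```

With r = 3 and p = 2 the separable structure needs 9 covariance parameters per component where an unstructured covariance on the vectorized data needs 21, and the gap widens quickly as r and p grow.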